Stuff The Internet Says On Scalability For March 21st, 2014

Hey, it's HighScalability time:


Isaac Newton's College Notebook, such a slacker.

  • Quotable Quotes:
    • Chris Anderson: Petabytes allow us to say: ‘Correlation is enough.’
    • @adron: Back to writing the micro-services for my composite app with regionally distributed, highly available, key value crypto stores of unicorns!
    • DevOps Cafe: when the canary dies you don't buy a stronger canary.
    • Mark Pagel: Creativity, like evolution, is merely a series of thefts.
    • GitHub: Whoever has more capacity wins.
    • The Master: We balance probabilities and choose the most likely. It is the scientific use of the imagination.
    • Jonathan Ive: Steve, and I don’t recognize my friend in much of it. Yes, he had a surgically precise opinion. Yes, it could sting. Yes, he constantly questioned. ‘Is this good enough? Is this right?’ but he was so clever. His ideas were bold and magnificent. They could suck the air from the room. And when the ideas didn’t come, he decided to believe we would eventually make something great. And, oh, the joy of getting there!
    • @joestump: Turns out the correct amount of RAM is always 16GB more than you have.

  • From a speck of sand does a pearl grow. In this case the oyster is Facebook, the irritant is PHP, and the pearl is Hack, a new improved version of PHP developed by Facebook. Hack now runs all of Facebook's front-end code. Lots of great comments on Hacker News and on Reddit, along with plenty of PHP hate. Realize Facebook is not trying to build the next perfect language. Historically they coded their web tier in PHP, so they have a lot of code working in production. It would be insane to throw that all away. Over the years they've put a lot of work into making PHP more efficient. Hack continues that work by making PHP a better language, coupling PHP's convenient interpreted nature with static typing and many other features commonly found in modern programming languages. If PHP bothers you, Hack is not for you, but that's OK.

  • Dish Jump-Starts the Hopper With 8 Tuners and PS4 Support. Ultimate in local caching. Record all the things and then watch what you want later. Streaming access sucks. More local storage!

  • Cassandra Hits One Million Writes Per Second on Google Compute Engine on 300 VMs with 300 1TB Persistent Disk volumes, 15,000 clients, median latency of 10.3 ms and 95% completing under 23 ms. It took 1 hour and 10 minutes at a cost of $330 USD. All benchmark caveats apply. They wrote small records of 170 bytes. What happens when you write a large range of data sizes? When you are reading? Are you running queries? Are you backing up? Are indexes being updated? Is there contention? Etc, etc. Still, it's impressive.
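
To put those headline numbers in perspective, here is a rough back-of-the-envelope sketch that uses only the figures quoted above. The constant and function names are made up for illustration, and the cost per VM-hour is a blended figure that assumes the $330 is spread evenly over all 300 VMs for the full 70 minutes.

```python
# Figures quoted in the benchmark summary above.
WRITES_PER_SECOND = 1_000_000
RECORD_BYTES = 170
VMS = 300
DURATION_MINUTES = 70   # 1 hour and 10 minutes
COST_USD = 330


def benchmark_summary():
    """Derive per-VM and aggregate numbers from the quoted figures."""
    vm_hours = VMS * DURATION_MINUTES / 60
    payload_mb_per_s = WRITES_PER_SECOND * RECORD_BYTES / 1e6
    return {
        # Average write rate each VM has to absorb.
        "writes_per_vm_per_s": WRITES_PER_SECOND / VMS,
        # Raw record payload only; replication and commit-log overhead are ignored.
        "payload_mb_per_s": payload_mb_per_s,
        "payload_mb_per_vm_per_s": payload_mb_per_s / VMS,
        # Assumption: cost is spread evenly over every VM for the whole run.
        "usd_per_vm_hour": COST_USD / vm_hours,
    }


if __name__ == "__main__":
    for name, value in benchmark_summary().items():
        print(f"{name}: {value:.3f}")
```

Seen this way, each VM absorbs only a few thousand small writes per second, which is why the caveats about record size and mixed workloads matter.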

  • Are you tired of layer on top of layer of almost working frameworks? Do you want simplicity? Do you crave clarity? F*cking Shell Scripts may be just the cleansing breath you need.

  • Go is taking off: a seemingly very minor player, it is already used nearly one tenth as much in FOSS as the most popular languages in existence.

  • GitHub with a superb post on how they handle Denial of Service Attacks. DoS attacks are categorized as either volumetric, which try to exhaust some resource, or complex, which use complex operations to exhaust resources. Solutions: more capacity, use a mitigation service provider, harden all parts of your infrastructure, buy appropriate hardware and software to help fend off the attacks.

  • When Peter Lawrey gets a new machine he gets an understanding of its limitations by running tests. Probably something everyone should consider. Like when buying a new car you floor it and see what happens. One test he runs is for micro jitter, busy waiting and binding CPUs. The result for his new 6 core 4.5GHz i7-397, 32 GB of PC-1600 memory, running Ubuntu 13.04: Using thread affinity without isolating the CPU doesn't appear to help much on this system. Where affinity and isolation help, it may still make sense to busy wait, as it appears the scheduler will interrupt the thread less often if you do. Great stuff.
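
A minimal sketch of the busy-wait idea, for anyone who wants to try something similar on their own machine. This is not Peter Lawrey's test. The function names and the 10 microsecond threshold are assumptions made for illustration, and the Python interpreter adds jitter of its own, so treat the numbers as relative rather than absolute. CPU isolation is a boot-time kernel setting and is not attempted here.

```python
import os
import time


def pin_to_cpu(cpu=None):
    """Pin this process to a single CPU where the OS allows it (Linux).

    Returns True if the affinity was set, False otherwise.
    """
    if not hasattr(os, "sched_setaffinity"):
        return False  # e.g. macOS or Windows
    try:
        if cpu is None:
            cpu = min(os.sched_getaffinity(0))  # lowest CPU we may run on
        os.sched_setaffinity(0, {cpu})
    except OSError:
        return False
    return True


def measure_jitter(duration_s=0.1, threshold_ns=10_000):
    """Busy-wait on the clock and record gaps between consecutive reads.

    Any gap of at least threshold_ns suggests the loop was interrupted
    (by the scheduler, an interrupt, or the interpreter itself).
    """
    gaps = []
    end = time.perf_counter_ns() + int(duration_s * 1e9)
    prev = time.perf_counter_ns()
    while prev < end:
        now = time.perf_counter_ns()
        if now - prev >= threshold_ns:
            gaps.append(now - prev)
        prev = now
    return gaps


if __name__ == "__main__":
    print("pinned:", pin_to_cpu())
    gaps = measure_jitter(duration_s=1.0)
    print("interruptions:", len(gaps), "worst (ns):", max(gaps, default=0))
```

Running it once with pinning and once without gives a crude feel for whether affinity changes anything on a given box.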

  • Dewey Paciaffi writes the touching story of One Man’s Perspective on the Growth of Unix and its Derivatives: This pretty much sums up one man's subjective take on Unix over the years. How an OS that was cobbled together to play a game became the OS whose derivatives run nearly every smartphone, phone system, business system and web server in the world.

  • Let's say you want to move to WebP formatted images to save bandwidth. Should you do it all at once, or incrementally with human curation to verify the images are still of good quality? Netflix: [we] re-encoded hundreds of thousands if not a million images in WebP without hand curation with success. Our customers seem pretty happy.

  • Multiple networks or one network? Our brains use multiple networks. Human brains 'hard-wired' to link what we see with what we do: Your brain's ability to instantly link what you see with what you do is down to a dedicated information 'highway', suggests new UCL-led research.

  • Using Redis as an LRU cache. In-depth explanation of an often-needed feature for caches. LRU is "a model to predict how likely a given key will be accessed in the future."
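
To make the LRU idea concrete, here is a small in-process sketch of exact least-recently-used eviction. It is not how Redis implements it (Redis, as far as I know, approximates LRU by sampling keys rather than tracking exact order); the class and method names are made up for illustration.

```python
from collections import OrderedDict


class LRUCache:
    """Tiny exact-LRU cache: the least recently used key is evicted first."""

    def __init__(self, capacity):
        if capacity <= 0:
            raise ValueError("capacity must be positive")
        self.capacity = capacity
        self._items = OrderedDict()  # oldest access first, newest last

    def get(self, key, default=None):
        if key not in self._items:
            return default
        self._items.move_to_end(key)  # a read counts as a use
        return self._items[key]

    def set(self, key, value):
        if key in self._items:
            self._items.move_to_end(key)
        self._items[key] = value
        if len(self._items) > self.capacity:
            self._items.popitem(last=False)  # drop the least recently used

    def keys(self):
        """Keys ordered from least to most recently used."""
        return list(self._items)


if __name__ == "__main__":
    cache = LRUCache(capacity=2)
    cache.set("a", 1)
    cache.set("b", 2)
    cache.get("a")       # "a" is now the most recently used
    cache.set("c", 3)    # evicts "b"
    print(cache.keys())  # ['a', 'c']
```

The prediction in the quote is the bet that "b", untouched the longest, is the key least likely to be needed next.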

  • Maybe the System Isn't the Solution - Maybe It's the Problem: If you fix one problem, the total number of problems in the universe is not reduced. Because anergy must be conserved, other problems must be spontaneously created as a result of your actions. There is no end to it.

  • Ilya Grigorik has created a fun lesson using TEDed: Brief History of Latency: Electric Telegraph. The telegraph has so many parallels to the internet. Ilya highlights networking congestion, which was caused by routing and queuing delays. Interesting approach.

  • Good talk on creating a High Performance Audio system on Android. The usual suspects: Mutexes and Priority Inversion, Kernel Scheduling, Logging causing slowdowns, Ring Buffers. History does indeed repeat itself.
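
For readers who haven't met the ring buffer pattern, here is a minimal sketch of the data structure. It is not taken from the talk. The point is that the writer and reader each advance only their own index and never block, which is what lets an audio callback avoid taking a mutex. A real audio path would do this in native code with atomic indices; the Python below only shows the logic, and the names are made up for illustration.

```python
class RingBuffer:
    """Fixed-size FIFO for one writer and one reader.

    One slot is always left empty so that "full" and "empty" can be
    told apart using the two indices alone.
    """

    def __init__(self, capacity):
        self._buf = [0.0] * (capacity + 1)
        self._read = 0    # advanced only by the reader
        self._write = 0   # advanced only by the writer

    def available(self):
        """Number of samples waiting to be read."""
        return (self._write - self._read) % len(self._buf)

    def write(self, samples):
        """Store as many samples as fit; return how many were stored."""
        size = len(self._buf)
        written = 0
        for sample in samples:
            nxt = (self._write + 1) % size
            if nxt == self._read:  # buffer full: drop the rest, don't block
                break
            self._buf[self._write] = sample
            self._write = nxt
            written += 1
        return written

    def read(self, count):
        """Return up to count samples; fewer if the buffer runs dry."""
        size = len(self._buf)
        out = []
        while len(out) < count and self._read != self._write:
            out.append(self._buf[self._read])
            self._read = (self._read + 1) % size
        return out


if __name__ == "__main__":
    rb = RingBuffer(capacity=4)
    print(rb.write([1, 2, 3, 4, 5]))  # 4, the fifth sample does not fit
    print(rb.read(2))                 # [1, 2]
```

Returning a short count instead of waiting is the design choice that matters: an underrun or overrun is audible, but a blocked audio thread is worse.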

  • Nice explanation of making the difficult choice of Cassandra, MySQL, RDS, or DynamoDB over the life cycle of a project. Moving from MySQL to RDS means lower admin overhead. But RDS has some problems, so when you get large enough and have the skill in-house you move back to your own MySQL setup. Cassandra is great at writes and doesn't have the vendor lock-in of DynamoDB. But as a system gets larger the SaaS aspects of DynamoDB become more attractive. It's all tradeoffs.