Stuff The Internet Says On Scalability For October 18th, 2013

Hey, it's HighScalability time:


Test your sense of scale. Is this image of something microscopic or macroscopic? Find out.

  • $3.5 million: Per Episode Cost of Breaking Bad
  • Quotable Quotes:
    • @GammaCounter: "There are 400 billion trees in the Amazon River basin, close to the number of stars in the Milky Way galaxy." 
    • @rbranson: Virtualization has near-zero overhead, unless the VM spends most of its time copying between RAM and network… like memcached or haproxy.
    • @HackerNewsOnion: Programming is 1% inspiration, 99% trying to get your environment working.
    • @aneel: "roundtrips, not bandwidth, is now often the bottleneck for most applications"
    • @jamesurquhart: Not to mention the fact that auto-scaling should happen above IaaS layer. Think multi-cloud.
    • Sheref Mansy: A machine keeps sort of chugging away, without worrying about its environment. But a living system has to.
    • V.D. Veksler: it just came to my attention that Javascript v8 is faster than Python. I could not believe it, thought it might just be CPython.
    • Doron Rajwan: For the past 30 years, computer performance has been driven by Moore’s Law; from now on, it will be driven by Amdahl’s Law.
    • Bjarne Stroustrup: There are only two kinds of languages: the ones people complain about and the ones nobody uses.
  • Steve Souders and John Allspaw, the Laurel and Laurel of the DevPerfOps world, had a really good interview at Velocity. Some trends: mobile is huge; there's now a big focus on rendering performance; institutionalizing failure, meaning planning for failure and doing something with it, because failure is a friend, not a scary monster; don't panic, when there's a problem figure out what's going on first; humans and machines are buds, cooperative, not John Henry-like adversaries.
  • Forget the history of Kings and peoples. Here's a far more interesting history, the History of Packets. Cool look at how TCP has changed over time, a description of how packets work, and the history of the Internets. Kind of boring as there are no beheadings.
  • If you are yearning for a feeling of nostalgia then take an empathetic dip into Redecentralize.org, whose mission is to bring back the Internet to how it was, a decentralized commons uniting brave diginaughts. A wonderful vision, but you can never go home again.
  • Algorithms and Me is a new site with nice clear explanations with code of lots of common algorithms. Good job Jitendra Sangar. 
  • You know where those great volumes of useless unviewed images go when they die? Purgatory. Or the digital version at least: First Look: Facebook’s Oregon Cold Storage Facility. Do we really need to keep these? Can these poor souls move on or must they remain ghosts of our haunted past?
  • StorageMojo on Extending block storage to cloud scale: The architectural implications of cheap I/O continues to unfold. In combination with the benefits of massive scale – where reads/watt makes great sense – the opportunities to build efficient commodity-based infrastructures seem to multiply.
  • Doug Lea with a fascinating look at something we give little thought to, the Java Memory Model. The JSR-133 Cookbook for Compiler Writers. Amazing and intricate detail. This is the story behind the story.
  • Twilio on Getting the most out of HAProxy. It's not always the tool but how you configure the tool that's important. Twilio, as they move to a service-oriented architecture, shows their settings for getting the most out of HAProxy, which include log lines helpful for debugging, persistent connections with load balancing, and insight on health checks.
  • Funny Dilbert cartoon.
  • Do we need meds for this? People seem to hate Java, but we always seem to come back to it. I can just feel the self-loathing seething under the surface.
  • Etsy with a brutally honest, data-driven assessment of their site's performance. An excellent exercise for all companies to engage in regularly. Front-end performance is slipping as image and JavaScript download sizes continue to increase, which is an Internet-wide page weight obesity trend.
  • A good experiment. Performance Testing on Dedicated Hardware: We seek to prove the hypothesis that the Netty stack (Zuul-Netty) is able to produce a more stable, higher-throughput and lower-latency performance characteristic than the original Netflix Zuul implementation (Zuul-Tomcat). We posit that significant gains will be achieved by utilising non-blocking outbound IO and zero-copy buffer transfer.
  • A lot of good discussion about how different functionality gets sliced up in web applications. API first architecture or the fat vs thin server debate.
  • Neil Gunther's book of capacity planning aphorisms gathered into a Guerrilla Manual Online. Quite a large scope of topics, with opinions on the practice of best practices, when wrong is right, network performance, Little's law, art vs science, tyranny of the 9s, the universal scalability law, and a new entry on Data Science: Anything that has to call itself a "science," usually isn't. (Social Science?) How about Information Science?
  • No More Callbacks: 10,000 Actors, 10,000 Threads, 10,000 Spaceships. Pair the quite intuitive notion that everything parallel should run in its own parallelization abstraction with actors and channels and you get Quasar. The conclusion: the combination of a concurrent, parallelizing database with an actor system on top of lightweight threads gives us several huge advantages. Good Hacker News discussion.
  • Interesting. Multiple readers and writers concurrency tests: Various lock-based and lock-free algorithms tested for multiple readers and writers scenarios. The purpose is to show the performance of different approaches.
  • Well explained and something worth understanding. Why Registers Are Fast and RAM Is Slow. Factors are distance and cost. RAM is slow because there's a lot of it, which means it has to be cheaper per bit, which means it's slower.
  • Can't agree here. Code Generation Seems Like a Failure of Vision. Code generation is the ribosome to the DNA. Without a meta interpretive layer you are simply a dumb bunch of chemicals linked together.
  • Distributed Optimistic Concurrency Considered Optimistic: We conclude that current TM benchmarks are not appropriate workloads for a distributed system using optimistic concurrency.
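
A footnote on the Doron Rajwan quote above: Amdahl's Law says that if a fraction p of a job can run in parallel across n workers, the best possible speedup is 1 / ((1 - p) + p / n). Here's a minimal sketch of that arithmetic. The function names and the 95% parallel figure are illustrative assumptions, not taken from any of the linked articles.

```python
def amdahl_speedup(parallel_fraction: float, workers: int) -> float:
    """Speedup predicted by Amdahl's Law: 1 / ((1 - p) + p / n)."""
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if workers < 1:
        raise ValueError("workers must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / workers)


def max_speedup(parallel_fraction: float) -> float:
    """Upper bound on speedup as the worker count grows without limit."""
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if parallel_fraction == 1.0:
        return float("inf")  # no serial part, so no ceiling
    return 1.0 / (1.0 - parallel_fraction)


if __name__ == "__main__":
    # Hypothetical workload: 95% of the work parallelizes, 5% is serial.
    for n in (1, 2, 8, 64, 1024):
        print(f"{n:>5} workers -> {amdahl_speedup(0.95, n):.2f}x")
    print(f"ceiling      -> {max_speedup(0.95):.2f}x")
```

The point of the quote falls out of the ceiling: with a 5% serial portion, no number of cores gets you much past 20x, so the serial part is where the work is.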