Stuff The Internet Says On Scalability For October 2nd, 2015

Hey, it's HighScalability time:


Elon Musk's presentation of the Tesla Model X had more in common with a new iPhone event than a traditional car demo.
If you like Stuff The Internet Says On Scalability then please consider supporting me on Patreon.

  • 1.4 billion: Android devices; 1000: # of qubits in Google's new quantum computer; 150Gbps: Linux botnet DDoS attack; 3,000: iPhones sold per minute; smith: the most common last name in the US; 50%: storage reduction by using erasure coding in Hadoop; 101: calories burned during sex.

  • Quotable Quotes:
    • @peterseibel: How to be a 10x engineer: help ten other engineers be twice as good.
    • The Master Algorithm: Scientists make theories, and engineers make devices. Computer scientists make algorithms, which are both theories and devices
    • @immolations: Feudalism may not be perfect but it's the best system we've got. More of us have chainmail today than at any point in history
    • @mjpt777: We managed to transfer almost 10 GB/s worth of 1000 byte messages via Aeron IPC. That's more than a 100GigE network. Way to scale up on box!
    • @caitie: lol what my services do 1.5 billion writes per minute ~25 million writes per second
    • @mjpt777: Think of your QPI links in a multi-socket server as a fast network. Communicate to share memory; don't share memory to communicate.
    • @aalmiray: "you can't have a second CPU until you prove you can use the first one" - @mjpt777
    • Periscope: a hard drive is over 3x faster a than gigabit ethernet
    • thom: Any sufficiently complicated distributed architecture contains an ad hoc, informally-specified, bug-ridden, slow implementation of half of SOAP.
    • @dabeaz: Instead of teaching everyone how to code, I wish we'd just focus on getting everyone's curiosity from kindergarten back.
    • Matthew Jones: It's a Catch-22. We need the metrics to choose the best architecture, but we need to actually implement the damn thing in order to get metrics, and implementation requires us to select an architecture. 
    • @jmwind: Today we built Shopify 500 times, deployed to prod 22 times, peaked at 700 build agents, spun 50k docker containers in test and 25k prod.
    • antirez: Redis, especially using pipelining, can serve an impressive amount of requests per second per thread (half a million is a common figure with very intensive pipelining. Without pipelining it is around 100,000 ops/sec). 
    • @jcox92: This is my invitation to you to start using languages that were discovered rather than languages that were invented." #strangeloop
    • @tyler_treat: "Measuring latency at saturation is like looking at your bumper after wrapping your car around a pole." —@giltene
    • Fazal Majid: Caching, the part memcached is designed for, is very simple, it's cache invalidation that is hard, and memcached is not fit for purpose there.
    • Alex Chernyakhovsky: Just as Bitcoin provided the first decentralized append-only database supporting transactions, there are other novel systems and solutions that have interesting properties. We should work to extend Bitcoin with those advances as they mature.
    • DHH: Yes, investing in growth when you got a good thing going is smart. But so is thinking that you might currently be enjoying the very best years of the business, not just “the beginning of an amazing journey”.
    • Camille Fournier: The amount of overhead that goes into managing coordination of people cannot be overstated. It's so great that the industry invented microservices because we'd rather invest engineering headcount and dollars into software orchestration than force disparate engineering teams to work together.
    • Camille Fournier: If we got some number of features done this year with our current engineering staff, we will need ~20% more engineers next year to get the same number of features done.
    • @nicksieger: Coordinated systems with locking decrease throughput by 400x — Peter Bailis @ #strangeloop. Yes, distributed transactions are that bad.
    • @bytemeorg: "Holding locks is slow. In 1976, the disk was the slow thing. Today, the network is the slow thing." — @pbailis #strangeloop

  • Another example of the diffusion of the software ethos. Elon Musk's presentation of the Tesla Model X had more in common with a new iPhone event than a traditional car demo. First, it was a livecast that started a touch late. Second, throngs of fanpeople clapped and whooped in all the appropriate places. Gone are the beauty shots of cars simply meant to stroke the lizard brain. Elon hit the use cases. He talked vision statement. He talked safety specs and features. He talked air quality in depth. He didn't wait for iFixit to do a tear down, he showed construction details and how they reinforced features and quality. He showed how the Falcon Wing door auto opened and closed; how the doors worked in a crowded parking lot; and how the door design also allowed passengers to easily access the third row of seats. This focus on the car as an engineered product for solving tangible problems in real life may be the lasting legacy of Tesla. 

  • Tools are to programmers like shoes are to the mundane fashion world. Which is what makes this discussion of Why Fogbugz lost to Jira in the bug tool wars so fascinating. In one corner we have gecko with a nice analysis of the FogBugz side and we have carlfish with a quality response from the Atlassian perspective. It's painful to remember how convoluted product deployment was before software as a service. 

  • How does the CIA provide advanced state-of-the-art analytics? On Amazon of course. Amazon birthed the CIA their own region in 9 months. The CIA decided the only way to reach commercial parity was to to stop trying to do it themselves and leverage those who already know how to do it. The CIA will have its own private version of the marketplace so they can transition tools as fast as possible into the hands of analysts. The CIA really likes themselves some Spark. Partnering for expertise is something the CIA is trying to learn how to do. Oh, the CIA is hiring. 

  • Jeff Atwood has the sense of this. Learning to code is overrated: An accomplished programmer would rather his kids learn to read and reason. One caveat is understanding algorithms will be a necessary life skill now and certainly in the future. We'll need to see algorithms for what they are, biased tools that serve someone else's purpose. It's common even among the learned today to see algorithms as objective and benign. The easiest way of piercing the algorithm washing vale may be for people to learn a little programming. That may help demystify what's really going on.

  • Embrace, extend and extinguish. Amazon Will Ban Sale of Apple, Google Video-Streaming Devices. This kind of cross division strategy tax often marks the beginning of the end. Amazon is no longer an everything store. Once we begin to not think of going to Amazon First when shopping then we may transition to Amazon Maybe and then to Amazon Never. 

  • So building websites is OK again? Mobile browser traffic is 2X bigger than app traffic, and growing faster. Who should we believe? App. Web. App. Web.

  • What Makes Containers At Scale So Difficult? Problems like determining how many containers will fit on a VM; tracking the health of containers; finding particular containers; and upgrades. Mesos uses a two-level scheduling approach that yields linear scalability. 

  • Videos from the Strange Loop Conference are available on YouTube. Many interesting talks.

  • Why should you choose the Google Cloud Platform? Google makes their case: It supports Live Migration; 0-1m+ scaling load balancers; 45 second instance boot times; 680,000 IOPS sustained Local SSD read rate; 3 seconds to archive restore; it costs less. frakkingcylons likes: "One GCP offering that I think deserves more attention is Managed VMs (part of App Engine). You provide your app + runtime in a Docker container, and App Engine will run that container in a Compute Engine VM and do a bunch of cool stuff for like health checking, autoscaling, and log aggregation." jread finds GCP less reliable. benburton likes Nearline storage. cominatchu thinks Google Container Engine is clear winner over Amazon EC2 Container Service. Most people think service sucks in general.

  • The ability to toggle features in production is a common capability these days. Here's how Instagram uses their Gate Logic service "to control feature rollout in a safe and flexible way, with no performance loss." A nice write-up on a thoughtful solution. The gate logic is written in a Domain Specific Language in the form of an embedded compiled and typed language in Python. Zookeeper is used to distribute the rules.The rules are compiled down to native Python bytecode. Techniques like inlining, avoiding eval, and factoring out redundant operations are used to increased performance.

  • Is this a sign of the world working or failing? A New XPrize Tries to Turn CO2 Into Cold, Hard Cash. Maybe both. 

  • How Distributed Systems Respond to Degraded Hardware. Partial failures are the hardest faults to deal with. If something fails completely you usually know, but if something fails a little the results can be both dramatic and hard to detect. For example, at Facebook a slow node brought "the job completion rate slowed down from 172 jobs per hour to  1 job per hour." There are three failure modes: There’s operation limplock, when an operation is slow because some subpart of the operation is slow (e.g., a disk read is slow because the disk is degraded), node limplock, when a node is slow even for seemingly unrelated operations (e.g, a read from RAM is slow because a disk is degraded), and cluster limplock, where the entire cluster is slow (e.g., a single degraded disk makes an entire 1000 machine cluster slow).

  • It's markets all the way down 1. Netflix with a fascinating idea: Creating Your Own EC2 Spot Market: Currently over 15% of our EC2 footprint autoscales, and the majority of this usage is covered by reserved instances as we value the pricing and capacity benefits. The combination of these two factors have created an “internal spot market” that has a daily peak of over 12,000 unused instances. We have been steadily working on building an automated system that allows us to effectively utilize these troughs.


  • A riveting tale from the building of the Silk Road. These Are the Two Forgotten Architects of Silk Road.

  • Papers from the 25th ACM Symposium on Operating Systems Principles are available.

  • Netflix created Flux, a super cool way to visualize system health. The video looks a lot like a network of transporter streams. What it is: we decided to take advantage of the brain’s ability to process massive amounts of visual information in multiple dimensions, in parallel, visually. We call this tool Flux. In the home screen of Flux, we get a representation of all traffic coming into Netflix from the Internet, and being directed to one of our three AWS Regions. 

  • All optical computing may eventually see the light of day. New Memory Chips Store Data Not with Electricity, but with Light: "The basis of the technology is a so-called phase-change material. Light pulses can be used to switch the material between two distinct states—one in which the atoms are ordered, or crystalline, and one in which they are disordered, or amorphous. The researchers exploited this phenomenon to write and read information." And if light doesn't work, there's always carbon nanotubes.

  • When AWS went down did Netflix go down too? Of course not.

  • What is possible ways to prevent Node.js server from crashing when having too many active users on the site? Profiling is in order, but Edward Viaene gallantly answers~ track down memory leaks,  identify synchronous code, increase your nodejs worker processes.

  • A nice straightforward article on using pprof to optimize a Go program. A Pattern for Optimizing Go.

  • The Master Algorithm: Each of the five tribes of machine learning has its own master algorithm, a general-purpose learner that you can in principle use to discover knowledge from data in any domain. The symbolists’ master algorithm is inverse deduction, the connectionists’ is backpropagation, the evolutionaries’ is genetic programming, the Bayesians’ is Bayesian inference, and the analogizers’ is the support vector machine.

  • A collection of 17 free programming books for your perusal.

  • Networks are the limiting factor in executing analytics database queries. Understanding Distributed Analytics Databases, Part 1: Query Strategies shows how using "approximate count" is a huge win. The article goes into great detail about how HyperLogLog probabilistic counters can skip inter-node data transfers, support bucketing, and increase parallelization.

  • Do we form enclaves around tools/apps/brands/boards for the same reason we form neighborhoods?: The underlying, ultimate cause of neighborhood formation is that people in cities need, or want, to live their lives on a smaller scale than the entire city.

  • Good summary: 2015 Container Summit notes and learnings, Part 1 / Part 2

  • A straightforward example of using Cassandra to build a robust, linear scalable inventory management system using distributed systems. Scalable Inventory. Also, Scale it to Billions — What They Don’t Tell you in the Cassandra README

  • Building the Big Data Superhighway: One of the innovations we’ve created is called Flash I/O Network Appliances (FIONA) to answer the question: how do you effectively terminate a 10-gigabit/second optical fiber. Say I bring a 10-gigabit/sec optical fiber into your lab and say, ‘Well, here you go.’ The good news is this fiber will bring 1,000 times as much data per second into your lab as you were doing with the shared Internet. That’s the good news. The bad news is you’ve got 1,000-times as much data coming into your lab; 

  • Spinnaker: enables end-to-end global Continuous Delivery at Netflix: repeatable automated deployments captured as flexible pipelines and configurable pipeline stages; provide a global view across all the environments; offer programmatic configuration and execution via a consistent and reliable API; easy to configure, maintain, and extend; operationally resilient; facilitate innovation within its umbrella.

  • logswan:  fast Web log analyzer using probabilistic data structures. It is targeted at very large log files, typically APIs logs. It has constant memory usage regardless of the log file size, and takes approximatively 4MB of RAM.

  • Swarm: an isomorphic reactive M-of-MVC library that synchronizes objects in real-time and may work offline. Swarm is perfect for implementing collaboration or continuity features in Web and mobile apps. Swarm supports complex data types by relying on its op-based CRDT base.

  • Fast and Accurate Recurrent Neural Network Acoustic Models for Speech Recognition: In this paper, we present techniques that further improve performance of LSTM RNN acoustic models for large vocabulary speech recognition. We show that frame stacking and reduced frame rate lead to more accurate models and faster decoding. CD phone modeling leads to further improvements. We also present initial results for LSTM RNN models outputting words directly.

  • Kudu: Storage for Fast Analytics on Fast Data: Kudu is an open source storage engine for structured data which supports low-latency random access together with efficient analytical access patterns. Kudu distributes data using horizontal partitioning and replicates each partition using Raft consensus, providing low mean-time-to-recovery and low tail latencies.

  • Gryffin: a large scale web security scanning platform. It is not yet another scanner. It was written to solve two specific problems with existing scanners: coverage and scale.

  • Another technique from the real-time embedded world makes it into the mainstream. Fast Database Restarts at Facebook: In this paper, we show that using shared memory provides a simple, effective, fast, solution to upgrading servers. Our key observation is that we can decouple the memory lifetime from the process lifetime. When we shutdown a server for a planned upgrade, we know that the memory state is valid (unlike when a server shuts down unexpectedly). We can therefore use shared memory to preserve memory state from the old server process to the new process.