Stuff The Internet Says On Scalability For January 23rd, 2015

Hey, it's HighScalability time:


Elon Musk: The universe is really, really big  [Gigapixels of Andromeda [4K]]

  • 90: the new 50 for women designers; $656.8 million: 3 months of Uber payouts; $10 billion: all it takes to build the Internet in space; 1 billion: registered WeChat users
  • Quotable Quotes:
    • @antirez: Tech stacks, more replaceable than ever: hardware is better, startups get $$ (few nodes + or - who cares), alternatives countless.
    • Olivio Sarikas: If every Star in this Image was a 2 millimeter grain of sand you would end up with 1110 kg of Sand!!!!!!!!!
    • Chad Cipoletti: In even simpler terms, we see brands as people.
    • @timoreilly: Love it: “We need a stack, not a pile” says @michalmigurski.
    • @neha: I would be very happy to never again see a distributed systems paper eval on a workload that would fit on one machine.
    • @etherealmind: OH: "oh yeah, the extra 4 PB of storage is being installed today. It's about 4 racks of gear".
    • @lintool: Andrew Moore: Google's ecommerce platform ingests 100K-200K events per second continuously. 

  • Programming as myth building. Myths to Live By: The true symbol does not merely point to something else. It contains in itself a structure which awakens our consciousness to a new awareness of the inner meaning of life and of reality itself. A true symbol takes us to the center of the circle, not to another point on the circumference.

  • Not shocking at all: "We found the majority of catastrophic failures could easily have been prevented by performing simple testing on error handling code...A majority (77%) of the failures require more than one input event to manifest, but most of the failures (90%) require no more than 3." Really, who has the time? More on human nature in Simple Testing Can Prevent Most Critical Failures: An Analysis of Production Failures in Distributed Data-Intensive Systems.
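The paper's point is cheap to demonstrate: inject a failure and assert on what the handler actually does. A hypothetical sketch (save_record, failing_write, and the log format are invented for illustration, not from the paper):

```python
def save_record(record, write, log):
    """Write a record; on failure the error handler must leave the
    system in a defined, degraded state instead of aborting -- the
    untested-handler pattern the paper ties to real outages."""
    try:
        write(record)
        return True
    except IOError as err:
        log("write failed, queued for retry: %s" % err)
        return False

# The simple test the paper argues for: force the error path and
# check the handler's behavior, not just the happy path.
def failing_write(record):
    raise IOError("disk full")

logged = []
handled = save_record({"id": 1}, failing_write, logged.append)
```

Here `handled` comes back False and the failure is logged rather than crashing the caller; the whole test is a few lines, which is exactly the paper's complaint about the outages it studied.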

  • Let simplicity fail before climbing the complexity ladder. Scalability! But at what COST?: "Big data systems may scale well, but this can often be just because they introduce a lot of overhead. Rather than making your computation go faster, the systems introduce substantial overheads which can require large compute clusters just to bring under control. In many cases, you’d be better off running the same computation on your laptop." But notice the kicker: "it took some work for parallel union-find." Replacing smart work with brute force is often the greater win. What are a few machine cycles between friends?
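The "parallel union-find" aside refers to the classic disjoint-set structure behind the paper's fast single-threaded connectivity numbers. A minimal sketch, with path compression and union by size (the tiny edge list is illustrative):

```python
class UnionFind:
    """Disjoint-set forest with path compression and union by size."""
    def __init__(self, n):
        self.parent = list(range(n))
        self.size = [1] * n

    def find(self, x):
        root = x
        while self.parent[root] != root:
            root = self.parent[root]
        while self.parent[x] != root:        # path compression
            self.parent[x], x = root, self.parent[x]
        return root

    def union(self, a, b):
        ra, rb = self.find(a), self.find(b)
        if ra == rb:
            return
        if self.size[ra] < self.size[rb]:    # union by size
            ra, rb = rb, ra
        self.parent[rb] = ra
        self.size[ra] += self.size[rb]

# Connected components in one single-threaded pass over an edge list.
edges = [(0, 1), (1, 2), (3, 4)]
uf = UnionFind(5)
for a, b in edges:
    uf.union(a, b)
components = len({uf.find(i) for i in range(5)})
```

One pass, near-constant amortized cost per edge, no cluster required: the brute force the COST paper is defending.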

  • Programming is the ultimate team sport, so Why are Some Teams Smarter Than Others? The smartest teams were distinguished by three characteristics. First, their members contributed more equally to the team’s discussions. Second, their members were better at reading complex emotional states. Third, teams with more women outperformed teams with more men.

  • WhatsApp doesn't understand the web. Interesting design and discussions. Using proprietary Chrome APIs is a tough call, but this is more perplexing: "Your phone needs to stay connected to the internet for our web client to work." Is this for consistency reasons? To make sure the phone and the web stay in sync? Is it for monetization reasons? It does create a closed proxy that effectively prevents monetization leaks. It's tough to judge a solution without understanding the requirements, but there must be something compelling to impose so many limitations.

  • Does the recent popularity of machine learning mean there's a "rise of the scientific programmer"? Ali Kheyrollahi seems to think so. Future of Programming - Rise of the Scientific Programmer (and fall of the craftsman). Not so sure. Haven't we always had crisp clean algorithms wrapped in the riddle that is the enigma that is a complete software system? 

  • The Tail at Scale: Just as fault-tolerant computing aims to create a reliable whole out of less-reliable parts, large online services need to create a predictably responsive whole out of less-predictable parts; we refer to such systems as “latency tail-tolerant,” or simply “tail-tolerant.” The techniques: hedged requests, tied requests, micro-partitions, selectively increased replication factors, putting slow machines on probation, ‘good enough’ responses, and canary requests.
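Of these, hedged requests are the easiest to sketch: send to one replica, and only if it hasn't answered within, say, the 95th-percentile latency, fire a backup request and take whichever finishes first. A minimal Python illustration (the replica callables, timings, and the 0.05 s default are hypothetical stand-ins for RPC stubs and a measured latency percentile):

```python
import concurrent.futures as cf
import time

def hedged_request(replicas, request, hedge_after=0.05):
    """Send to the first replica; if no answer within `hedge_after`
    seconds, fire a backup request to the second replica and return
    whichever response arrives first."""
    with cf.ThreadPoolExecutor(max_workers=2) as pool:
        futures = [pool.submit(replicas[0], request)]
        done, _ = cf.wait(futures, timeout=hedge_after)
        if not done:                              # primary is slow: hedge
            futures.append(pool.submit(replicas[1], request))
        done, _ = cf.wait(futures, return_when=cf.FIRST_COMPLETED)
        return done.pop().result()

# Hypothetical replicas for illustration: one stalled, one healthy.
def slow_replica(req):
    time.sleep(0.2)
    return "slow:" + req

def fast_replica(req):
    return "fast:" + req

result = hedged_request([slow_replica, fast_replica], "q", hedge_after=0.01)
```

The design point from the paper: because the hedge only fires after the latency percentile has elapsed, the extra load is a few percent while the tail collapses toward the median.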

  • Roman Leventov's analysis of Redis data structures, in which Salvatore 'antirez' Sanfilippo addresses, point by point, criticisms of Redis' implementation. People love Redis, and part of that love has to come from what a good guy antirez is. Here he doesn't go all black diamond alpha nerd in the face of a challenge. He admits where things can be improved. He explains design decisions in detail. He advances the discussion with grace, humility, and smarts. A worthy model to emulate.

  • A very thorough introduction to Espresso: LinkedIn's online, distributed, fault-tolerant NoSQL database that currently powers approximately 30 LinkedIn applications, including Member Profile, InMail, portions of the Homepage, and mobile applications. An impressive set of features. 

  • A Simple Performance Comparison of HTTPS, SPDY and HTTP/2: HTTP/2 is likely to provide significant performance advantages compared to raw HTTPS and even SPDY. 

  • SQL, Scaling, and What's Unique About Postgres: PostgreSQL's extensible architecture puts it in a unique place for scaling out SQL and also for adapting to evolving hardware trends. It could just be that the monolithic SQL database is dying. If so, long live Postgres!

  • IBM took the paper down, but this article gives a taste: IBM Reveals Proof of Concept for Blockchain-Powered Internet of Things. I can't see putting a paid toll road between all the devices in the world, but this sounds like a good thing: All this is achieved without a central controller orchestrating or mediating between these devices.

  • Mark Smith with an excellent mid-level tutorial on Building Services in Go.

  • Such a simple thing. Starting Off on the Wrong Foot: The “SpeedIndex Tax” of Mobile Redirects: This is a simplified analysis, but from this data, it appears that nearly 50% of all mobile sites kick off the critical rendering path with a 301 or 302 redirect. These sites are generally smaller (based on KB and requests), but the SpeedIndex of these sites is ~2200-3000 higher when compared to similarly sized sites without a redirect. 

  • Yes he does. Robert Sapolsky Rocks. If you have not yet had the pleasure of reading Sapolsky then I'm jealous of what you have in store. He's one brilliant dude. Even if you've read Sapolsky before, there's likely something new for you too.

  • It always comes down to risk management. Why you should not “store” your sessions in memcached. I like this from cratermoon: It really depends on the use case for the sessions and expectations. At some point the user's session will, and SHOULD, timeout anyway. Proper sizing and configuration of memcache and resilient interfaces will reduce the chance of random session loss.
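The "resilient interfaces" point can be sketched as cache-aside: treat the session cache as disposable, and let a miss degrade to the durable source of truth (or a forced re-login) instead of an error. The SessionStore class and dict stand-ins below are hypothetical, not from the article:

```python
class SessionStore:
    """Cache-aside sessions: memcached is disposable; a miss falls
    back to the durable source of truth instead of failing the user."""
    def __init__(self, cache, durable):
        self.cache = cache      # fast, may evict or restart at any time
        self.durable = durable  # database, or a forced re-login path

    def get(self, session_id):
        session = self.cache.get(session_id)
        if session is None:                       # evicted or lost
            session = self.durable.get(session_id)
            if session is not None:
                self.cache[session_id] = session  # repopulate the cache
        return session

# Dict stand-ins for a memcached client and a backing store.
cache = {}
durable = {"s1": {"user": "alice"}}
store = SessionStore(cache, durable)
```

With this shape, losing the entire memcached tier costs a burst of slower lookups, not a logged-out user base, which is the risk trade the comment is describing.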

  • A great general introduction, Using Containers to Build a Microservices Architecture, but I'd like to see more specifics on the architectural implications; we're missing that part of the story.

  • EXTREME OPENSTACK: SCALE TESTING OPENSTACK MESSAGING: So in summary, RabbitMQ still remains the de facto choice for messaging in an Ubuntu OpenStack Cloud; it scales vertically very well – add more CPU and memory to your server and you can deal with a larger cloud – and benefits from fast storage.

  • Tom's Hardware performance tests USB 3.1: enables sequential reads in excess of 700 MB/s...Write performance isn’t far behind...Random I/O isn’t nearly as impressive...You’ll see reads approaching 7400 IOPS in 4KB reads at a queue depth of one.

  • A Unifying Theory for Scaling Laws of Human Populations: The spatial distribution of people exhibits clustering across a wide range of scales, from household (10^-2 km) to continental (10^4 km) scales. Empirical data indicates simple power-law scalings for the size distribution of cities (known as Zipf’s law), the geographic distribution of friends, and the population density fluctuations as a function of scale. We derive a simple statistical model that explains all of these scaling laws based on a single unifying principle involving the random spatial growth of clusters of people on all scales. The model makes important new predictions for the spread of diseases and other social phenomena.
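Zipf's law for cities says the r-th largest city has population roughly proportional to 1/r, i.e. a slope near -1 on a log-log rank-size plot. A quick sketch of that check (the idealized populations are illustrative, not the paper's data):

```python
import math

def zipf_exponent(populations):
    """Least-squares fit of log(size) = a - s*log(rank); Zipf's law
    predicts s close to 1 for city populations."""
    sizes = sorted(populations, reverse=True)
    xs = [math.log(rank) for rank in range(1, len(sizes) + 1)]
    ys = [math.log(size) for size in sizes]
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    return -slope

# Idealized Zipfian city sizes: population proportional to 1/rank.
cities = [1_000_000 / rank for rank in range(1, 101)]
```

On real census data the fitted exponent hovers near 1; the paper's contribution is a single random-cluster-growth model that yields this and the other power laws at once.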