Stuff The Internet Says On Scalability For May 3, 2013

Hey, it's HighScalability time:


(Giant Hurricane on Saturn, here's one in New Orleans)

  • 1,966,080 cores: Time Warp synchronization protocol using up to 7.8M MPI tasks on 1,966,080 cores of the Sequoia Blue Gene/Q supercomputer system. 33 trillion events processed in 65 seconds yielding a peak event-rate in excess of 504 billion events/second using 120 racks of Sequoia.
  • Quotable Quotes:
    • Thad Starner: once accessing a device takes longer than 2s, its actual usage decreases exponentially. Thus, he claimed that a wristwatch interface, always sitting on one's wrist ready to use, should be more successful than mobile phones, which have to be pulled out of the pocket.
    • @joedevon: We came for scalability but we stayed for agility #NoSQL
    • @jahmailay: "Our user base is exploding. I really wish we spent more time on scalability instead of features customers don't use." - Everybody, always.
    • @bsletten: I don’t think it is a coincidence that the words eval() and evil are so close.
    • @RCSecure: Maybe Gov should stop deploying crappy #CyberSecurity instead of Surveiling Citizens
    • @davidpav: "This is what Netflix does - after each deployment creates AMI for faster scaling up"
    • @franzgranlund: Rewrote my little batch-processing application using #akka . 20% performance increase just like that - and now it is easier to scale.
    • @marshray: Ouch, that's kind of dismal. Perhaps we need a new term: "eventual scalability"
    • @adrianco: RT @rbranson: @cscotta load average is the worst thing ever. Slowly trying to evangelize it's demise as a reasonable metric. < +1 every 15 m

  • MIT Tech Review picks 10 breakthrough technologies: Smart Watches (really?), Memory implants (deciphering the code by which the brain forms long-term memories), Additive manufacturing (3-D printing), Supergrids (finally says Edison, DC powergrids), Temporary social media (sigh), Prenatal DNA sequencing (great for full lifecycle ad targeting), Baxter (compliant robots), Deep Learning (the singularity is near), Ultra-Efficient Solar Power (now we are talking). Prediction: We'll laugh at all this filter control talk once we have all of Google's datacenters and knowledge graph software implanted in our heads.

  • IBM on making movies using atoms as pixels. Characterization was a little thin but the plot was magnetic.

  • Lesson from Airbnb: Give yourself permission to experiment with non-scalable changes. Building better is better than building bigger.

  • Here's a short review by me of CyberStorm by Matthew Mather. Matthew is also the author of the most excellent Atopia Chronicles, a sprawling exploration of "artificial intelligence, distributed computing, nanotechnology, and the full range of humanity." CyberStorm is a chilling blow-by-blow account of what could happen in a real cyber attack. As a programmer, I found the implied idea of a kind of Crisis OS built on a mesh of smartphones most fascinating. Not much seems to be done in this area and even the how-to of writing such applications is rarely discussed. Could be interesting.

  • Now that's not a scalable business model for artists: "Galaxie 500's 'Tugboat' was played 7,800 times on Pandora in the first quarter of 2012, for which its three songwriters were paid a collective total of 21 cents, or seven cents each."

  • Graph: Faster Abstractions for Structured Computation. Example of optimizing a clearly stated graph computation with not 30X, but rather 20% overhead, though it took a lot of work.

  • Visualizing IP Packets with Legos. Classic.

  • Moot talks about how they use Redis as their primary database, not just for caching: We decided to design our Redis store to be split amongst many different Redis clusters from the beginning. We hash and split data into shards that contain all the relevant structures for that segment of data. The data is heavily sharded from the beginning and we can create more shards as necessary quickly and simply.
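
The hash-and-split idea can be sketched in a few lines of Python. This is only an illustration, not Moot's actual scheme: the shard names, the `segment_id` concept, and the key layout are all assumptions made up for the example, and no Redis client is involved.

```python
import hashlib

# Hypothetical shard names; the post does not say how Moot names or counts its clusters.
SHARDS = ["redis-shard-0", "redis-shard-1", "redis-shard-2", "redis-shard-3"]


def shard_for(segment_id, shards=SHARDS):
    """Map a segment of data to one shard using a stable hash of its id."""
    digest = hashlib.sha1(segment_id.encode("utf-8")).digest()
    # Use the first 8 bytes of the digest as an integer, then pick a shard by modulo.
    index = int.from_bytes(digest[:8], "big") % len(shards)
    return shards[index]


def placement(segment_id):
    """Return the shard plus the (hypothetical) keys stored together for a segment.

    Every structure for a segment is keyed by the same segment id, so all of
    them land on the same shard.
    """
    return {
        "shard": shard_for(segment_id),
        "keys": [f"{segment_id}:comments", f"{segment_id}:votes"],
    }
```

Plain modulo hashing remaps most segments when the shard count changes, so a system that adds shards "quickly and simply" would more likely keep a lookup table or use consistent hashing. That part is a guess; the post does not say which.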

  • Dan Rayburn on why cable TV will not be replaced by Internet streaming...because it sucks: "of the 22.6 billion streams Conviva monitored in 2012, 60% of them had quality issues. 60%!"

  • Security corner: great plain-English explanation of a rainbow table; even better series of posts by John Graham-Cumming on one-way functions.

  • Google Tech Talks with a massive upload of videos on testing, many having to do with the automation and scaling of tests.

  • Jeff Darcy with a thoughtful list of ideas on How To Build a Distributed Filesystem and a good review of what he liked at FAST’13.

  • If REST APIs are too chatty then why use REST in the first place? Fronting them with an aggregation layer seems quite complicated. Every complex evolving API eventually converges on "Any Doit(Any)" as its real API.

  • Srihari Srinivasan has created Systems We Make to curate interesting distributed systems developed in both academia and the industry. Looks like a good source.

  • Now that's out of orbit thinking...NASA Uses Smartphones As Satellites: Smartphones have more than 100 times the computing power of satellites, including fast processors, multiple sensors, high-resolution cameras, GPS receivers and radios.

  • Interesting: Customers who spend less than $50K per year make up the largest group of AWS users, yet account for only 4 percent of total spend.

  • You are what you measure says David Crawford in his excellent explanation of Calculating rolling cohort retention with SQL: A lot of hard work for a small set of numbers, but those numbers are the life blood of your company. For BigData he also recommends Grab first, structure later

  • Azure bigger than you might have thought: Azure is Microsoft's billion-dollar baby – maybe

  • The long and lovingly told story of scaling Viki, a video site focused on international content and community-driven subtitle translations. Some lessons: CLOBs in a database are rarely ever a good thing. Pride is a horrible and disgusting human trait. Having fewer or even no tests is probably better than having really slow tests and definitely better than having flaky tests. I’m too prideful. Inspecting code which makes use of an ORM is likely to reveal a SELECT N+1. A friend is someone who deserves your absolute honesty; a lover should, occasionally, receive the mercy of a white lie.

  • In VIRTUAL APPLIANCE PERFORMANCE IS BECOMING A NON-ISSUE Ivan Pepelnjak declares the virtualization tax something modern CPUs can easily pay: In a year or two, we’ll have plenty of software solutions and/or generic x86 hardware platforms capable of running very high speed virtual appliances. I would strongly recommend considering that in your planning and purchasing process. 

  • Not sure if this is techno child abuse, Daddy, what's a stream?: If you don't program a way for the disk source to feel the back pressure from the slow mobile connection, it will read the data full-speed and flood the server's memory by buffering everything. This is bad for servers with lots of clients and/or large media files.
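
The back-pressure point carries over to other runtimes. Here is a minimal Python analogue (not the streams API from the linked article): a bounded `asyncio.Queue` sits between a fast "disk" producer and a slow "mobile" consumer, so the producer has to wait whenever the buffer is full. The function names, chunk sizes, and delay are assumptions chosen for the demo.

```python
import asyncio


async def disk_source(queue, chunks):
    """Fast producer: 'reads' chunks as quickly as the queue allows."""
    for chunk in chunks:
        # put() suspends while the queue is full; this is the back pressure.
        await queue.put(chunk)
    await queue.put(None)  # sentinel: end of stream


async def slow_client(queue, received, peak, delay=0.001):
    """Slow consumer: simulates a mobile connection draining the buffer."""
    while True:
        # Track the largest number of chunks ever buffered at once.
        peak[0] = max(peak[0], queue.qsize())
        chunk = await queue.get()
        if chunk is None:
            break
        await asyncio.sleep(delay)  # pretend the network write is slow
        received.append(chunk)


async def stream(chunks, max_buffered=4):
    queue = asyncio.Queue(maxsize=max_buffered)
    received, peak = [], [0]
    await asyncio.gather(
        disk_source(queue, chunks),
        slow_client(queue, received, peak),
    )
    return received, peak[0]


def run_demo(n_chunks=50, max_buffered=4):
    """Stream n_chunks 1 KB chunks; return (chunks delivered, peak buffered)."""
    chunks = [bytes([i % 256]) * 1024 for i in range(n_chunks)]
    received, peak = asyncio.run(stream(chunks, max_buffered))
    return len(received), peak
```

With `maxsize` set, the buffer never holds more than `max_buffered` chunks no matter how large the file is. With an unbounded queue the producer would finish immediately and everything would sit in memory, which is the flooding the quote warns about.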

  • Time may be real or an illusion, but as Ntop explains, you really do need Sub-microsecond Packet Timestamps or you can't detect micro-burst problems on the new faster interfaces.

  • Status Code is a weekly e-mail digest for programmers that looks like a low-spam, high-quality source of interesting links.

  • LinkedIn explains The technology behind EatIn: Android apps in Scala, iOS apps, and Play Framework web services. Potentially a good model if you are trying to figure out what you should do.

  • Using Vector Interfaces to Deliver Millions of IOPS from a Networked Key-value Storage Server: To address this increasing software-I/O gap, we propose using vector interfaces in high-performance networked systems. Vector interfaces organize requests and computation in a distributed system into collections of similar but independent units of work, thereby providing opportunities to amortize and eliminate the redundant work common in many high-performance systems. By integrating vector interfaces into storage and RPC components, we demonstrate that a single key-value storage server can provide 1.6 million requests per second with a median latency below one millisecond, over fourteen times greater than the same software absent the use of vector interfaces.
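
To make the amortization idea concrete, here is a toy Python sketch. It is not the paper's implementation: the `KVStore` class, the `multiget` name, and the "fixed cost" counter are assumptions that stand in for per-request overhead such as RPC framing or a system call.

```python
class KVStore:
    """Toy in-memory key-value store that counts per-call fixed overhead."""

    def __init__(self, data):
        self._data = dict(data)
        # Stand-in for work paid once per request regardless of its size
        # (e.g. RPC framing, a syscall, a device submission).
        self.fixed_cost_paid = 0

    def get(self, key):
        """Scalar interface: one key per call, one fixed cost per key."""
        self.fixed_cost_paid += 1
        return self._data.get(key)

    def multiget(self, keys):
        """Vector interface: many similar lookups share one fixed cost."""
        self.fixed_cost_paid += 1
        return [self._data.get(k) for k in keys]


def compare(n=100):
    """Fetch the same n keys both ways; return (scalar_cost, vector_cost, same_results)."""
    data = {f"k{i}": i for i in range(n)}
    keys = list(data)

    scalar = KVStore(data)
    scalar_results = [scalar.get(k) for k in keys]

    vector = KVStore(data)
    vector_results = vector.multiget(keys)

    return scalar.fixed_cost_paid, vector.fixed_cost_paid, scalar_results == vector_results
```

The results are identical either way; only the number of times the fixed overhead is paid changes. How much that matters in a real server depends on how large the per-request cost is relative to the lookup itself.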

  • In Search of an Understandable Consensus Algorithm: Raft is a consensus algorithm for managing a replicated log. It produces a result equivalent to Paxos, and it is as efficient as Paxos, but its structure is different from Paxos; this makes Raft more understandable than Paxos and also provides a better foundation for building practical systems. 

  • SSMalloc: A Low-latency, Locality-conscious Memory: This paper presents a new memory allocator that provides low-latency and locality-conscious memory management with stable performance scalability even with a large number of application threads. The key design decisions underlying SSMalloc include: 1) providing low and predictable latency for memory management operations through carefully minimized critical path; 2) minimizing mmap system calls that might be contended in kernel; 3) adopting lock-free and mostly wait-free algorithms.

  • Greg Linden with a nice shiny new set of Quick Links.