Stuff The Internet Says On Scalability For August 3, 2012

It's HighScalability Time:

  • Quotable Quotes:
    • Ross Turk: the tricks you learned to make things big are not the same tricks you can apply to make things infinite.
    • @gclaramunt: Son, I'm getting old, but let me tell you a secret: programming is hard, and high scalability and concurrent programming... frigging hard!
    • @Carnage4Life: At Apple the iOS team didn't see iPhone hardware or hardware team see OS until it shipped
    • @adrianco:  #ebspio caps iops but latency variance is much lower than EBS
    • @bernardgolden: RT @peakscale: A culture of automation is 10x more important than deployment/test/monkey thing you'd like to discuss < devops calling
    • @JayCollier: 50 years ago, school standardization was needed for scale. Now, scalability and flexibility (variability) can coexist. #FOL2012
    • @adrianco: Compared to vanilla EBS many times better for random reads. Bandwidth limits both for sequential and writes. #ebspio
    • @SQLPerfTips: More hardware won't solve response time problems. Proper indexing does.
    • @adrianco: I did some EBS Provisioned IOPS benchmarking. Excellent if you need lots of low latency random reads. #ebspio
    • @b6n: Can't wait for the new infinitely scalable, zero latency, totally consistent, transactional, distributed DB from @ryah and @antirez.
    • @theuncommonfan: if they can get 104 minutes out of "Where the Wild Things Are" they should be able 2 get 30 movies out of the Hobbit
  • Having just watched a Snark-themed Inspector Lewis episode on BBC, I could not help but read a Joel Spolsky posting on The Hunting of the Snark, not about finding murderers, but about finding unfriendly comments, which might actually lead to murder and make a good episode...
  • Rackspace takes one giant step for all cloud kind, announcing they are running completely on OpenStack. Nice to see competition. Also, Brian Aker on "Scaling OpenStack Technology. Lessons From The Field"
  • Superb explanations of How are bloom filters used in HBase? Lars George: In a very busy system using bloom filters with the matching update or read patterns can save a huge amount of IO. Nicolas Spiegelberg: BloomFilters provide a lightweight in-memory structure to reduce those N disk reads to only the files likely to contain that Row (N-B).  
  • Having fun with Redis Replication between Amazon and Rackspace. 3scale found crossing clouds over the Internet can be tricky. Network bandwidth became scarce when replicating fast Redis servers across the cloud. Compression fixed the problem.
  • Are there really any platforms? Good discussion with pg on Hacker News of the viability of Twitter vs Facebook vs iOS vs Android.
  • Great talk by Ross Turk: Ceph: The Future of Storage. Love the storage analogy for Ceph, a distributed object system: an infinite hotel could not work the same way as even a very, very big hotel. You couldn't have a centralized reservation system, for example. It would need to assign you a room by a deterministic placement algorithm, and intelligent nodes would then move you around to different nodes as needed, completely transparent to you. Fun way to look at storage.
  • In the best Internet tradition, everyone can learn a lesson or three from two excellent post-mortems by Boundary on some recent problems.
    • Streaming Failure Post-Mortem 7/31. Everything really is connected, and these are pretty typical problems for complex distributed systems: global server load balancing and DNS caching caused a thundering herd problem; this caused blocking connection attempts on the main data consumption threads, which caused starvation and OOM errors in queues; restarts caused expensive recovery operations, which caused some client downtime. Fixes: intelligent backoff logic, remove dead data, move connection code out of the main thread, intelligent load balancing, separate data plane from control plane.
    • Kobayashi Post-Mortem 7/31. A configuration change increasing open file limits caused intermittent ulimit kills. This partial failure cascaded to our streaming system and exposed a Riak bug in which a portable identity cluster can become fragmented. Fix: improve debugging by using tools to consolidate and visualize logs; more careful upgrades; improve mental model of how the system works and improve understanding of failure scenarios.
  • It's not so much the error as what happens after the error:
    • Hosting.com experienced a human powered power failure. Time to recovery after a failure seems to be the real problem. Power was on in 11 minutes. Databases, up to 5 hours later. 
    • Azure was down a few hours because of a misconfigured network device which "triggered previously unknown issues in another network device within that cluster."
  • Have Java garbage collection problems made you wistfully remember C++ corruption problems? What about RAM limitations? Or multi-core limitations? Java JVM wizards Azul Systems have open sourced what may be a solution: Zing JVM, which supports pauseless GC and other JVM goodness. The JVM has always been an incredible compromise as a system component. That may have changed.
  • Dilbert speaks truth to BigData
  • Spring and RabbitMQ – Behind India's 1.2 Billion Person Biometric Database. The architecture behind a few thousand core personal identity system using HDFS, HBase, Hive, Pig, Zookeeper, MySQL, SEDA, MongoDB, Solr, and GridGain.
  • Too many NoSQL databases. A fun Google Group thread for listing all the people who haven't built a NoSQL database.
  • A detailed HBase Replication Overview: The underlying principle of HBase replication is to replay all the transactions from the master to the slave. This is done by replaying the WALEdits (Write Ahead Log entries) in the WALs (Write Ahead Log) from the master cluster, as described in the next section. These WALEdits are sent to the slave cluster region servers, after filtering (whether a specific edit is scoped for replication or not) and shipping in a customized batch size (default is 64MB). 
  • Microsoft compared SPDY and HTTP performance: if one applies all the known optimizations to HTTP 1.1, then SPDY does not always have a significant performance advantage. We also find that for small web pages, the overhead of SSL handshakes can have a significant impact on SPDY’s performance.
  • How London Prepared For The First Mobile Olympics: Here’s a nice Moore’s Law scenario for you: 15 years ago it took 6-8 weeks and $1 million to test a web application with a load of 4000 concurrent users. And that was just for the software license – you’d still be on the hook for the servers. These days that same test, servers included, takes a couple of days and $1000. That’s one tenth of one percent of the cost 15 years ago.
  • Brad Hedlund talks data center networking design in Construct a Leaf Spine design with 40G or 10G? An observation in scaling the fabric. There are a lot of tradeoffs that have to be considered. It's not a simple matter of density. Great comment discussion as well.
  • Epic post by Paul Khuong on how Binary Search Is a Pathological Case for Caches: The slowdown caused by aliasing between cache lines when executing binary searches on vectors of (nearly-)power-of-two sizes is alarming. The ratio of runtimes between the classic binary search and the offseted quaternary search is on the order of two to ten, depending on the test case.
  • Processing a Trillion Cells per Mouse Click: combine the advantages of columnar data layout with other known techniques (such as using composite range partitions) and extensive algorithmic engineering on key data structures. The main goal of the latter being to reduce the main memory footprint and to increase the efficiency in processing typical user queries. In this combination we achieve large speed-ups. These enable a highly interactive Web UI where it is common that a single mouse click leads to processing a trillion values in the underlying dataset.
  • Hold onto your bananas, Netflix has released Chaos Monkey into the wild. Can your system really handle failures? This is how you find out.
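
For the curious, the Bloom filter item above (skipping store files that cannot contain a row) can be sketched in a few lines of Python. This is only an illustration of the general technique, not HBase's implementation: the BloomFilter class, the files_to_read helper, the file names, and the sizing (1024 bits, 3 hashes) are all assumptions made up for the example.

```python
import hashlib


class BloomFilter:
    """Tiny Bloom filter: may give false positives, never false negatives."""

    def __init__(self, num_bits=1024, num_hashes=3):
        # Sizing is an arbitrary assumption for this sketch.
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, key):
        # Derive num_hashes bit positions by salting a SHA-256 digest.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{key}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, key):
        for pos in self._positions(key):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, key):
        # True means "maybe present"; False means "definitely absent".
        return all(
            self.bits[pos // 8] & (1 << (pos % 8))
            for pos in self._positions(key)
        )


def files_to_read(row_key, store_files):
    """Return the names of files whose filter cannot rule out row_key."""
    return [
        name for name, bloom in store_files.items()
        if bloom.might_contain(row_key)
    ]


def build_example():
    # Hypothetical store files, each holding a few row keys.
    contents = {
        "file_a": ["row1", "row2"],
        "file_b": ["row3"],
        "file_c": ["row4", "row5"],
    }
    store_files = {}
    for name, rows in contents.items():
        bloom = BloomFilter()
        for row in rows:
            bloom.add(row)
        store_files[name] = bloom
    return store_files


if __name__ == "__main__":
    print(files_to_read("row3", build_example()))
```

The in-memory check is cheap, so only the files that answer "maybe" cost a disk read. A false positive means one wasted read; a file that really holds the row is never skipped.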

This week's selection: