Stuff The Internet Says On Scalability For March 30, 2012

Choosy Mothers Choose HighScalability:

  • Quotable quotes:
    • @itarradellas: "Revolutions in science have often been preceded by revolutions in measurement" 
    • @jasongorman: Use dependency injection, not Spring. Use event-driven, asynchronous I/O, not Node.js. Use MVC, not ASP.NET MVC etc etc
    • @bernardgolden: #netflix uses most aggressive #aws reservation system. Gets pricing down to ~33% of "list" pricing.
    • @ikarzali: Hey, for all facebook's talk at scalability conferences, I have to say Timeline is super slow(!) Howz that memcache workin out for you now?
    • Yahoo!: Amazon's Game-Changing Cloud Was Built By Some Guys In South Africa
    • Foursquare: 1.5 billion check-ins from 15 million people at 30 million different places.
  • How OMGPOP scaled to 36 million users in three weeks. Draw Something has been downloaded 35+ million times; 1 billion pictures created at 3,000 pictures per second; Couchbase is used as the database; SoftLayer is their cloud provider, supplying tens of nodes for tens of thousands of operations per second; no downtime; $200 million acquisition after 7 weeks.
  • Chris Dixon got it 2/3rds right with Give away the diagnostic, sell the remedy. The most profitable part was missing: create the problem for which you give away a diagnostic that detects the problem for which you sell the remedy.
  • Period Pain, Period Pain part 2, Period Pain 3: Colm with a timely series of articles on the maddening nature of time, specifically how scheduled operations synchronize and cause havoc. 
  • The Game of Distributed Systems Programming. Which Level Are You? Rslootma created a Maslow's hierarchy of needs for distributed systems: Level 0: Clueless; Level 1: RPC; Level 2: Distributed Algorithms + Asynchronous messaging + Language support; Level 3: Distributed Algorithms + Asynchronous messaging + Purity; Level 4: Solid domination of distributed systems: happiness, peace of mind and a good night's rest. Self-actualization can be yours, but of course, you have to ignore systems of levels to get there.
  • Subsecond Offset Heat Maps. Tremendous article by Brendan Gregg showing how to use heat maps of various performance metrics to track down performance problems in a way I've never seen before. Quite impressive.
  • MDCC: Multi-Data Center Consistency: the first optimistic commit protocol that does not require a master or partitioning and is strongly consistent at a cost similar to eventually consistent protocols. MDCC can commit transactions in a single round-trip across data centers in the normal operational case.
  • The Synchronization of Periodic Routing Messages: The paper considers a network with many apparently-independent periodic processes and discusses one method by which these processes can inadvertently become synchronized.
  • Power Management of Online Data-Intensive Services. James Hamilton with great coverage of two papers on power management. Power Provisioning for a Warehouse-sized Computer: we should oversell power, the most valuable resource in a data center. Just as airlines oversell seats, their key revenue-producing asset, datacenter operators should oversell power. Power Management of Online Data-Intensive Services: in contrast to other server workloads, for which idle low-power modes have shown great promise, for OLDI (Online Data-Intensive) workloads we find that energy-proportionality with acceptable query latency can only be achieved using coordinated, full-system active low-power modes.
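
The synchronization problem raised in the Period Pain series and in The Synchronization of Periodic Routing Messages is often mitigated by adding random jitter to each interval, so that hosts starting in lockstep drift apart. A minimal Python sketch of that idea follows; the helper name `next_run_times`, its parameters, and the uniform jitter range are assumptions made for illustration and are not taken from either source.

```python
import random


def next_run_times(start, period, count, jitter_fraction=0.0, rng=None):
    """Return `count` run times following `start` (hypothetical helper).

    Each interval is `period` seconds plus a uniform random offset in
    [-jitter_fraction * period, +jitter_fraction * period].
    With jitter_fraction=0.0 the schedule is strictly periodic.
    """
    rng = rng or random.Random()
    times = []
    t = start
    for _ in range(count):
        # Draw a fresh offset for every interval, not once per host,
        # so schedules keep drifting instead of staying phase-locked.
        offset = rng.uniform(-jitter_fraction, jitter_fraction) * period
        t += period + offset
        times.append(t)
    return times


if __name__ == "__main__":
    # Two hosts that both start at t=0 with a 60-second period.
    # Without jitter they fire at identical instants forever.
    print(next_run_times(0, 60, 3))
    print(next_run_times(0, 60, 3))

    # With jitter, and a different random seed per host, the run times diverge.
    print(next_run_times(0, 60, 3, jitter_fraction=0.5, rng=random.Random(1)))
    print(next_run_times(0, 60, 3, jitter_fraction=0.5, rng=random.Random(2)))
```

The jitter width is a trade-off: a wider range spreads load more evenly across hosts but makes any single host's schedule less predictable.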