Stuff The Internet Says On Scalability For March 30, 2012

High Scalability

30 Mar 2012

Choosy Mothers Choose HighScalability:

Quotable quotes:
- @itarradellas: "Revolutions in science have often been preceded by revolutions in measurement"
- @jasongorman: Use dependency injection, not Spring. Use event-driven, asynchronous I/O, not Node.js. Use MVC, not ASP.NET MVC etc etc
- @bernardgolden: #netflix uses most aggressive #aws reservation system. Gets pricing down to ~33% of "list" pricing.
- @ikarzali: Hey, for all facebook's talk at scalability conferences, I have to say Timeline is super slow(!) Howz that memcache workin out for you now?
- Yahoo!: Amazon's Game-Changing Cloud Was Built By Some Guys In South Africa
- Foursquare: 1.5 billion check-ins from 15 million people at 30 million different places.
How OMGPOP scaled to 36 million users in three weeks. Draw Something has been downloaded 35+ million times; 1 billion pictures created at 3,000 pictures per second; Couchbase is used as the database; SoftLayer is their cloud provider, supplying tens of nodes that handle tens of thousands of operations per second; no downtime; $200 million acquisition after 7 weeks.
Chris Dixon got it 2/3rds right with Give away the diagnostic, sell the remedy. The most profitable part was missing: create the problem for which you give away a diagnostic that detects the problem for which you sell the remedy.
Period Pain, Period Pain part 2, Period Pain 3: Colm with a timely series of articles on the maddening nature of time, specifically how scheduled operations synchronize and cause havoc.
The Game of Distributed Systems Programming. Which Level Are You? Rslootma created a Maslow's hierarchy of needs for distributed systems: Level 0: Clueless; Level 1: RPC; Level 2: Distributed Algorithms + Asynchronous messaging + Language support; Level 3: Distributed Algorithms + Asynchronous messaging + Purity; Level 4: Solid domination of distributed systems: happiness, peace of mind and a good night’s rest. Self-actualization can be yours, but of course, you have to ignore systems of levels to get there.
Subsecond Offset Heat Maps. Tremendous article by Brendan Gregg showing how to use heat maps of various performance metrics to track down performance problems in a way I've never seen before. Quite impressive.
MDCC: Multi-Data Center Consistency: the first optimistic commit protocol that does not require a master or partitioning and is strongly consistent at a cost similar to eventually consistent protocols. MDCC can commit transactions in a single round-trip across data centers in the normal operational case.
The Synchronization of Periodic Routing Messages: The paper considers a network with many apparently-independent periodic processes and discusses one method by which these processes can inadvertently become synchronized.
Power Management of Online Data-Intensive Services. James Hamilton with great coverage of two power management papers: Power Provisioning for a Warehouse-sized Computer - we should oversell power, the most valuable resource in a data center. Just as airlines oversell seats, their key revenue-producing asset, datacenter operators should oversell power; Power Management of Online Data-Intensive Services - In contrast to other server workloads, for which idle low-power modes have shown great promise, for OLDI (Online Data-Intensive) workloads we find that energy-proportionality with acceptable query latency can only be achieved using coordinated, full-system active low-power modes.
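
The Period Pain series and the routing-messages paper above both come down to periodic jobs drifting into lockstep. A common mitigation (not something spelled out in the summaries here) is to add random jitter to each period so the jobs stop firing at the same instant. A minimal Python sketch, assuming hypothetical helper names `next_delay` and `fire_times` and an arbitrary 10% jitter fraction:

```python
import random
from typing import List, Optional


def next_delay(period_s: float,
               jitter_fraction: float = 0.1,
               rng: Optional[random.Random] = None) -> float:
    """Return the wait before the next run: the period plus or minus uniform jitter.

    jitter_fraction is an illustrative knob, not a value taken from the linked articles.
    """
    if not 0.0 <= jitter_fraction < 1.0:
        raise ValueError("jitter_fraction must be in [0, 1)")
    rng = rng or random.Random()
    spread = period_s * jitter_fraction
    return period_s + rng.uniform(-spread, spread)


def fire_times(n_jobs: int,
               rounds: int,
               period_s: float,
               jitter_fraction: float,
               seed: int = 0) -> List[float]:
    """Simulate n_jobs that all start at t=0 and return each job's clock after `rounds` runs."""
    rng = random.Random(seed)
    clocks = [0.0] * n_jobs
    for _ in range(rounds):
        clocks = [t + next_delay(period_s, jitter_fraction, rng) for t in clocks]
    return clocks


if __name__ == "__main__":
    lockstep = fire_times(n_jobs=100, rounds=10, period_s=60.0, jitter_fraction=0.0)
    jittered = fire_times(n_jobs=100, rounds=10, period_s=60.0, jitter_fraction=0.1)
    # Without jitter every job fires at the same instant; with jitter they spread out.
    print("spread without jitter:", max(lockstep) - min(lockstep))
    print("spread with jitter:   ", max(jittered) - min(jittered))
```

With zero jitter all 100 simulated jobs keep hitting the same timestamps, which is the thundering-herd pattern the articles describe; with jitter the fire times fan out over a window, trading exact periodicity for a smoother load.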
