
Stuff The Internet Says On Scalability For June 29, 2012 - The Velocity Edition

Judging from the tweet flow, Velocity looked like a riotous good time. In this video on the main themes at Velocity, after a little microphone-enhanced violence, John Allspaw and Steve Souders identify resilience and automation as two of the big ideas behind building a faster and stronger web.

John says resilience is the idea that we don't live in a perfect world, so trying to build perfect systems is counterproductive. We have to accept failure as a baseline and think in terms of degrees of availability. All abstraction layers leak, so every part of a system must be monitorable and open to introspection.

A focus on resilience means the web is growing up. Resilience has long been a requirement for "real" systems, so it's great to see web systems thought of as the complex systems they've always been. For the Alpha and Omega on resilience you'll want to watch Dr. Richard Cook's inspiring talk, How Complex Systems Fail.

Here are some of the most enjoyable Quotable Quotes from Velocity:

  • @guypod : LTE latency has roughly the same latency we had with dialup connections. 3G latency is akin to satellite... (@patmeenan at ‪#velocityconf‬)
  • @akucharski : Akamai produces 1.3 billion log lines every day! ‪#velocityconf‬
  • @mikeodea : Facebook: 6 billion mobile messages (!!) every 30 minutes ‪#velocityconf‬
  • @mmaretzke : ‪#velocityconf‬ Last ... mind-boggling ... Facebook facts: 3.8 trillion cache operations in 30 minutes! Unbelievable. Scaling Systems. 160m newsfeeds, 5bln realtime msgs, 10bln profile pics, 108 bln queries on mysql still 30 minutes
  • @atseitlin : "If I were to do things over again, I would think about the cloud first" ‪#velocityconf‬ @yammer on migrating to the cloud
  • @holydevil : RT @laraswanson: Swapping DNS architecture for a major site yielded ~3s of page load time decrease for a site. @tomdyninc at ‪#velocityconf‬
  • @mikeodea : Facebook: 3.8 trillion cache ops every 30 minutes. ‪#velocityconf‬
  • @Anselmo : BBC mobile page render costs ~15 Joules on a HTC Android phone, which is pretty fantastic. ‪#velocityconf‬
  • @atseitlin : Integration points are #1 risk to stability @mtnygard ‪#velocityconf‬. That's why we built the circuit breaker pattern http://bit.ly/LNCVr4 
  • @RealGeneKim : @mtnygard: "Antipattern #1: integrations are #1 risk to stability: every process call can/will kill you; even db calls" ‪#velocityconf‬
  • @adrianco : #velocityconf‬ preventing failure is ultimately less successful than responding quickly to it. That's what we use rollback for.
  • @ramarob : Twitter is 45% done converting monolithic Rails to modular Java apps. So 4+ yrs pain for alleged initial gain of rapid dev? ‪#velocityconf‬
  • @jbarciauskas : Bulkhead pattern: partition the system, allow partial failure. Apply at different levels: thread pools, load balancers ‪#velocityconf‬
  • @jonathanklein : "Don't optimize the little things" - I've said it before and it's a popular phrase at ‪#velocityconf‬ - you must measure and focus on big wins
  • @jhofmann : Facebook runs BGP all the way to their top or rack switches. One common protocol they can build tools around. ‪#velocityconf‬
  • @kenny_dee : Time between requests : 3 seconds to type a URL on Google Chrome, 15 seconds to select a search result on Google ! ‪#velocityconf‬ ‪#webperf‬
  • @souders : Good advice from @jrauser wrt data analysis: "Look at the extremes and you'll find things that are broken." ‪#VelocityConf‬
  • @SubmittedDenied : ~45% of twitter traffic now served off JVM stack ‪#velocityconf‬
  • @laraswanson : Crazy mobile energy consumption: Changing images on Amazon to JPEG format would save 20% joules; on Facebook it'd save 30% ‪#velocityconf‬
  • @CoteIndustries : But then as you go up to 20 racks, colo is better #velocityconf
  • @jbarciauskas : Per rack, EC2 servers are 5x more expensive, but inc. other costs: network, load balancing, racking/power, manpower, is equal ‪#velocityconf‬
  • @adrianco : ‪#velocityconf‬ Netflix dev teams push code 10s of times a day and rollback is needed a few times a week.
  • @dlutzy : Operations people are 1. Monitoring 2. Responding 3. Adapting 4. Learning to enable Resilience (Cook) ‪#velocityconf‬
  • @allspaw : "The normal world is not well-behaved." Dr. Richard Cook ‪#velocityconf‬
  • @ginablaber : Dr Cook: "Resilience in my field is a life/death question. We design for reliability, but we what we want is resilience." ‪#velocityconf‬
  • @atseitlin : More Cook: How to design for resilience? Trust people, reveal controls, be transparent, foster learning. Good stuff. ‪#velocityconf‬
  • @RealGeneKim : Cook: "withstand transients, recovery swiftly/smoothly, prioritize to serve high level goals, recognize/respond.." ‪#velocityconf‬
  • @RealGeneKim : Twitter: "we do dark, rolling releases: only way we can gain confidence in releases, despite using iago in pre-prod" ‪#velocityconf‬
  • @RealGeneKim : Twitter: "outcomes: we can launch massive features in parallel: team org now matches sw stack; ‪#velocityconf
  • @jschauma : "The best disaster plans can be impacted by actual disasters." Mike Christian at ‪#velocityconf‬
  • @jonathanklein : Theo Schlossnagle says that eroding data granularity over time to save space is a mistake at ‪#velocityconf‬
  • @thewebvy : Uploading data over 3G uses almost 2x as much battery as downloading, especially when amt of data approaches 260k+ ‪#velocityconf‬
  • @souders : 25% of total time is DNS for CDN resources ‪#VelocityConf‬
  • @grigs : Global avg fixed latency is 125 and avg mobile is 290… global mobile consumer avg latency is 307.3 ms per Cisco – @guypod ‪#velocityconf‬
  • @jbarciauskas : Stability patterns: circuit breakers, timeouts, decoupling middleware, handshakes, test harnesses ‪#velocityconf‬ Some good basic engineering
  • @willmeyer : Word. RT @mikebrittain: "Avoid passive-aggressive snark." Always. ‪#velocityconf‬
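
Several of the quotes above mention the circuit breaker pattern as a defense against failing integration points. As a rough illustration only, and not the implementation behind the bit.ly link, here is a minimal Python sketch of the idea: after a run of failures the breaker "opens" and rejects calls immediately, then lets a single trial call through once a timeout has passed. The names `CircuitBreaker`, `CircuitOpenError`, `max_failures`, and `reset_timeout` are all made up for this example.

```python
import time


class CircuitOpenError(Exception):
    """Raised when a call is rejected because the breaker is open."""


class CircuitBreaker:
    """Hypothetical minimal circuit breaker (illustrative, not thread-safe)."""

    def __init__(self, max_failures=3, reset_timeout=30.0, clock=time.monotonic):
        self.max_failures = max_failures    # consecutive failures before opening
        self.reset_timeout = reset_timeout  # seconds to wait before a trial call
        self.clock = clock                  # injectable so tests can fake time
        self.failures = 0
        self.opened_at = None               # None means the circuit is closed

    def call(self, func, *args, **kwargs):
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.reset_timeout:
                # Open: fail fast instead of hitting the sick dependency.
                raise CircuitOpenError("circuit open; failing fast")
            # Timeout elapsed: half-open, let this one trial call through.
        try:
            result = func(*args, **kwargs)
        except Exception:
            self.failures += 1
            # A failed trial call re-opens the circuit right away.
            if self.opened_at is not None or self.failures >= self.max_failures:
                self.opened_at = self.clock()
            raise
        # Success closes the circuit and clears the failure count.
        self.failures = 0
        self.opened_at = None
        return result
```

Wrapping each remote call as `breaker.call(fetch_profile, user_id)` (where `fetch_profile` is whatever integration point you are protecting) keeps a slow or dead dependency from tying up every caller while it recovers.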

Notes from Velocity Conf 2012 by Pablo Mercado

Talks - Slidecks or Video

Reader Comments (4)

Nice compilation.

June 29, 2012 | Unregistered Commenter Ashwin Jayaprakash

But...if there are 1bn Facebook users, and 6bn mobile messages every 30 minutes, and you and I don't get any, someone is getting a lot. What do they mean by "mobile messages"?

June 29, 2012 | Unregistered Commenter Richard

Wow, a lot of slides, a lot of videos, a lot of tweets...
In other words, a lot of speech that's worthless for me and nothing more.

July 7, 2012 | Unregistered Commenter -

What is "Swapping DNS architecture" ?

July 14, 2012 | Unregistered Commenter Sairam
