
10 eBay Secrets for Planet Wide Scaling

You don't even have to make a bid: Randy Shoup, an eBay Distinguished Architect, gives this presentation on how eBay scales, for free. In this presentation, and in the other talks listed at the end of this post, Randy does a fabulous job of getting at the heart of the principles behind scalability. It's more about the ideas of how things work and fit together than a focus on any particular technology stack.

Impressive Stats

In case you weren't sure, eBay is big, with lots of: users, data, features, and change...

  • Over 89 million active users worldwide
  • 190 million items for sale in 50,000 categories
  • Over 8 billion URL requests per day
  • Hundreds of new features per quarter
  • Roughly 10% of items are listed or ended every day
  • In 39 countries and 10 languages
  • 24x7x365
  • 70 billion read / write operations / day
  • Processes 50TB of new, incremental data per day
  • Analyzes 50PB of data per day

10 Lessons

The presentation does a good job explaining each lesson, but the list is...

  1. Partition Everything - if you can't split it, you can't scale it. Split everything into manageable chunks by function and data.
  2. Asynchrony Everywhere - connect independent components through event queues
  3. Automate Everything - components should automatically adjust and the system should learn and improve itself.
  4. Remember Everything Fails - monitor everything, provide service even when parts start failing.
  5. Embrace Inconsistency - pick, for each feature, where it needs to sit on the CAP continuum; avoid distributed transactions; minimize inconsistency through careful operation ordering; reach eventual consistency through async recovery and reconciliation.
  6. Expect (R)evolution - change is constant, design for extensibility, incrementally deploy changes.
  7. Dependencies Matter - minimize and control dependencies, use abstract interfaces and virtualization, components have an SLA, consumers responsible for recovering from SLA violations.
  8. Be Authoritative - Know which data is authoritative, which data isn't, and treat it accordingly.
  9. Never Enough Data - data drives finding optimization opportunities, predictions, recommendations, so save it all.
  10. Custom Infrastructure - maximize the utilization of every resource.
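"Partition Everything" (lesson 1) is usually implemented by hashing a key to a shard. As a minimal sketch of the idea (not eBay's actual scheme; the shard count and key format here are made up for illustration):

```python
import hashlib

NUM_SHARDS = 16  # hypothetical shard count; the real topology isn't public

def shard_for(key: str, num_shards: int = NUM_SHARDS) -> int:
    """Map a key (e.g. a user or item ID) to a shard deterministically."""
    digest = hashlib.md5(key.encode("utf-8")).hexdigest()
    return int(digest, 16) % num_shards

# Requests for the same key always land on the same shard, so each
# shard holds a bounded, independently scalable slice of the data.
print(shard_for("user:12345"))
```

Because the mapping is deterministic, any front-end machine can route a request to the right shard without shared state.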
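"Asynchrony Everywhere" (lesson 2) means the component that accepts a request only enqueues an event; independent consumers do the follow-up work later. A toy sketch using an in-process queue to stand in for a durable message broker (the event names and indexer are invented for illustration):

```python
import queue
import threading

events = queue.Queue()   # stands in for a durable message broker
search_index = []        # what the downstream consumer has processed

def list_item(item_id: str) -> None:
    # The listing path only enqueues an event and returns immediately;
    # search indexing, emails, etc. happen later, independently.
    events.put({"type": "ITEM_LISTED", "item_id": item_id})

def indexer() -> None:
    # An independent consumer: it can lag, restart, or scale on its own.
    while True:
        event = events.get()
        if event is None:  # shutdown sentinel
            break
        search_index.append(event["item_id"])

worker = threading.Thread(target=indexer)
worker.start()
list_item("item:42")
list_item("item:43")
events.put(None)  # tell the consumer to stop
worker.join()
print(search_index)  # the items were indexed asynchronously
```

The producer and consumer share no state beyond the queue, which is what lets each side fail, lag, or scale without dragging the other down.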
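Lesson 7's point that consumers are responsible for recovering from SLA violations can be sketched as a consumer-side timeout with a degraded fallback. This is an illustrative pattern, not eBay's code; the service names and timeout are hypothetical:

```python
import time
from concurrent.futures import ThreadPoolExecutor

_pool = ThreadPoolExecutor(max_workers=4)

def call_with_sla(service, request, timeout_s=0.2, fallback=None):
    """Consumer-side guard: give up when a dependency misses its SLA
    and serve a degraded answer instead of failing the whole request."""
    future = _pool.submit(service, request)
    try:
        return future.result(timeout=timeout_s)
    except Exception:  # timeout or dependency error: degrade gracefully
        return fallback

def slow_recommendations(user_id):
    time.sleep(1.0)  # simulated SLA violation
    return ["personalized items"]

# Degrades to a generic answer rather than stalling the page.
print(call_with_sla(slow_recommendations, "user:1", fallback=["top sellers"]))
```

The key design choice is that the caller, not the provider, decides how long it can afford to wait and what "good enough" looks like when the dependency misbehaves.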

Related Articles



Reader Comments (5)

I love designing large systems but can't even imagine 50PB of data analysis. Wow!

November 19, 2009 | Unregistered CommenterXailor

Ironically I came across this article on the day eBay experiences a massive backend failure relating to their search engine.

November 21, 2009 | Unregistered CommenterAnonymous

Any particular reason, apart from joins etc., for using the MySQL in-memory engine instead of memcached for the personalization and session cache?

November 24, 2009 | Unregistered CommenterRaj Satya

@Raj, durability would be my guess for why not the memcache(d) you mention.

November 30, 2009 | Unregistered CommenterXailor

What does "it's the consumer's responsibility to manage unavailability and SLA violations" mean? Shouldn't the service provider do everything possible to satisfy the availability guarantee in the SLA? I think that currently SPs are not doing enough in terms of managing availability, perhaps because it's too difficult or costly for them. It's a lot easier for them to refund your money (or worse, ask you to restart your work) without paying a hefty penalty.

June 21, 2011 | Unregistered CommenterSurvirn
