Facebook Secrets of Web Performance

This is a repost of part 1 of an interview I did for the Boundary blog.

Boundary: What is Facebook’s secret sauce for managing what’s got to be the biggest Big Data project, if you will, on the Web?

Hoff: From several presentations we’ve learned what Facebook insiders like Aditya Agarwal and Robert Johnson, both former Directors of Engineering, consider their secret sauce:

  • Scaling Takes Iteration. Solutions often work in the beginning, but you’ll have to modify them as you go. PHP, for example, is simple to use at first, but is not a good choice when you have tens of thousands of web servers.
  • Scaling Takes Iteration. You can say that again.
  • Don’t Over-Design. Just use what you need as you scale your system out. Figure out where you need to iterate on a solution, optimize something, or completely build a part of the stack yourself.
  • Choose the Right Tool for the Job. Realize that any choice comes with overhead. If you really need to use Python, then go ahead, and we’ll try to help you succeed. Yet with that choice there is overhead, usually across deployment, monitoring, ops, and so on.
  • Get the Culture Right. Build an environment internally which promotes building the right thing first and fixing as needed. Stop worrying about innovating and about breaking things; think big, and think about the next thing you need to build after building the first thing. Isolate the part of the culture that you value and want to preserve. It doesn’t happen automatically.
  • Move Fast. Get to market first. It’s OK if you break things. For example, Facebook runs their entire Web tier on HipHop, which was developed by three people. This is a risky strategy. It brings the site down regularly (out of memory, infinite loops), but there’s a big potential payoff as they figure out how to make it work.
  • Empower Small Teams. Small teams can do great things. Facebook Search, photos, chat and HipHop were all the result of small teams. Get the right set of people, empower them and let them work.
  • People Matter Most. It’s people who build and run systems. The best tools for scaling are engineering and operations teams that can handle anything.
  • Scale Horizontally. Handling exponentially growing traffic requires spreading load arbitrarily across many machines.
  • Measure Everything. Production is where the really useful data comes from. Measure both system-level and application-level statistics to know what’s happening.
  • Give Teams Control and Responsibility. Responsibility requires control. If a team is responsible for something, they must also control it.

All these principles work together to make a self-reinforcing virtuous circle. You can’t move fast unless you have small teams who have control and responsibility. You can’t know how your changes are working unless you get those changes into production and measure results. You can’t move code into production unless people feel responsible for moving out working code. You can’t handle the scale unless you figure out how to scale horizontally, move fast and measure everything. And that all comes down to good people.

But the above is not the whole of the story. Not so obvious is the role of opportunity. A pattern we often see is that companies on the leading edge see problems before everyone else, so they solve those problems before everyone else. We see a blast wave of innovation coming from technological hotspots like Google, Netflix, Twitter and Facebook.

Boundary: What other major websites do you think are doing a great job of scaling with demand, keeping users happy and response times low?

Hoff: We have a great industry. People are constantly willing to share their experiences, share their code and talk about what works. My wife is a tax accountant and they definitely don’t have the same vibe, which is a little sad. There are a lot of unbelievably smart and passionate people in this field, and overall quality only rises the more people talk about how to build great stuff.

It’s also pretty obvious to me that having a quality site and willingness to share are linked. There are many companies I could list that fall into this category, but these stand out: Twitter, Etsy, Facebook, Google, Netflix, Amazon and StackExchange. Some other important contributors include: Airbnb, Tumblr, Instagram, TripAdvisor, Heroku, Prismatic, 37signals, Pinterest and Yahoo.

There are literally hundreds of others that could be mentioned, but these companies have continually and enthusiastically contributed to advancing the state of the art in Web performance. I feel bad already, however, because I know I’m missing some.

Reader Comments (4)

I have to say that the advice in this post is completely worthless! What a waste of time!

August 29, 2013 | Unregistered Commenter Patratel

Yeah - this is awesome. Perfectionism is a big problem and it paralyzes a lot of organizations.

And I love that they are fearless about building the stack themselves if they're not satisfied with existing technology. One of the worst urban myths that floats around IT circles - is that if something that remotely matches what you're doing exists, you should use it without question. The big guys don't follow this myth - and it's one of the reasons they're the big guys. :)

Thanks for this!

October 21, 2013 | Unregistered Commenter Kyle

You're right, Kyle, one of the worst myths around is "don't reinvent the wheel"!
Sometimes you have to reinvent the wheel again and again :)

November 3, 2013 | Unregistered Commenter Rachid

Facebook can afford to break - "It’s OK if you break things" - because it isn't important!

January 29, 2014 | Unregistered Commenter jon
