Lessons Learned from Scaling Uber to 2000 Engineers, 1000 Services, and 8000 Git repositories
Wednesday, October 12, 2016 at 8:56AM
Todd Hoff in Example

For a visual of the growth Uber is experiencing, take a look at the first few seconds of the above video. It will start in the right place. It's from an amazing talk given by Matt Ranney, Chief Systems Architect at Uber and co-founder of Voxer: What I Wish I Had Known Before Scaling Uber to 1000 Services (slides).

It shows a ceaseless, rhythmic, undulating traffic grid of growth occurring in a few Chinese cities. This same pattern of explosive growth is happening in cities all over the world. In fact, Uber is now in 400 cities and 70 countries. They have over 6000 employees, 2000 of whom are engineers. Only a year and a half ago there were just 200 engineers. Those engineers have produced over 1000 microservices, which are stored in over 8000 git repositories.

That's crazy 10x growth in a crazy short period of time. Who has experienced that? Not many. And as you might expect, that sort of unique, compressed, fast-paced, high-stakes experience has to teach you something new, something deeper than you understood before.

Matt is not new to this game. He was co-founder of Voxer, which experienced its own rapid growth, but this is different. You can tell while watching the video that Matt is trying to come to terms with what they've accomplished.

Matt is a thoughtful guy and that comes through. In a recent interview he says:

And a lot of architecture talks at QCon and other events left me feeling inadequate; like other people - like Google for example - had it all figured out but not me.

This talk is Matt stepping outside of the maelstrom for a bit, trying to make sense of an experience, trying to figure it all out. And he succeeds. Wildly.

It's part wisdom talk and part confessional. "Lots of mistakes have been made along the way," Matt says, and those are where the lessons come from.

The scaffolding of the talk hangs on the WIWIK (What I Wish I Had Known) device, which has become something of an Internet meme. It's advice he would give his naive, one-and-a-half-year-younger self, though of course, like all of us, he certainly would not listen.

And he would not be alone. Lots of people have been critical of Uber (Hacker News, Reddit). After all, those numbers are really crazy. Two thousand engineers? Eight thousand repositories? One thousand services? Something must be seriously wrong, right?

Maybe. Matt is surprisingly non-judgemental about the whole thing. His mode of inquiry is more questioning and searching than finding absolutes. He himself seems bemused over the number of repositories, but he gives the pros and cons of more repositories versus having fewer repositories, without saying which is better, because given Uber's circumstances: how do you define better?

Uber is engaged in a pitched world-wide battle to build a planetary scale system capable of capturing a winner-takes-all market. That's the business model. Be the last service standing. What does better mean in that context?  

Winner-takes-all means you have to grow fast. You could go slow and appear more ordered, but if you go too slow you’ll lose. So you balance on the edge of chaos and dip your toes, or perhaps your whole body, into chaos, because that’s how you’ll scale to become the dominant worldwide service. This isn’t a slow growth path. This is a knock-the-gate-down-and-take-everything strategy. Think you could do better? Really?

Microservices are a perfect fit for what Uber is trying to accomplish. Plug your ears, but it's a Conway's Law thing: you get so many services because that's the only way so many people can be hired and become productive.

There's no technical reason for so many services. There's no technical reason for so many repositories. This is all about people. mranney sums it up nicely:

Scaling the traffic is not the issue. Scaling the team and the product feature release rate is the primary driver.

A consistent theme of the talk is this or that is great, but there are tradeoffs, often surprising tradeoffs that you really only experience at scale. Which leads to two of the biggest ideas I took from the talk:

This is one of those talks you have to really watch to understand because a lot is being communicated along dimensions other than text. Though of course I still encourage you to read my gloss of the talk :-)

Stats (April 2016)


The Cost of Having Lots of Languages

The Cost of RPC


Operational Issues

Performance Issues

Fanout Issues - Tracing


Load Testing

Failure Testing


Open Source



Related Articles

Article originally appeared on (http://highscalability.com/).
See website for complete article licensing information.