Have We Reached the End of Scaling?

This is an excerpt from my article Building Super Scalable Systems: Blade Runner Meets Autonomic Computing in the Ambient Cloud.

Have we reached the end of scaling? That's what I asked myself one day after noticing a bunch of "The End of" headlines. We've reached The End of History because the Western liberal democracy is the "end point of humanity's sociocultural evolution and the final form of human government."  We've reached The End of Science because of the "fact that there aren't going to be any obvious, cataclysmic revolutions." We've even reached The End of Theory because all answers can be found in the continuous stream of data we're collecting. And doesn't always seem like we're at The End of the World?

Motivated by the prospect of everything ending, I began to wonder: have we really reached The End of Scaling?

For a while I thought this might be true. The reason I thought the End of Scaling might be near is because of the slow down of potential articles at my HighScalability.com website, which audaciously promises to help people build bigger, faster, and more reliable websites. When I started the site there was a flurry of finding and documenting common solution to scaling problems. After a time there wasn't as much to write about. This isn't to say scaling is easy. Not at all, but the general outline of how to scale small to cloud based systems is more-or-less known. This also isn't to say there isn't anything to report. There is, but it's usually a refinement rather than something completely new.

Then I got to thinking, if I really don't think any of those other "end of" ideas are true, then we probably haven't really reached the end of scaling either. More than likely what we've reached is the end of my vision. Time to get my eyes checked.

With a pair of brand new specs it was clear we are really just getting started at this scalability game. I was way too focused on what was in front of me. Probably my prescription. What I needed to do was look ahead. What is ahead is foreshadowed by some of the difficulties Facebook has in scaling.

Fortunately Facebook is pretty open about their infrastructure, so we learn a lot from their experience. From their information it's clear that building social networks for hundreds of millions of people, who are not even densely connected yet, is very challenging. And they are one the most successful companies in the world at scaling.

Building social networks far harder than scaling typical applications because:

  1. All data is active all the time.
  2. It's hard to partition this sort of system because everyone is connected.
  3. Everything must be kept in RAM cache so that the data can be accessed as fast as possible.

This is a blueprint for the future. To make all this magic work Facebook spends hundreds of millions of dollars on datacenters, they have 30K+ machines, 300 million active users, 80 billion photos, they serve 600,000 photos a second, they have 28 terabytes of  cache, they generate 25TB of logging data per day, and their caching tier services 120 million queries every second. Phew!

Now let us imagine what will happen in the future by thinking about what we can't do yet. No social networking site will support 7 billion friends. No social networking site supports updating presence to 7 billion friends. No site can handle a comment thread of 7 billion people. No photography site will allow 7 billion people to upload pictures. No lifestreaming site can wireless stream HD input from 7 billion eye glass mounted cams. No virtual world allows 7 billion people to roam free and unhindered by meat conventions. No music site allows 7 billion people listen to music at the same time.

In short, we don't have worldwide applications...yet. Even our most popular web apps are used by a relatively few number of users. As you can see from Facebook's experience, ramping up to this scale is something completely different than what we are capable of now. We don't have architectures to meet this scale of problems. A warmed over cloud won't work. A few monolithic datacenters scattered throughout the world won't work either.

Why might it matter? I hope it's not so the entire world can know how another interchangeable starlet is having plastic surgery, while in rehab, during an multiple adoption process. As a world we are facing worldwide problems that can only be solved with worldwide cooperation. Wouldn't it be something if we the people of the world, in order to form a more perfect world, could actually talk to each other without politicians and special interests getting in the way?

Radical? Massive cooperation works as a way to solve problems. What we need is a supporting technology, politics, and culture.

Imagine if Einstein or Mahatma Gandhi were still alive. It would seem plausible that 7 billion people might want to follow them, read about their fresh insights, and what they had for lunch. It seems reasonable that if there's some grand planet impacting scheme to combat global warming that a worldwide poll of 7 billion people would be run. And if the newest boy robot band sensation released a new song, shouldn't 7 billion people know that immediately?

So planet sized systems are reasonable to think about, they just aren't even vaguely possible at the moment. And that's just considering systems of people only.

Now let's bring in smart sensors. Smart sensors are where everything in the world--books, cars, phones, pets, appliances, etc--will have an IP address and send everything else in the world information about their inner feelings. To software, smart sensors are just like people, only a bit more single minded.

Imagine knowing the instantaneous power usage of an entire city by following millions of power sensors. Imagine creating your own weather forecast by following billions of weather sensors. Imagine competing medical services following information about your blood pressure, which medicines you've taken, and your health care coverage status all in real-time. Imagine a general following every aspect of a battlefield.

Now we are not talking about a lousy 7 billion people anymore, we are talking about designing systems where trillions of entities follow each other, talk to each other, read each other, post to each other, often at prodigious data rates.

Adding another level of scale is the exponentially growing number of connections between entities.  This is what often drags down social networking sites. Users with a large number of followers cause a lot work on every post. The social graph must be traversed at least once on every post to see who should get a post. Traversing very large graphs in real-time is still a ways off. And handling the work generated by each post can be overwhelming. Take Digg for example. happen. If the average Digg user has 100 followers that’s 300 million diggs day, 3,000 writes per second, 7GB of storage per day, and 5TB of data spread across 50 to 60 servers. Now imagine what happens with trillions of entities. Oh my!

We are talking about systems 4 or 5 magnitudes (10,000 to 100,000 times) larger than our largest systems today. A magnitude means 10 times larger. For an idea of how big things grow by powers of 10 take a look at this Cosmic Voyage video narrated by Morgan Freeman. It's impressive.

The numbers, the complexity, and the need are there. What we need next is way to build planet-scale applications.

If you would like to read the rest of the article please take a look at Building Super Scalable Systems: Blade Runner Meets Autonomic Computing in the Ambient Cloud.