Horizontal Scaling

Todd Hoff's picture

Problem: Mobbing the Least Used Resource Error

A thoughtful reader recently suggested creating a series of posts based on real-life problems people have experienced and the solutions they've created to slay the little beasties. It's a great idea. Often we learn best from great trials and tribulations. I'll start off the new "Problem Report"
feature with a diabolical little problem I dubbed the "Mobbing the Least Used Resource Error." Please post your own. And if you know someone with an interesting problem report, please tag them too. It could be a lot of fun. Of course, feel free to scrub your posts of all embarrassing details, but be sure to keep the heroic parts in :-)

The Problem


There's an unexpected and frequently fatal type of error that can happen when new resources are added to a horizontally scaled architecture. Because the new resource has the least of something, load or connections or whatever, a load balancer configured with a least metric will instantaneously direct all new traffic to that new resource. And bam! Your system dies. All the traffic that was meant to be spread across your entire cluster is now directed like a laser beam to one small part of it.

I love this problem because it's such a Heisenberg. Everyone is screaming for more storage space so you bring up a new filer. All new data streams flow to the new filer and it crumbles and crawls because it can't handle the load for the entire system. It's in the very act of turning up more storage you bring your system down. How "cruel world the universe hates me" is that?

Todd Hoff's picture

Strategy: Diagonal Scaling - Don't Forget to Scale Out AND Up

All the cool kids advocate scaling out as the secret sauce of scaling. And it is, but don't forget to serve some tasty "scaling up" as a side dish. Scaling up doesn't have to mean buying a jet propelled, liquid cooled, 128 core monster super computer. Scaling up can just mean buying at the high end of the commodity buffet by buying more cores, more memory and using a shared nothing architecture to take advantage of all that power without adding complexity. Scale out when you need to, but big beefy boxes can absorb a lot of load before it's necessary to hit up your data center for more rack space. Here are a few examples of scaling out and up:

Todd Hoff's picture

Paper: MySQL Scale-Out by application partitioning

Eventually every database system hit its limits. Especially
on the Internet, where you have millions of users
which theoretically access your database simultaneously,
eventually your IO system will be a bottleneck. [A] promising but more complex solution with nearly no scale-out limits is application partitioning. If
and when you get into the top-1000 rank on alexa [1], you have to think about such solutions.

A Quick Hit of What's Inside

Horizontal application partitioning, Vertical application partitioning, Disk IO calculations, How to partition an entity

Syndicate content