The Tumblr Architecture Yahoo Bought for a Cool Billion Dollars
Monday, May 20, 2013 at 8:25AM
Todd Hoff

It's being reported Yahoo bought Tumblr for $1.1 billion. You may recall Instagram was profiled on HighScalability and they were also bought by Facebook for a ton of money. A coincidence? You be the judge.

Just what is Yahoo buying? The business acumen of the deal is not something I can judge, but if you are doing due diligence on the technology then Tumblr would probably get a big thumbs up. To see why, please keep on reading...

With over 15 billion page views a month Tumblr has become an insanely popular blogging platform. Users may like Tumblr for its simplicity, its beauty, its strong focus on user experience, or its friendly and engaged community, but like it they do.

Growing at over 30% a month has not been without challenges. Some reliability problems among them. It helps to realize that Tumblr operates at surprisingly huge scales: 500 million page views a day, a peak rate of ~40k requests per second, ~3TB of new data to store a day, all running on 1000+ servers.

One of the common patterns across successful startups is the perilous chasm crossing from startup to wildly successful startup. Finding people, evolving infrastructures, servicing old infrastructures, while handling huge month over month increases in traffic, all with only four engineers, means you have to make difficult choices about what to work on. This was Tumblr’s situation. Now with twenty engineers there’s enough energy to work on issues and develop some very interesting solutions.

Tumblr started as a fairly typical large LAMP application. The direction they are moving in now is towards a distributed services model built around Scala, HBase, Redis, Kafka, Finagle,  and an intriguing cell based architecture for powering their Dashboard. Effort is now going into fixing short term problems in their PHP application, pulling things out, and doing it right using services.

The theme at Tumblr is transition at massive scale. Transition from a LAMP stack to a somewhat bleeding edge stack. Transition from a small startup team to a fully armed and ready development team churning out new features and infrastructure. To help us understand how Tumblr is living this theme is startup veteran Blake Matheny, Distributed Systems Engineer at Tumblr. Here’s what Blake has to say about the House of Tumblr:

Site:  http://www.tumblr.com/

Stats

Software

Hardware

Architecture

Old Tumblr

New Tumblr

Internal Firehose

Cell Design for Dashboard Inbox

On Being a Startup in New York

Team Structure

Software Deployment

Development

Hiring Process

Lessons learned


I’d like to thank Blake very much for the interview. He was very generous with his time and patient with his explanations. Please contact me if you would like to talk about having your architecture profiled.

Related Articles

Article originally appeared on (http://highscalability.com/).
See website for complete article licensing information.