« Strategy: Caching 404s Saved the Onion 66% on Server Time | Main | 7 Secrets to Successfully Scaling with Scalr (on Amazon) by Sebastian Stadil »

Digg: 4000% Performance Increase by Sorting in PHP Rather than MySQL

O'Reilly Radar's James Turner conducted a very informative interview with Joe Stump, current CTO of SimpleGeo and former lead architect at Digg, in which Joe makes some of his usually insightful comments on his experience using Cassandra vs MySQL. As Digg started out with a MySQL oriented architecture and has recently been moving full speed to Cassandra, his observations on some of their lessons learned and the motivation for the move are especially valuable. Here are some of the key takeaways you find useful:

  1. Precompute on writes, make reads fast. This is an oldie as a scaling strategy, but it's valuable to see how SimpleGeo is applying it to their problem of finding entities within a certain geographical region. Using Cassandra they've built two clusters: one for indexes and one for records. The records cluster, as you might imagine, is a simple data lookup. The index cluster has a carefully constructed key for every lookup scenario. The indexes are computed on the write, so reads are very fast. As reads dominate, this makes a lot of sense. Queries based on time are also precomputed. Joe mentions some special algorithms for spreading out data, which tends to cluster around geographical regions, but does not mention what these are. 
  2. Restrict what the user can do. The system is kept simpler by not allowing open ended queries. Users are allowed to perform a well defined set of operations that end up using highly optimized searches. They have no intention of being a generic database, they only intend to be able to serve geodata, well.
  3. The relation tool chain has failed for real-time. The relational database tool chain is not evolving. It has failed for large scale, real-time environments. Building scalable systems on a relational database requires building sharding, load balancing, resharding, cluster management, worrying about consistency, implementing distributed queries, and other layers yourself, so why bother? Cassandra does all that for you out of the box. Shutdown a server and Cassandra will handle all the remapping and rerouting automatically.
  4. Scaling practices turn a relational database into a non-relational database. To scale at Digg they followed a set of practices very similar to those used at eBay. No joins, no foreign key constraints (to scale writes), primary key look-ups only, limited range queries, and joins were done in memory. When implementing the comment feature a 4,000 percent increase in performance was created by sorting in PHP instead of MySQL. All this effort required to make a relational database scale basically meant you were using a non-relational database anyway. So why not just use a non-relational database from the start?
  5. Embrace and extend existing products rather than build your own. Cassandra allowed SimpleGeo to create custom data partitioning policies to better spread the data around. This meant a custom database didn't have to be created, an existing database could be extended to go that extra mile while still benefiting from a well supported highly functional database. This is also a lesson learned at Justin.tv and will likely become an even more important strategy as complexity increases.
  6. Scaling equals specialization. To scale often requires building highly custom, problem specific solutions.
  7. MySQL works fine for a certain problem set. Typically for relatively static data sets, relatively low query volumes, and relatively high latency requirements.

Related Articles

References (1)

References allow you to track sources for this article, as well as articles that were written in response to this article.

Reader Comments (23)

I totally agree with Joe these RDMBS is not going to scale well when it comes to mass and real time data. I still don't understand why companies like Oracle, IBM not trying to comeup with some technology that can remove the barriers of relational integrity. NoSQL databases really scale well interms of query and as well as updates. Moving the logic of integrity inside application is much better than having in database. But this is strange to hear that sorting also sucks in database. For all these years I thought RDBMS has great algorithms to do sorting better. So that means I can write my quick sort better in application layer than depending on soring on database :-)

March 23, 2010 | Unregistered CommenterKesav Kolla

Nice... yeah, MySQL can have problems when a sort requires a table scan. You should also take a look at MongoDB. I have been using in a system for the last nine months and it has simplified things a lot.

March 23, 2010 | Unregistered CommenterRyan

to : Kesav Kolla

With RDBMS, whether use integrity check within the database is only up to you. You can discard it and implement it in the application. And of course, this will loose some benefit from the RDBMS.

March 23, 2010 | Unregistered Commenterjametong

"companies like Oracle, IBM not trying to comeup with some technology that can remove the barriers of relational integrity"
Maybe they are, but if not it probably comes down to cost/benefit. There's maybe a few hundred sites large enough AND rich enough to fork over several million dollars a year in maintenance for a top of the range Oracle install to run web app. And some, like SimpleGeo, will be dealing with non-simple datatypes which are even harder to managee between servers.
More likely Oracle (or IBM) will wait until a player is a big enough financial prize, then either buy it (mySQL, TimesTen) or, if it is GPL, clone it (Oracle Enterprise Linux/Red Hat).

March 23, 2010 | Unregistered CommenterGary

"There's maybe a few hundred sites large enough AND rich enough to fork over ..."

That may be true now, but at the very least, that number will only get larger, not smaller. It's easier for new ways of using data to be found than for old ways to become useless. A lot of people's instincts about what is possible will take time to catch up.

March 23, 2010 | Unregistered CommenterSamuel Reid Hughes

Didn't slashdot figure out the sorting issue like *a decade* ago?

March 24, 2010 | Unregistered CommenterChristian

I too am sceptical of the 4000% claim. I'd like to see the details of that. And indeed what sorting is required for a comment system... Still I was interested enough to go look at the SimpleGEO website. Shame it's unreadable in i.e6. You'd have thought an "I'm interested in the beta" form would have been cross-browser compatible.

March 24, 2010 | Unregistered CommenterNiall

Reading the original article, it almost seems like Joe has never heard of MySQL cluster. He certainly doesn't address it and he complains of problems that MySQL cluster solves elegantly. Specifically, sharding that is opaque to the application and storing multiple copies of the data to provide redundancy.

If you are using MySQL as a NoSQL store (i.e. you aren't doing joins) then you should be using MySQL cluster.

You might want to change the second sentence in point 4. It states: "No joins" and then later in the same sentence: "joins were done in memory". Although Joe did say something like that, the bits he put in between that you left out mean that those two sentence fragments are not in conflict with each other.

Annoyingly, Joe doesn't mention WHY they saw a 4000% increase in performance when doing the comment sorts in PHP rather than in MySQL. I can guarantee you that it's not the sorting algorithm.

The two possibilities I can think of are that either MySQL was doing something else as well as the sort (such as a table scan as Ryan mentioned... but that's an indexing problem and won't go away just by moving the sorts out of MySQL) or the PHP sort is not sorting the same amount of data that the MySQL sort was sorting.

To address the sort *algorithm* more directly: all but the most naive of sort algorithms are O(nlogn) in the worst case and most of them are O(n) in the best case. The greatest difference we can see between sort algorithms is the difference between the worst case of one and the best case of another. This difference is a factor of the log of n where n is the number of items being sorted. To see a 40x improvement in sort speed by *only* changing the algorithm, the log of the number of comments would have to be around 40. This would mean the number of comments would have to be in excess of 1 trillion.

Joe also mentioned the concept of processing on writes, not reads. Why not extend this to sorting ? The downside is increased storage space. If you have two major sort methods (say, date and date reversed) then store users' comments sorted both ways. Storage is easy to scale, sorting is not.

Kesav Kolla: No, you can't write a better sort algorithm in PHP than the one already in MySQL (or any other sane database). Besides the issues already mentioned, you should be using the sort() PHP function which is written in C and not writing your own in PHP... unless you think you can write an algorithm that is 10x faster than the one in the PHP sort() function.

March 24, 2010 | Unregistered CommenterDave

These are problems with mysql rather than with RDBMSs per se. There may be good reasons for using the NoSQL approach but all I seem to read are good reasons for not using mysql.

March 24, 2010 | Unregistered CommenterEd

I am sure that MySQL's sort algorithm is basically algorithmically sound, however, the essential problem is not HOW the sorting is done, but WHERE it is done.

Doing sorting of big data sets on the (single or few) database servers is a waste of their valuable capacity, when it could be done on the (very many) web application servers instead, which scale better. Also by having the database not sort the data, you enable it to release locks and other resources sooner.

So by doing the sorting in the right place, you can gain capacity. It's not more efficient per se (indeed, sorting in PHP is almost certainly MUCH LESS efficient than MySQL).

March 24, 2010 | Unregistered CommenterMark Robson

I recently sat in on a conversation with lead developers from a large open source CMS, and listened to them talk at length about migrating large parts of the database into "Materialized Views" (basically, one strategy for doing item #4 in the list above).

I find such discussions seriously discouraging. It is indicative of the problem of shoe-horning data into the RDBMS paradigm.

For years I've been preaching document-oriented databases, but since 2008 there are now some fantastic publicly available (and open source) NoSQL databases. As I've started working more with MongoDB (and Cassandra soon), I find it harder and harder to continue writing "old style" web applications backed by SQL storage and hefty ORMs or translation layers.

Relational databases are *great* for relational data. But they are seriously lackluster when it comes to object and document storage. I, for one, am happy to see the tide turn back toward using tools that are right for the job.

March 24, 2010 | Unregistered CommenterMatt Butcher

Can everyone just go read this now?

Seriously, the guy knows what he's talking about, and spends the time to verify what he says. It's worth reading regarding the recent NoSQL fad and the myth that you can't scale a RDBMS.

March 24, 2010 | Unregistered Commenterd

"Scaling practices turn a relational database into a non-relational database".
A long...long time ago, there were no relational databases. There were databases, these are nowadays called hierarchical databases. Indeed these were a lot faster (2-3X), but you had to write a lot more code to get the data out and indeed sort in the application (you retrieved rows one by one and had to traverse manually through the table/database).

A simple query like e.g. 'select bla from table a, table b where a.id = b.id order by a.id' could easily costs a few pages or procedures (sections) to write in a 3GL language.
However, 4000% improvement can't be true, I would fire the DBA who set up the database.

March 24, 2010 | Unregistered CommenterBas Zurburg

d, making an argument based assuming the incompetence of others is really no argument at all. It's a strawman. And it's pretty low actually.

March 24, 2010 | Registered CommenterTodd Hoff

I'd second the yafla.com link.

I'm speaking as someone who manages TB-scale databases.

Digg reported a 30 GB database, with a 14 second lookup time for a friend query, prior to the optimization. That's abysmal performance.

I read this article, and the Digg precursor as: We were having database problems anchored in our bad design, and a move to NoSQL fixed that. It's hardly an indictment of RDBMS technology.

Claim #3 is ridiculous, to the point of absurdity. This is coming from a guy pitching his brand of scalability software.

March 24, 2010 | Unregistered CommenterBrad Wallace

We previous had probem data integerty when customer tried to edit via database access.trust me,for enterprise management it's better have relationship before we used consept denormilze.It's a'nt fast as as you think.We also got problem with slow itanium.We sratch out head to speed up system.

March 25, 2010 | Unregistered Commenterhafizan

Niall, IE6 is dying and fewer and fewer sites are supporting it. Expect your problems to get worse. You really should upgrade (or try to convice your IT department to upgrade) to a modern browser (IE8, FF3, Safari 4, etc.)

March 25, 2010 | Unregistered CommenterCACuzcatlan

4000% is really impressive ! Wow !

March 26, 2010 | Unregistered CommenterDiscodog

2 years ago, i had a problem of speed with mysql, it exec a query for a desicion cube in 2 hours. postgresql did the same query in 10 second without define index o someting similar. The problem is not the sql, the problem is the algorithm of a specific RDMBS

March 31, 2010 | Unregistered Commentermarxchante

Work divided between DB and app?
What about other layers? Dimensions?

e.g. "The Art of Scalability" talks about cube of axises along which applications can be split. A book full of good ideas.


george kyaw naing

April 4, 2010 | Unregistered Commentergeorge kyaw naing

30GB database and they were having scaling issues? Wow.

But then, they don't store data anyone cares about (who really cares about Digg), so it's not like incompetence is noticeable.

Try RDBMS-backed real-time write-heavy transaction processing in databases with 100s of billions of rows. And then come back and tell me it doesn't scale (It can and it does).

The problem isn't the RDBMS. The problem is programmers approaching the relational model like it's a key-value store and then complaining that it's not a key-value store when they spent all of an hour or two thinking about how to structure their data for performance, and now that they've discovered through pain how to structure it, applying V2 of their structure into the NoSQL solution and claiming it's the greatest thing since sliced bread.

I'll be waiting to see what happens when they start needing to mine that data, or need to do schema upgrades (Hahahaha), or run into integrity issues.

April 5, 2010 | Unregistered CommenterYetAnotherSQLScaler

HI YetAnotherSQLScaler,

I agree. I've commented about this about 8 weeks ago here:


You can forget data models, partitioning, etc. Developers won't do even domain models properly. Maybe you and I are BEHIND THE TIMES. Maybe we need to catch with EXTREME CODING etc. Some times, they won't even do glossary!!!

Maybe all applications and domains are as simple as "shopping cart and online store" modeling.
Maybe everything can be "refactored" whatever the phase of development.
Then, they will clutch at any straw that come up in their hands. That's sorting in PHP ;-)


george kyaw naing

April 5, 2010 | Unregistered Commentergeorge kyaw naing

I agree with the comment above--the main problem, although each problem differs in each situation-is that programmers treat this as something that it's not and get frustrated when they don't get the results they think are expected.

December 26, 2011 | Unregistered Commenterdonna

PostPost a New Comment

Enter your information below to add a new comment.
Author Email (optional):
Author URL (optional):
Some HTML allowed: <a href="" title=""> <abbr title=""> <acronym title=""> <b> <blockquote cite=""> <code> <em> <i> <strike> <strong>