Myth: Google Uses Server Farms So You Should Too - Resurrection of the Big-Ass Machines

For a long epoch the scaling strategy was to build ever bigger supercomputers. I had the pleasure of programming on a few large massively multiprocessor machines from SGI and DEC. Beautiful, highly specialized machines that were very expensive. They met the double-tap extinction event of Moore's law and a Google-inspired era of commodity-machine clusters and extreme software parallelism. Has the tide turned? Does it now make more sense to use big machines instead of clusters?

In Big-Ass Servers™ and the myths of clusters in bioinformatics, Jerm makes the case that for bioinformatics it's more cost-effective to buy a Big-Ass Server than to use a cluster of machines and a lot of specialized parallel programming techniques. It's a classic scale-up argument that has been made more attractive by the recent arrival of relatively inexpensive large machines. SeaMicro has developed a 512-core machine. Dell has a new 96-core server. Supermicro has 48-core machines. These are new options in the scale-up game that haven't been available before and could influence your architecture choice.

Jerm's reasons for preferring big-ass servers:

  • The development of multicore/multiprocessor machines and memory capacity has outpaced the speed of networks. NGS analyses tend to be memory-bound and IO-bound rather than CPU-bound, so relying on a cluster of smaller machines can quickly overwhelm the network (see the back-of-envelope sketch after this list).
  • NGS has pushed high-performance computing beyond a few staples like BLAST and protein structure prediction to dozens of different little analyses, with tools that change on a monthly basis or are homegrown to deal with special circumstances. There isn't the time or ability to rewrite each of these for parallel architectures.
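
To make the bandwidth argument concrete, here's a rough back-of-envelope sketch. The numbers (a 500 GB working set, a 10 GbE link, ~50 GB/s of local memory bandwidth) are illustrative assumptions, not measurements, but they show why shuffling data between nodes can dominate an analysis that one big-memory machine would handle locally:

```python
# Back-of-envelope sketch (illustrative numbers, not benchmarks): moving a
# dataset across a cluster network vs. reading it from local RAM on one
# big machine. All figures below are assumptions for the sake of argument.

DATASET_GB = 500        # hypothetical NGS working set
NET_GB_PER_S = 10 / 8   # 10 Gigabit Ethernet ~= 1.25 GB/s per link
RAM_GB_PER_S = 50       # rough aggregate memory bandwidth of one large server

def transfer_hours(size_gb: float, bandwidth_gb_per_s: float) -> float:
    """Hours needed to move size_gb at the given bandwidth."""
    return size_gb / bandwidth_gb_per_s / 3600

print(f"Over the network: {transfer_hours(DATASET_GB, NET_GB_PER_S):.2f} h")
print(f"From local RAM:   {transfer_hours(DATASET_GB, RAM_GB_PER_S):.3f} h")
```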

Jerm then goes on to tackle several myths about why a cluster is not necessarily the best solution; the whole post is well worth reading. The key seems to be the type of problem you are solving. For Google solving search, investing in a large permanent cluster to do one thing very well makes a lot of sense. For someone trying to solve a particular problem and then move on to the next one, investing a lot of time in writing a parallelized solution doesn't make as much sense. It makes more sense to find a big-ass machine, program simply, and let it rip.
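
As a hypothetical illustration of "program simply and let it rip", here's what per-sample parallelism can look like on a single big machine using nothing but Python's standard library. The `analyze_sample` function and the file names are placeholders I've made up, not anything from Jerm's post:

```python
# A minimal sketch: fan a per-sample analysis out across all local cores
# with no cluster scheduler and no distributed framework.
from multiprocessing import Pool, cpu_count

def analyze_sample(sample_path: str) -> str:
    # Placeholder for a real analysis step (alignment, variant calling, ...).
    return f"processed {sample_path}"

if __name__ == "__main__":
    samples = [f"sample_{i}.fastq" for i in range(96)]  # hypothetical inputs
    with Pool(processes=cpu_count()) as pool:           # one worker per core
        for result in pool.imap_unordered(analyze_sample, samples):
            print(result)
```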

If agility and flexibility are key, then clusters may not be the right tool for your big data or big computation problem. It's interesting how this right-tool-for-the-job idea keeps popping up and how technological innovation brings different options in and out of fashion. Now might be an inflection point in tool choice with the rise of the new big machines.