This is a guest post by Aaron Sullivan, Director & Principal Engineer at Rackspace.
We all love a silver bullet. Over the last few years, if I were to split the outcomes that I see with Rackspace customers who start using SSDs, the majority of the outcomes fall under two scenarios. The first scenario is a silver bullet—adding SSDs creates near-miraculous performance improvements. The second scenario (the most common) is typically a case of the bullet being fired at the wrong target—the results fall well short of expectations.
With the second scenario, the file system, data stores, and processes frequently become destabilized. These demoralizing results, however, usually occur when customers are trying to speed up the wrong thing.
A common phenomena at the heart of the disappointing SSD outcomes is serialization. Despite the fact that most servers have parallel processors (e.g. multicore, multi-socket), parallel memory systems (e.g. NUMA, multi-channel memory controllers), parallel storage systems (e.g. disk striping, NAND), and multithreaded software, transactions still must happen in a certain order. For some parts of your software and system design, processing goes step by step. Step 1. Then step 2. Then step 3. That’s serialization.
And just because some parts of your software or systems are inherently parallel doesn’t mean that those parts aren’t serialized behind other parts. Some systems may be capable of receiving and processing thousands of discrete requests simultaneously in one part, only to wait behind some other, serialized part. Software developers and systems architects have dealt with this in a variety of ways. Multi-tier web architecture was conceived, in part, to deal with this problem. More recently, database sharding also helps to address this problem. But making some parts of a system parallel doesn’t mean all parts are parallel. And some things, even after being explicitly enhanced (and marketed) for parallelism, still contain some elements of serialization.
How far back does this problem go? It has been with us in computing since the inception of parallel computing, going back at least as far as the 1960s(1). Over the last ten years, exceptional improvements have been made in parallel memory systems, distributed database and storage systems, multicore CPUs, GPUs, and so on. The improvements often follow after the introduction of a new innovation in hardware. So, with SSDs, we’re peering at the same basic problem through a new lens. And improvements haven’t just focused on improving the SSD, itself. Our whole conception of storage software stacks is changing, along with it. But, as you’ll see later, even if we made the whole storage stack thousands of times faster than it is today, serialization will still be a problem. We’re always finding ways to deal with the issue, but rarely can we make it go away.