Strategy: Get Servers for Free and Make Users Happy by Turning on Compression

Edward Capriolo has a really interesting article on the dramatic performance improvement he saw after turning on compression for Cassandra. The idea:

  • Enabling compression shrank 71GB of data down to 31GB, which let more of the data fit in RAM and reduced disk IO to nearly nothing.
  • Compression means more data can be stored, which is like buying more machines without having to spend more money.
  • Compression means serving more data out of RAM, which means clients are happier because of the performance improvements.
  • The cost is higher CPU usage to perform the compression and decompression. But disk IO is orders of magnitude slower than decompression, and most servers have CPU to burn.
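
To get a feel for the arithmetic behind these points, here is a minimal Python sketch. It uses the standard library's zlib purely as a stand-in (it is not necessarily the compressor Cassandra uses), and the sample payload and the 32GB RAM figure are made-up assumptions for illustration. Only the 71GB and 31GB numbers come from the article.

```python
import zlib


def compression_stats(raw: bytes, level: int = 6) -> dict:
    """Compress raw with zlib and report sizes and the compression ratio."""
    compressed = zlib.compress(raw, level)
    # Round-trip check: decompression must give back the original bytes.
    assert zlib.decompress(compressed) == raw
    return {
        "raw_bytes": len(raw),
        "compressed_bytes": len(compressed),
        "ratio": len(raw) / len(compressed),
    }


def effective_capacity_gb(ram_gb: float, ratio: float) -> float:
    """Rough amount of uncompressed data that fits in ram_gb of RAM."""
    return ram_gb * ratio


# Ratio implied by the article's figures: 71GB shrank to 31GB.
article_ratio = 71 / 31

# Hypothetical sample payload; repetitive data compresses very well,
# so real-world ratios will usually be lower than this one.
sample = b"user_id=42;status=active;" * 1000
stats = compression_stats(sample)

print(f"sample ratio: {stats['ratio']:.1f}x")
print(f"article ratio: {article_ratio:.2f}x")
# Assumed 32GB of RAM, just to show the "free servers" effect.
print(f"32GB of RAM holds about {effective_capacity_gb(32, article_ratio):.0f}GB of data")
```

At the article's ratio of roughly 2.3x, the same RAM holds a bit more than twice as much data, which is where the "servers for free" framing comes from. The actual ratio depends heavily on the data and the compressor, so measure on your own data before counting on it.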

Edward's article is well written, has the specifics on how to turn on compression for Cassandra, pretty graphs, and lots more details.