Robert Haas in his SURGE Recap of the Surge conference, reflected a bit, and came up with an interesting checklist of general themes from what he was seeing. I'm directly quoting his post, so please see the post for a full discussion. He uses this framework to think about the larger picture and where PostgreSQL stands in its progression.
- Make use of the academic literature. Inventing your own way to do something is fine, but at least consider the possibility that someone smarter than you has thought about this problem before.
- Failures are inevitable, so plan for them. Try to minimize the possibility of cascading failures, and plan in advance how you can operate in degraded mode if disaster (or the Slashdot effect) strikes.
- Disk technology matters. Drive firmware bugs are common and nightmarish, and you can expect very limited help from the manufacturer, especially if the drive is billed as consumer-grade rather than enterprise-grade. SSDs can save you a lot of money, both because a given number of dollars buys more IOs-per-second, and because electricity isn't free.
- Large data sets require horizontal scalability. In the era of 1TB drives, "large" doesn't mean quite what it used to, but even though the amount of data you can manage with one machine is growing all the time, the amount of data people want to manage is growing even faster.