IFJlYWwgV29ybGQgV2ViOiBQZXJmb3JtYW5jZSAmIFNjYWxhYmlsaXR5

We've referenced this 189 slide masterpiece by Ask Bjorn Hansen before, but it was hidden without its own first class link. He describes his presentation as 3 hours of 5 minute lightening talks and that sounds about right.

The presentation covers: overall platform and architecture considerations involved in tuning applications from a holistic perspective. You’ll be shown design scalable architectures for dynamic, high-volume web sites. Topics covered include caching, scalable database design, replication architecture, load-balancing, and architectural decisions derived from many years of experience.

His prime directive of scaling: Think Horizontally at every point in your architecture, not just at the web tier.

You may not agree with everything, but there's a lot of useful advice. Here's a summary of some of what is covered:

  • Benchmarking
  • Vertical scaling sucks.
  • Horizontal scaling rocks.
  • Run many application servers
  • Don't keep state in the app server
  • Be stateless
  • Optimization is necessary, but is different than scalability.
  • Cache things you hit all the time.
  • Measure, don't assume, check.
  • Make pages static.
  • Caching is a trade-off.
  • Cache full pages.
  • Cache partial pages.
  • Cache complex data.
  • MySQL query cache is flushed on update.
  • Cache invalidation is hard.
  • Replication scales reads, not writes.
  • Partition to scale writes. 96% of applications can skip this step.
  • Master-master setup facilitates on-line schema changes.
  • Create summary tables and summary databases rather than do COUNT and GROUP-BY at runtime.
  • Make code idempotent. If it fails you should just be able to run it again.
  • Load data asynchronously. Aggregate updates into batches.
  • Move processing to application and out of the database as much as possible.
  • Stored procedures are dangerous.
  • Add more memory.
  • Enable query logging and take a look at what your app is doing.
  • Run different MySQL instances for different work loads.
  • Config tuning helps, query tuning works.
  • Reconsider persistent DB connections.
  • Don't overwork the database. It's hard to scale.
  • Work in parallel.
  • Use a job queuing system.
  • Log http requests.
  • Use light processes for light tasks.
  • Build on APIs internally. Clean loosely coupled APIs are easy to scale.
  • Don't incur technical debt.
  • Automatically handle failures.
  • Make services that always work.
  • Load balancing is the key to horizontal scaling.
  • Redundancy is not load-balancing. Always have n+1 capacity.
  • Plan for disasters.
  • Make backups.
  • Keep software deployments easy.
  • Have everything scripted.
  • Monitor everything. Graph everything.
  • Run one service per server.
  • Don't ever swap memory for disk.
  • Run memcached if you have extra memory.
  • Use memory to save CPU or IO. Balance memory vs CPU vs IO.
  • Netboot your application servers.
  • There's lot of good slides on what to graph.
  • Use a CDN.
  • Use YSlow to find client side problems.

    This is just a high level blitz through the presentation. Topics are given a lot more detail in the presentation. Audio of Ask's dulcet tones would be nice, but there's still a lot to learn here.