Hot Scalability Links For Sep 17, 2010

  • Disqus - Scaling the World's Largest Django App. Interesting overview of a commenting system with 75 million comments and 250 million visitors. Lots of good details on how they partition their database, testing, continuous integration, feature switches, caching, delayed signals, and more.
  • Things I learnt tracking a billion events in 24 hours: Know your host, Scaling isn't just servers, My servers need to talk to me more, Kill switches for users, What you don't know is the problem, Don't mix server roles, Know your most important users outside of your site.
  • Tweets of Gold:
    • georgebarnett: I read High Scalability for useful articles about large scaling. Sadly though, nothing useful ever shows up. #NoLongerBothering
    • northscale: wow that is fast! :) RT @cgoldberg: was just running > 100k ops/sec against my 2-node #Membase cluster... zazooom #nosql
    • turbofunctor: The root of many (horizontal) scalability problems is an application level access to a writable filesystem. (Thus, #appengine.)
    • gwenshap: highscalability.com is like Vogue for IT operations. Map-reduce is so last season.
  • Distributed Systems: scalability and high availability. A very nice slide deck by Renato Lucindo on Scalability, High Availability, Problems, and Tips & Tricks. 
  • Applying Scalability Patterns to Infrastructure Architecture. Lori MacVittie explains how DevOps can help with the practical implementation of scalable systems.
  • Scaling the BBC iPlayer to handle demand by Simon Frost. The site will have to support a massive number of page views and users every day, on average 8 million page views a day for 1.3 million users. Some of their strategies: We proved our architecture before we built it, We cache a lot, We broke the page into personalised and standard components, We use loads of servers, We load tested the site before we launched.
  • Cool, very safe-for-work cartoon on How Scalr Works. Maybe the basis for a new Batman reboot?
  • Machine learning on top of GFS at Google by Greg Linden. That pesky network is a scarce, shared resource, and it often takes a network brownout to remind us that virtual machines are not all it takes to get everyone playing nice.
  • On the Complexity of Processing Massive, Unordered, Distributed Data. An existing approach for dealing with massive data sets is to stream over the input in few passes and perform computations with sublinear resources. This method does not work for truly massive data where even making a single pass over the data with a processor is prohibitive.
  • Cloudtop Applications by Anil Dash. One interesting pattern I've noticed popping up around my favorite new apps these days is that they follow what I'd call a "cloudtop" design.
  • Photoshop Scalability: Keeping It Simple. Excellent interview on the issues around exploiting multiple processors in a real and complex application. Photoshop's parallelism, born in the era of specialized expansion cards, has managed to scale well for the two- and four-core machines that have emerged over the past decade. As Photoshop's engineers prepare for the eight- and 16-core machines that are coming, however, they have started to encounter more and more scaling problems, primarily a result of the effects of Amdahl's law and memory-bandwidth limitations.
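
To put the Amdahl's law remark in the Photoshop item in concrete terms, here is a minimal sketch of the standard formula, speedup = 1 / ((1 - p) + p / n), where p is the fraction of work that can run in parallel and n is the number of cores. The 90% parallel fraction below is an illustrative assumption, not a figure from the interview, and the function name is made up for this example.

```python
def amdahl_speedup(parallel_fraction: float, cores: int) -> float:
    """Theoretical speedup under Amdahl's law for a fixed-size workload."""
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if cores < 1:
        raise ValueError("cores must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    # The serial part runs at full cost; only the parallel part is divided
    # across the available cores.
    return 1.0 / (serial_fraction + parallel_fraction / cores)


if __name__ == "__main__":
    # Assumed workload: 90% parallelizable (hypothetical value).
    for cores in (2, 4, 8, 16):
        print(f"{cores:>2} cores: {amdahl_speedup(0.9, cores):.2f}x")
```

Under that assumption, going from 4 to 16 cores only takes the theoretical speedup from about 3.1x to 6.4x, and no number of cores can push it past 10x. Memory-bandwidth limits, which the interview also mentions, are not modeled here and would lower the real numbers further.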