Hot Scalability Links for February 24, 2010

  • Cassandra @ Twitter: An Interview with Ryan King. Great interview by Alex Popescu on Twitter's thought process for switching to Cassandra. Twitter chose Cassandra because it had more big system features out of the box. Is that Cassandra FTW?
  • I Had Downtime Today. Here’s What I’m Doing About It by Patrick McKenzie. Awesome deep dive into went wrong with Bingo Card Creator. Sh*t happens. How do you design a process to help prevent it from happening and how do you deal with problems with integrity when they do?
  • High Availability Principle : Request Queueing by Ashish Soni. Queue request to ride out traffic spikes: 1) Request Queuing allows your system to operate at optimal throughput. 2) Your users only experience linear degradation versus exponential degradation. 3) Your system experiences NO degradation.
  • pfffft twatter tweeter by Knowbuddy. The reason you should care [about NoSQL] is because now you have more options--you're not stuck trying to wedge your system into a relational model if you don't want to. And isn't /. all about freedom of choice?
  • Wordpress, Varnish and Edge Side Includes. Using Varnish to go from .63 requests per second to 537.44 requests per second.
  • Facebook’s Petabyte Scale Data Warehouse using Hive and Hadoop by Ashish Thusoo and Namit Jain. How does Facebook deal with 12 TB of compressed new data everyday? They get a bad case of the Hives.
  • Click to read more ...


    Sponsored Post: Job Openings - Squarespace

    There was a bit of drama earlier when I posted a free job opening for Zynga. It caused unfortunate and just plain wrong accusations. It also caused a number of requests for more free job posts, which I should have anticipated, but obviously I can't let this blog become cluttered with that kind of stuff. Earlier I tried a job board type service, but that never really worked out. So what to do? Someone suggested a sponsored post approach and I think that's a good compromise. It minimizes the noise, let's people know about work, and brings in a little revenue. It works like an advertisement. If you are interested please let me know. When we have any job openings there will be a sponsored post like this one, that you can easily ignore or pay attention to, depending on your situation.

    Squarespace Looking for Full-time Scaling Expert

    Interested in helping a cutting-edge, high-growth startup scale? Squarespace, which was profiled here last year in Squarespace Architecture - A Grid Handles Hundreds of Millions of Requests a Month and also hosts this blog, is currently in the market for a crack scalability engineer to help build out its cloud infrastructure. Squarespace is very excited about finding a full-time scaling expert.

    Interested applicants should go to for more information.


    When to migrate your database?

    Why migrate your database? Efficiency and availability problems are harming your business – reports are out of date, your batch processing window is nearing its limits, outages (unplanned/planned) frequently halt work. Database consolidation – remove the costs that result from a heterogeneous database environment (DBAs time, database vendor pricing, database versions, hardware, OSs, patches, upgrades etc.). OK, so the driving forces for migration are clear,  what now?



    Twitter’s Plan to Analyze 100 Billion Tweets

    If Twitter is the “nervous system of the web” as some people think, then what is the brain that makes sense of all those signals (tweets) from the nervous system? That brain is the Twitter Analytics System and Kevin Weil, as Analytics Lead at Twitter, is the homunculus within in charge of figuring out what those over 100 billion tweets (approximately the number of neurons in the human brain) mean.

    Twitter has only 10% of the expected 100 billion tweets now, but a good brain always plans ahead. Kevin gave a talk, Hadoop and Protocol Buffers at Twitter, at the Hadoop Meetup, explaining how Twitter plans to use all that data to an answer key business questions.

    What type of questions is Twitter interested in answering? Questions that help them better understand Twitter. Questions like:

    Click to read more ...


    Seven Signs You May Need a NoSQL Database

    While exploring deep into some dusty old library stacks, I dug up Nostradamus' long lost NoSQL codex. What are the chances? Strangely, it also gave the plot to the next Dan Brown novel, but I left that out for reasons of sanity. About NoSQL, here is what Nosty (his friends call him Nosty) predicted are the signs you may need a NoSQL database...

    Click to read more ...


    Scaling Ambition at StackOverflow

    Joel Spolsky and Jeff Atwood are raising VC money for StackOverflow. This is interesting for three reasons: 1) Joel has always seemed like a keep it small and grow organically type of guy, so this is a big step in a different direction. 2) It means they think there's a very big market in the Q&A space and they mean to capture as much as the market as possible. 3) Most importantly for this blog, Joel gives some good advice on when to stay fresh and local and when it's time to jump for the brass ring, scale up your ambition, and go for VC money. Please see Joel's blog post for the details, but here's when to go VC:

    Click to read more ...


    The Amazing Collective Compute Power of the Ambient Cloud

    This is an excerpt from my article Building Super Scalable Systems: Blade Runner Meets Autonomic Computing in the Ambient Cloud.

    Earlier we talked about how a single botnet could harness more compute power than our largest super computers. Well, that's just the start of it. The amount of computer power available to the Ambient Cloud will be truly astounding.

    2 Billion Personal Computers

    Click to read more ...


    Hot Scalability Links for February 12, 2010

    1. My Life With Hbase by Lars George. The hardscabble tale of Hbase's growth from infancy to maturity. A very good introduction and overview of Hbase.
    2. NoSQL Alternatives -- Common Principles and Patterns for Building Scalable Applications. Explore the common principles behind the major NOSQL alternatives and how they compared with traditional database approach in terms of consistency, transaction and query semantics. We will also explore how we can make the transition between the two models smoothers through the support of standard interfaces such as JPA.
    3. Moore’s Law: The Future of Cloud Computing from the Bottom Up. Will Intel's 48 mega core chip change the world or be just another Spruce Goose?
    4. Rent or Own: Amazon EC2 vs. Colocation Comparison for Hadoop Clusters. It's much cheaper to own when you have a large relatively fixed size cluster and can find really cheap labor to maintain it all.
    5. A cloud in a plug - brilliant. A tiny, low-power, low-cost home server and NAS device powered by Tonido software that allows you to access your apps, files, music and media from anywhere.
    6. Seeking A Database That Doesn't Suck by Pixy Misa. Quick recap of databases that suck - or at least, suck for my purposes - and some that I'm still investigating.

    Click to read more ...


    ElasticSearch - Open Source, Distributed, RESTful Search Engine

    ElasticSearch is an open source, distributed, RESTful search engine built on top of Lucene. Its features include:

    • Distributed and Highly Available Search Engine.
      • Each index is fully sharded with a configurable number of shards.
      • Each shard can have zero or more replicas.
      • Read / Search operations performed on either replica shard.

    Click to read more ...


    How FarmVille Scales to Harvest 75 Million Players a Month

    Several readers had follow-up questions in response to this article. Luke's responses can be found in How FarmVille Scales - The Follow-up.

    If real farming was as comforting as it is in Zynga's mega-hit Farmville then my family would have probably never left those harsh North Dakota winters. None of the scary bedtime stories my Grandma used to tell about farming are true in FarmVille. Farmers make money, plants grow, and animals never visit the red barn. I guess it's just that keep-your-shoes-clean back-to-the-land charm that has helped make FarmVille the "largest game in the world" in such an astonishingly short time.

    How did FarmVille scale a web application to handle 75 million players a month? Fortunately FarmVille's Luke Rajlich has agreed to let us in on a few their challenges and secrets. Here's what Luke has to say...

    Click to read more ...