Todd Hoff's picture

Welcome to High Scalability

We started High Scalability to help you build successful scalable websites. This site tries to bring together all the lore, art, science, practice, and experience of building scalable websites into one place so you can learn how to build your system with confidence. Hopefully this site will move you further and faster along the learning curve of success. Please Start Here.

Todd Hoff's picture

It Must be Crap on Relational Dabases Week

It's hard to be a relational database lately. After years of faithful service everywhere you look the world is turning against you:

  • Recently at the NoSQL conference 150 revolutionaries met with their new anti-RDBMS arms suppliers. And you know what happens when revolutionaries are motivated, educated, funded, and well armed.
  • The revolution has gone mainstream when Computerworld writes No to SQL? Anti-database movement gains steam. It's not just whispers anymore, it's everywhere.
  • And perennial revolutionary Michael Stonebraker runs from blog to blog shouting the The End of a DBMS Era (Might be Upon Us). Relational vendors are selling legacy software, are 50x slower than other alternatives, and that can not stand.
  • The Greek Chorus on Hacker News sings of anger and lies.

    Certainly some say stick with the past. It's your fault, you aren't doing it right, give us another chance and all will be as it ever was. Some smirk saying this is nothing but a return to a more ancient time when IBM was King.

    But it's in the air. It's in the code. A revolution is coming. To what? That is what is not yet clear.

    See also:

  • NoSQL? by Curt Monash.
  • CouchDB says BigTable clones are too complex.
  • Yahoo! Developer Network Blog. Very nice summary of the different talks.

  • Todd Hoff's picture

    Product: Hbase

    Update 3: Presentation from the NoSQL Conference: slides, video.
    Update 2: Jim Wilson helps with the Understanding HBase and BigTable by explaining them from a "conceptual standpoint."
    Update: InfoQ interview: HBase Leads Discuss Hadoop, BigTable and Distributed Databases. "MapReduce (both Google's and Hadoop's) is ideal for processing huge amounts of data with sizes that would not fit in a traditional database. Neither is appropriate for transaction/single request processing."

    Hbase is the open source answer to BigTable, Google's highly scalable distributed database. It is built on top of Hadoop (product), which implements functionality similar to Google's GFS and Map/Reduce systems.

    Todd Hoff's picture

    Hypertable is a New BigTable Clone that Runs on HDFS or KFS

    Update 3: Presentation from the NoSQL conference: slides, video 1, video 2.
    Update 2: The folks at Hypertable would like you to know that Hypertable is now officially sponsored by Baidu, China’s Leading Search Engine. As a sponsor of Hypertable, Baidu has committed an industrious team of engineers, numerous servers, and support
    resources to improve the quality and development of the open source technology.

    Update: InfoQ interview on Hypertable Lead Discusses Hadoop and Distributed Databases. Hypertable differs from HBase in that it is a higher performance implementation of Bigtable.

    Skrentablog gives the heads up on Hypertable, Zvents' open-source BigTable clone. It's written in C++ and can run on top of either HDFS or KFS. Performance looks encouraging at 28M rows of data inserted at a per-node write rate of 7mb/sec.

    Todd Hoff's picture

    Product: Facebook's Cassandra - A Massive Distributed Store

    Update 2: Presentation from the NoSQL conference: slides, video.
    Update: Why you won't be building your killer app on a distributed hash table by Jonathan Ellis. Why I think Cassandra is the most promising of the open-source distributed databases --you get a relatively rich data model and a distribution model that supports efficient range queries. These are not things that can be grafted on top of a simpler DHT foundation, so Cassandra will be useful for a wider variety of applications.

    James Hamilton has published a thorough summary of Facebook's Cassandra, another scalable key-value store for your perusal. It's open source and is described as a "BigTable data model running on a Dynamo-like infrastructure." Cassandra is used in Facebook as an email search system containing 25TB and over 100m mailboxes.

  • Google Code for Cassandra - A Structured Storage System on a P2P Network
  • SIGMOD 2008 Presentation.
  • Video Presentation at Facebook
  • Facebook Engineering Blog for Cassandra
  • Anti-RDBMS: A list of distributed key-value stores
  • Facebook Cassandra Architecture and Design by James Hamilton

  • Todd Hoff's picture

    Product: Project Voldemort - A Distributed Database

    Update: Presentation from the NoSQL conference: slides, video 1, video 2.

    Project Voldemort is an open source implementation of the basic parts of Dynamo (Amazon’s Highly Available Key-value Store) distributed key-value storage system. LinkedIn is using it in their production environment for "certain high-scalability storage problems where simple functional partitioning is not sufficient."

    Todd Hoff's picture

    Anti-RDBMS: A list of distributed key-value stores

    Update 2: They are now called NoSQL databases. So keep up! Eric Lai wrote a good article in Computerworld No to SQL? Anti-database movement gains steam about the phenomena. There was even a NoSQL conference. It was unfortunately full by the time I wanted to sign up, but there are presentations by all the major players. Nice Hacker News thread too.
    Update: Some Notes on Distributed Key Stores by Leonard Lin. What's the best way to handle a fast growing system with 100M items that requires low latency and lots of inserts? Leanord takes a trip through several competing systems. The winner was: Tokyo Cabinet.

    Richard Jones has put together a very nice list of various key-value stores around the internets. The list includes: Project Voldemort, Ringo, Scalaris, Kai, Dynomite, MemcacheDB, ThruDB, CouchDB, Cassandra, HBase, and Hypertable. Richard also includes some commentary and their basic components (language, fault tolerance, persistence, client protocol, data model, docs, community).

    There's an excellent discussion in the comments of Paxos vs Vector Clock techniques for synchronizing writes in the face of network failures.

    Podcast about Facebook's Cassandra Project and the New Wave of Distributed Databases

    In this podcast, we interview Jonathan Ellis about how Facebook's open sourced Cassandra Project took lessons learned from Amazon's Dynamo and Google's BigTable to tackle the difficult problem of building a highly scalable, always available, distributed data store.

    Todd Hoff's picture

    Latency is Everywhere and it Costs You Sales - How to Crush it

    Update 5: Shopzilla's Site Redo - You Get What You Measure. At the Velocity conference Phil Dixon, from Shopzilla, presented data showing a 5 second speed up resulted in a 25% increase in page views, a 10% increase in revenue, a 50% reduction in hardware, and a 120% increase traffic from Google. Built a new service oriented Java based stack. Keep it simple. Quality is a design decision. Obsessively easure everything. Used agile and built the site one page at a time to get feedback. Use proxies to incrementally expose users to new pages for A/B testing. Oracle Coherence Grid for caching. 1.5 second page load SLA. 650ms server side SLA. Make 30 parallel calls on server. 100 million requests a day. SLAs measure 95th percentile, averages not useful. Little things make a big difference.
    Update 4: Slow Pages Lose Users. At the Velocity Conference Jake Brutlag (Google Search) and Eric Schurman (Microsoft Bing) presented study data showing delays under half a second impact business metrics and delay costs increase over time and persist. Page weight not key. Progressive rendering helps a lot.
    Update 3: Nati Shalom's Take on this article. Lots of good stuff on designing architectures for latency minimization.
    Update 2: Why Latency Lags Bandwidth, and What it Means to Computing by David Patterson. Reasons: Moore's Law helps BW more than latency; Distance limits latency; Bandwidth easier to sell; Latency help BW, but not vice versa; Bandwidth hurts latency; OS overhead hurts latency more than BW. Three ways to cope: Caching, Replication, Prediction. We haven't talked about prediction. Games use prediction, i.e, project where a character will go, but it's not a strategy much used in websites.
    Update: Efficient data transfer through zero copy. Copying data kills. This excellent article explains the path data takes through the OS and how to reduce the number of copies to the big zero.

    Latency matters. Amazon found every 100ms of latency cost them 1% in sales. Google found an extra .5 seconds in search page generation time dropped traffic by 20%. A broker could lose $4 million in revenues per millisecond if their electronic trading platform is 5 milliseconds behind the competition.

    Todd Hoff's picture

    Hot New Trend: Linking Clouds Through Cheap IP VPNs Instead of Private Lines

    You might think major Internet companies have a latency, availability, and bandwidth advantage because they can afford expensive dedicated point-to-point private line networks between their data centers. And you would be right. It's a great advantage. Or it at least it was a great advantage. Cost is the great equalizer and companies are now scrambling for ways to cut costs. Many of the most recognizable Internet companies are moving to IP VPNs (Virtual Private Networks) as a much cheaper alternative to private lines. This is a strategy you can effectively use too.

    This trend has historical precedent in the data center. In the same way leading edge companies moved early to virtualize their data centers, leading edge companies are now virtualizing their networks using IP VPNs to build inexpensive private networks over a shared public network. In kindergarten we learned sharing was polite, it turns out sharing can also save a lot of money in both the data center and on the network.

    The line of reasoning for adopting IP VPNs goes something like this:

    Syndicate content