Hot Scalability Links for February 12, 2010
- My Life With Hbase by Lars George. The hardscrabble tale of HBase's growth from infancy to maturity. A very good introduction to and overview of HBase.
- NoSQL Alternatives -- Common Principles and Patterns for Building Scalable Applications. Explore the common principles behind the major NoSQL alternatives and how they compare with the traditional database approach in terms of consistency, transactions, and query semantics. We will also explore how to make the transition between the two models smoother through support for standard interfaces such as JPA.
- Moore’s Law: The Future of Cloud Computing from the Bottom Up. Will Intel's 48 mega core chip change the world or be just another Spruce Goose?
- Rent or Own: Amazon EC2 vs. Colocation Comparison for Hadoop Clusters. It's much cheaper to own when you have a large, relatively fixed-size cluster and can find really cheap labor to maintain it all.
- A cloud in a plug - brilliant. A tiny, low-power, low-cost home server and NAS device powered by Tonido software that allows you to access your apps, files, music and media from anywhere.
- Seeking A Database That Doesn't Suck by Pixy Misa. Quick recap of databases that suck - or at least, suck for my purposes - and some that I'm still investigating.
- NoSQL GraphDB by Ricky Ho. Excellent overview of how Graph Databases work.
- Versioning data in S3 on AWS. Royans explains S3's new snapshot capability.
- Real-Life Multithreading. An excellent article by Bartosz Milewski showing how he did a concurrency rewrite for a real-life software product called Code Co-op.
- Spinlocks and Read-Write Locks. Learn more about common low level concurrency constructs.
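For readers who haven't met a spinlock before, here is a minimal sketch of the idea, not taken from the linked article. It assumes Python's `threading.Lock` can stand in for an atomic test-and-set flag; the `SpinLock` class and `run_counter` helper are hypothetical names made up for illustration. A real spinlock would be built on a hardware atomic instruction, and busy-waiting in Python is rarely a good idea in practice.

```python
import threading
import time


class SpinLock:
    """Toy spinlock: keeps retrying a non-blocking acquire instead of sleeping.

    Assumption: threading.Lock is used only as an atomic test-and-set flag.
    """

    def __init__(self):
        self._flag = threading.Lock()

    def acquire(self):
        # Spin until the test-and-set succeeds.
        while not self._flag.acquire(blocking=False):
            time.sleep(0)  # give other threads a chance to run between attempts

    def release(self):
        self._flag.release()

    def __enter__(self):
        self.acquire()
        return self

    def __exit__(self, *exc):
        self.release()


def run_counter(num_threads=4, increments=1000):
    """Increment a shared counter from several threads under the spinlock."""
    lock = SpinLock()
    total = 0

    def worker():
        nonlocal total
        for _ in range(increments):
            with lock:
                total += 1  # critical section kept as short as possible

    threads = [threading.Thread(target=worker) for _ in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return total
```

Spinning only pays off when the critical section is very short, as it is here. For anything longer, a blocking lock that puts the waiter to sleep is usually the better choice.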
Product Announcements
- Lucandra: A Cassandra-based Lucene backend. Aims to use Cassandra to provide a Lucene backend that is easier to deploy and maintain.
- App Engine SDK 1.3.1, Including Major Improvements to Datastore!. A product often defined more by its limits, GAE has just gotten a lot easier to use. GAE now has cursors, and the 1,000-item return limit is gone. While this is a much easier programming model, it will be interesting to see how it impacts scalability and resource usage. One of the canonical ideas of GAE has been to do work on writes to make reads fast; now programmers will be tempted to do more work on reads, which will probably be slower and cost a lot more. Another addition is automatic retries of datastore operations. Users should never have to retry, as only the lower layers know enough to tell whether a retry is necessary, so this makes life a lot easier and will remove a ton of dead code from programs.
- Gear6 Memcached Service for the Cloud Now Available on GoGrid. An on-demand service allows GoGrid customers to take advantage of Gear6’s robust, commercial-grade Memcached offering to speed up, scale out and ensure uptime for their web services and applications.
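As a rough illustration of the cursor idea mentioned in the App Engine item above, here is a minimal sketch of cursor-based paging over an in-memory list. It is not the App Engine datastore API: `fetch_page`, `fetch_all`, and the integer cursor are hypothetical stand-ins chosen only to show how a cursor lets a caller walk past a fixed per-request result limit.

```python
from typing import List, Optional, Tuple


def fetch_page(rows: List[int], cursor: Optional[int] = None,
               page_size: int = 1000) -> Tuple[List[int], Optional[int]]:
    """Return one page of rows plus a cursor for the next page.

    Assumption: the cursor is just an offset into the list. A real
    datastore cursor is an opaque token, not an integer.
    """
    start = 0 if cursor is None else cursor
    page = rows[start:start + page_size]
    end = start + len(page)
    # A cursor of None signals that there are no more results.
    next_cursor = end if end < len(rows) else None
    return page, next_cursor


def fetch_all(rows: List[int], page_size: int = 1000) -> List[int]:
    """Collect every row by following cursors from page to page."""
    results: List[int] = []
    cursor = None
    while True:
        page, cursor = fetch_page(rows, cursor, page_size)
        results.extend(page)
        if cursor is None:
            break
    return results
```

Each call still returns at most `page_size` rows, so the cost of a long read is spread across several requests rather than removed, which is the read-side cost the commentary above worries about.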