Hot Scalability Links for June 16, 2010

  • You're Doing it Wrong by Poul-Henning Kamp. Don't look so guilty, he's not talking about you know what, he's talking about writing high-performance server programs: Not just wrong as in not perfect, but wrong as in wasting half, or more, of your performance. What good is an O(log2(n)) algorithm if those operations cause page faults and slow disk operations? For most relevant datasets an O(n) or even an O(n^2) algorithm, which avoids page faults, will run circles around it.
  • A Microsoft Windows Azure primer: the basics by Peter Bright. Nice article explaining the basics of Azure and how it compares to Google and Amazon.
  • A call to change the name from NoSQL to Postmodern Databases. Interesting idea, but the problem is the same one I have with Postmodern Art: when is it? I always feel like I'm in the post-post modern period, yet for art it's really in the early 1900s. Let's save future developers from this existential time crisis.
  • Constructions from Dots and Lines by Marko A. Rodriguez, Peter Neubauer. Delightful yet in-depth explanation of the complex world of graph data structures. To make use of the graphs beyond simply representing their explicit structure, graph traversal frameworks and algorithms have been developed in order to shape graphs by driving the evolution of the entities that they model—e.g. humans and their relationships to one another and the objects of their world.
  • Scaling the Social Graph in the Cloud using InfiniteGraph by Lead Architect Darren Wood. This was the talk he gave at Gluecon, and it was a good intro to their product and the challenges of distributing graph data across more than one node.
  • The Art of Scalability - DPC10 wrapup by Lorenzo Alberton. A wrapup of the Dutch PHP Conference 2010.
  • What’s The Secret Behind Diapers.com Success? A Kiva Robot Warehouse by Aaron Saenz. Wow, using robots to build automated warehouses. By the end of 2010 there will be 40k products and 100k by the end of 2011. All at prices up to 25% less than at neighborhood stores. In my more Luddite moments I have to hope robots can afford to buy those products too.
  • The Resurgence of Parallelism by Peter J. Denning, Jack B. Dennis. Parallelism is not new; the realization that it is essential for continued progress in high-performance computing is. Parallelism is not yet a paradigm, but may become so if enough people adopt it as the standard practice and standard way of thinking about computation.
  • Videos From Streaming Media East and CDN Summit, Now Online by Dan Rayburn. I know a lot of people are into video. If that's your focus, there will be something for you here.
  • Hive - A Petabyte Scale Data Warehouse. Presentation given at ICDE 2010 on how Facebook uses Hive in their data warehouse.
  • Introduction to the Google App Engine DataStore by Luca Masini. A nice tight overview. We are managed to think "relational" because the winning technology in the last 35 years is the Relational Database Management System. Often we think that everything can be well designed with the relational model, but this may be not true, just think the effort we need to do every time we map our Java objects, also with modern ORMs.
  • Namecast - Welcome to the self-healing network. Yours.
  • Schooner - Memcached/NoSQL Appliance.

NorthScale - Get started with NorthScale Memcached Server today!
