Stuff The Internet Says On Scalability For August 12, 2011

Submitted for your scaling pleasure, you may not scale often, but when you scale, please drink us:

  • Quotably quotable quotes:
    • @mardix : There is no single point of truth in #NoSQL . #Consistency is no longer global, it's relative to the one accessing it. #Scalability
    • @kekline : RT @CurtMonash: "...from industry figures, Basho/Riak is our third-biggest competitor." How often do you encounter them? "Never have" #nosql
    • @dave_jacobs : Love being in a city where I can overhear a convo about Heroku scalability while doing deadlifts. #ahsanfrancisco
    • @satheeshilu : Doctor at #hospital in india says #ge #healthcare software is slow to handle 100K X-rays an year.Scalability is critical 4 Indian #software
    • @sufw : How can it be possible that Tagged has 80m users and I have *never* heard of it!?!
    • @EventCloudPro : One of my vacation realizations? Whole #bigdata thing has turned into a lotta #bighype - many distinct issues & nothing to do w/ #bigdata
  • NoSQL as dynamic duos. NoSQL combinations - what works best? A common pattern seems to be Redis as a cache and Riak as the distributed backend.
  • Coping With Inconsistent Databases. Victor Nicollet on how to prevent eventual consistency from locking in permanent errors. «On EVENT apply CHANGE» vs «If STATE-A then STATE-B» logic. Events are lossy, which introduces an irreparable inconsistency as well, so I don't buy the argument completely, but it is a good discussion.
  • Tuning XenServer for Maximum Scalability. Nicholas Rintalan focuses on single server scalability, tuning XenServer's control domain: change the number of CPUs allocated to dom0, increase memory for dom0, increase heap size, install irqbalance.
  • VoltDB describes how they recover from events most catastrophic. Distributed databases need to figure this sort of stuff out and it's interesting to see the choices made. 
  • Database best practices for future scalability. Flamingcow with some good and simple advice on MySQL schema design: use InnoDB, use MySQL for relational data only, eschew hierarchies, use BIGINT UNSIGNED NOT NULL, use BIGINT instead of INT for all keys, use an ORM, eschew triggers and stored procedures, prefer joins over subselects, eschew views, minimize network roundtrips.
  • Jeremiah Peschka with a funny take on The Stages of Growth, modeled on the stages of grief. Technically speaking, people don't go through these stages in order, or at all, but I think you'll accept the advice after a bit of anger-driven depression that can't be solved by bargaining. Don't deny it.
  • Akamai runs a good chunk of the Internet and they rarely go down, yet even they have outages. Amazon had some issues too.
  • Andy Firth laments The demise of the low level Programmer. Real men use a line editor by fluctuating a magnetic field using bar magnets.
  • Things that could change the world: Nanodiamond transistors and house-sized computers are coming. Building transistors and logic gates from thin films of nanodiamond using standard semiconductor fabrication techniques. 
  • Finagle - A fault tolerant, protocol-agnostic network client and server. Finagle is an asynchronous network stack for the JVM that you can use to build asynchronous Remote Procedure Call (RPC) clients and servers in Java, Scala, or any JVM-hosted language. Finagle provides a rich set of tools that are protocol independent.
  • blitz.io: How we use Heroku, AWS and CouchDB. Run multiple m1.small instances across all five AWS regions; evented C++ can generate over 50,000 concurrent HTTP requests on a single EC2 instance; use cloudinit and the EC2 user-data to boot; Sinatra/Ruby on the web tier; use background workers; CouchDB is the backbone, with a cluster running across Virginia and California using multi-master replication.
  • Davy Suvee with very cool uses of algorithms using NoSQL: Molecular similarity theory, MongoDB data model, MongoDB molecular similarity query.
  • Slidedecks of the Keynote Speakers from ECOOP 2011
  • Programming and Scaling. Alan Kay covers a lot of interesting territory. Not much on scaling, but that won't matter. DSLs are kay, er, key.
  • Etsy open sources their deployment tool: Deployinator. Deployinator is a one button web-based deployment app. Hit that button and code goes to our webservers and is serving requests in almost no time. Video. HackerNews
  • Packet Pushers tell you how to get your hosts hardened to protect internet-facing apps.
  • 1024cores with some updates: Relacy Race Detector 2.4, ThreadSanitizer, Location-Based Memory Fences
  • Storage is Cheap, Don't Mutate - Copy on Write. For read-heavy use cases, duplicate nodes and relationships on writes to remove blocking, simplify caching, ease conflict resolution, and stay on the right side of history.
  • Wordnik has some API Swagger, as they say on So You Think You Can Dance. Swagger is a specification and complete framework implementation for describing, producing, consuming, and visualizing RESTful web services. It's an effort to converge the hugely disparate APIs out there. REST needs to wake up and have a way to describe and document APIs. A Hacker News poll shows APIs are a pain point.
  • Do you need a challenge? Is your brain on cruise control? Why Philosophers Should Care About Computational Complexity by Scott Aaronson could be the remedy: I argue that computational complexity theory---the field that studies the resources (such as time, space, and randomness) needed to solve computational problems---leads to new perspectives on the nature of mathematical knowledge, the strong AI debate, computationalism, the problem of logical omniscience, Hume's problem of induction and Goodman's grue riddle, the foundations of quantum mechanics, economic rationality, closed timelike curves, and several other topics of philosophical interest. I end by discussing aspects of complexity theory itself that could benefit from philosophical analysis.
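
For anyone curious what the "Storage is Cheap, Don't Mutate" item above might look like in practice, here is a minimal sketch of copy-on-write in Python. It is not taken from the linked post: the names Node and VersionedStore are hypothetical, and copying the whole snapshot on every write is a simplification (a real store would likely share unchanged structure between versions).

```python
from dataclasses import dataclass, replace
from typing import Dict, List, Optional, Tuple


@dataclass(frozen=True)
class Node:
    """Hypothetical immutable node; frozen=True forbids in-place mutation."""
    node_id: str
    name: str
    links: Tuple[str, ...] = ()


class VersionedStore:
    """Hypothetical store: every write appends a new snapshot, old ones never change."""

    def __init__(self) -> None:
        # Version 0 is the empty snapshot.
        self._versions: List[Dict[str, Node]] = [{}]

    def latest(self) -> int:
        return len(self._versions) - 1

    def read(self, node_id: str, version: Optional[int] = None) -> Node:
        # Readers pick a snapshot and never block on writers.
        index = self.latest() if version is None else version
        return self._versions[index][node_id]

    def write(self, node: Node) -> int:
        # Copy the newest snapshot, change the copy, publish it as a new version.
        snapshot = dict(self._versions[-1])
        snapshot[node.node_id] = node
        self._versions.append(snapshot)
        return self.latest()


store = VersionedStore()
v1 = store.write(Node("a", "Alice"))
# "Updating" a node means writing a modified copy, not changing the original.
v2 = store.write(replace(store.read("a"), name="Alicia", links=("b",)))
```

A reader holding version v1 still sees the old node after the second write, which is what makes caching and conflict resolution simpler: a given version never changes underneath you.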