Stuff The Internet Says On Scalability For January 6, 2012

OMG, it's 2012:

  • Harry Bombarda Twilight; 200 Million: Chinese online shoppers; Quantum 150 qubit computer: all the power of today's supercomputers;  Sperm: two aspirins worth could repopulate the world; 1 Billion: the number of iOS and Android apps downloaded in a week; Watson: 250 Servers, 2,880 cores, 10 racks, 16 Terabytes RAM, 80 Teraflops; Reddit: 2 Billion Pageviews
  • Quotable Quotes:
    • Robert Martin : The hallmark of a really good architecture is that it allows major decisions to be deferred. 
    • Building Memory-efficient Java Applications: Practices and Challenges : More abstractions = less awareness of costs.
    • Ian Muir : When we do something that Microsoft did not anticipate, it's nothing but pain.
    • @kekline : Want to know a secret - NoSQL's rapid growth is really about NoNormalization
    • Jeremy Zawodny : The fact that I can look back on code I wrote a few years ago and identify ways that I’d do it better is good. It means I’m still learning. But the fact that I can successfully resist the urge to change the code is even better.
    • John Boyd : people first, ideas second, hardware last.
  • So cool: Glowing bacteria biopixels: The sensor displays of the future. Bacteria talk to each other using quorum sensing, which means they talk using molecules. This doesn't scale to millions of bacteria. The solution: microfluidic chips were designed to harness the localized trigger and broadcast it to the plethora of shared colonies existing on the chip. Each of the bacteria cells on the microfluidic chip is called a “biopixel.” The future of sensing technology is going to be in living sensors.
  • The secret to Watson's success: Massive parallelism; Many experts; Pervasive confidence estimation; Integrate shallow and deep knowledge; developed using Apache UIMA framework.
  • Surge 2011 Conference videos are now online. Looks like there are some good vids, if you have the urge to Surge.
  • Ben Stopford with a great set of Interesting Links Dec 2011. Intel is breaking the teraflop boundary, mobile is giving RISC new life, easier FPGA programming,  large address spaces reshaping the landscape, lots on efficient Java memory usage, and much more.
  • A GAE Trifecta: High Replication Datastore: 1 year, 100,000 apps, 0% downtime. John Wheeler loves AppEngine for the power and the programming model: all the parallel computing, task queues, and async datastore fetches. He says "it's been one of the funnest programming and learning experiences of my career." Another user is not so happy that the Cost of mapreduce was $6,500 to update a ListProperty on 14.1 million entities because "they now charge per write and our list property, which has about 18 values per entity, is indexed."
  • Strange Loop 2011 videos are online. They also have a strong set of videos to learn from.
  • Building Memory-efficient Java Applications: Practices and Challenges. Timely for a problem I'm facing, this is an epic expose on Java memory usage. It is easy for costs to pile up, just piecing together building blocks. Developers expected 2K and found 200K session state per user.
  • Hadoop for Archiving Email. Sunil Sitaula with a good description of how to use Solr/Lucene to index email on top of Hadoop.
  • Ayende @ Rahien weighs in on On Infinite Scalability: always assume an order of magnitude increase in the work the system has to do rather than plan for infinite scalability from the start. An order-of-magnitude change is relatively easy to attain without incurring too great a cost. You can't do it right the first time and time to market is a feature.
  • A Common Database Approach for OLTP and OLAP Using an In-Memory Column Database. A no compromise approach saying column stores are the best approach to implementing OLTP systems: the best way to use modern CPUs is to not even provide a primary key index, but use the full column scan instead, given that 2.5 million tuples can be scanned in 1ms; in-memory store improves update performance; minimizes locks and memory consumption; reduces code size by 30%.
  • Megatrend: Cheap RAM Reshaping All of Computing. Andrew Binstock says you can now buy a 64-core server with 512GB of RAM for less than $30K and that these massive amounts of RAM in servers are changing everything.
  • Schism: a Workload-Driven Approach to Database Replication and Partitioning. A novel workload-aware approach for database partitioning and replication designed to improve scalability of shared nothing distributed databases.
  • Virtual networks can run Cassandra up to 60% faster.
  • En Fuego: 
    • To Understand Refrigeration is to Understand the World: Mapping the world's man made cold spaces, rise of meat and bananas, decline of tuna and other cold things. 
    • Management Wisdom of Battlestar Galactica: one thing without which you have nothing, do the human thing, debate and fight for an idea, expect to pay for your sins, gamble big and take a risk, so say we all. 
    • A Brief History of Media: Media .1 - cave paintings; Media .5 - papyrus; Media .9 - actual books; Media 1.0 - printing press; Media 1.1 - political pamphlets of the US revolution; Media 1.5 - telegraph; Media 1.9 - newspaper/magazine; Media 2.0 - radio; Media 2.5 - TV; Media 2.9 - cable; Media 3.0 - digital; News Media 3.5 - Consumers -> Creators -> Collaborators; Media 3.6 - Collaboration + Real-Time Data; Media 4.0 - worry that copyright cartel will take control into a few hands. Should scare the crap out of you.
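
For the in-memory column database item above, here is a minimal Python sketch of the "no primary key index, just scan the column" idea. It is a toy under stated assumptions, not the paper's implementation: the ColumnTable class and its ids and amounts columns are hypothetical names invented for illustration, and pure Python will come nowhere near 2.5 million tuples per millisecond. It only shows the shape of the design: each column is one contiguous array, a key lookup is a full scan of the id column, and an aggregate touches only the column it needs.

```python
from array import array


class ColumnTable:
    """Toy in-memory column store: one contiguous array per column, no index.

    Hypothetical illustration only; names and layout are assumptions.
    """

    def __init__(self):
        self.ids = array("q")      # 64-bit signed integer column
        self.amounts = array("d")  # double-precision float column

    def insert(self, row_id, amount):
        # The write path is just an append per column: no index to maintain.
        self.ids.append(row_id)
        self.amounts.append(amount)

    def find(self, row_id):
        # Key lookup by scanning the whole id column instead of using an index.
        for pos, value in enumerate(self.ids):
            if value == row_id:
                return pos
        return None

    def total(self):
        # OLAP-style aggregate reads only the one column it needs.
        return sum(self.amounts)


table = ColumnTable()
for i in range(1000):
    table.insert(i, i * 0.5)

pos = table.find(742)
print(pos, table.amounts[pos], table.total())
```

The same table serves both the point lookup (OLTP-style) and the aggregate (OLAP-style), which is the trade the item describes: give up the index and rely on fast sequential scans over compact columns.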