Stuff The Internet Says On Scalability For April 15, 2011

Submitted for your reading pleasure...

Luxury is an ancient notion.  There was once a Chinese mandarin who had himself wakened three times every morning simply for the pleasure of being told it was not yet time to get up.  ~Argosy

  • We have a Quotable Quote machine for you today:
    • @kevinweil: Twitter monthly signups have increased more than 50% since December, and we're now doing well over 150 million Tweets per day.
    • @ChrisShain: Prediction: Black art of query optimization will become black art of #nosql data modeling, for same reasons. Minimize IOs, query time.
    • @ui_matters: Infrastructure as a Service = no hardware headaches. Platform as a Svc = no scalability headaches. SaaS = common dev platform #amchamtech
    • @plcstpierre: Thinking about high scalability stuff... I never thought database stuff can be interesting...
    • @webdz9r: mass scalability for dynamic web content. What took us 8 machines, now take us 1 web and 1 app.
    • @joelvarty: CDN is always an afterthought in these sessions. What a shame. CDN is THE methodology for scalability on a global scale. #MIX11
    • @toddrockoff: "Scalability": What exactly is it? How do you measure scalability? Is "scalability" a refuge for the architecturally desperate?
    • @hakmem: Since #NoSQL seems to have rebooted database history, I hope Codd reincarnates soon
    • @jatorre: So it turned our Google Maps also have scalability issues with custom tiles! styled tiles are 10X more expensive than the normal ones.
    • @Rae_7410: The issues of scalability: Can't invest in Pell Grants but can lock youths up for years. Educ. investment="not scalable". Say what?#aacc2011
    • @VanessaAlvarez1: What is unlimited scalability?
      • @nerdguru: Like computing integrals in calculus, you can approach it but never actually get there. 
      • @storagezombies: The distance across the known universe.
      • @nynorwegian: probably something God like, like the expanding universe :)
      • @dimonet: a bigger closet and a shoe sale?
  • Our scalability achievement of the week: StumbleUpon Hits 1 Billion Stumbles Per Month
  • Twitter Search API - Questions Regarding Scaling Out Options. Use the streaming API and implement the filtering logic on your side.  Understandably, Twitter doesn't want your complex query load.
  • He likes it, he likes it. I was really waiting for James Hamilton's take on the new open source hardware initiative by Facebook, and he delivered in Open Compute Mechanical System Design. The most interesting aspects of the Facebook mechanical design: 1) full building ducting with huge plenum areas, 2) no process-based cooling, 3) mist-based evaporative cooling, 4) large, efficient impellers with variable frequency drive, and 5) full wall, low-resistance filtration.
  • Andy Oram with a good Wrap-up of 2011 MySQL Conference. Themes:  Mix your relational database with less formal solutions and move to the cloud.
  • SQL Performance on Growing System Load. Careful execution plan inspection gives more confidence than superficial benchmarks.
  • Anatomy of Google Analytics Cookies by Dennis Paagman. Nothing black hatty, but a fascinating drill down on all those Google cookies. 
  • BOOM has gone Alpha. If you are looking for potential next generation distributed programming paradigms, BOOM should be on sonically yours.
  • Dan Singerman with a great tutorial on denial of denial of service attacks in How to block rate-limited traffic with Varnish.
  • Explore your advanced gitness with The Fringes of Git. These 40 minutes are primarily live coding with a few diagrams for reference. We show you how Git reaches further than any other version control system to provide capabilities for both the novice and the master craftsperson.
  • Save open data and save the world. Please Help Recognize the Heroes Behind Big Data Projects Like Data.gov.
  • The Surge conference has opened up videos of their 2010 conference as a lead-in to their 2011 session. A lot of good ones. Take a look.
  • The wild west never had technology like this: Message Queue Shootout! Mike Hadlow investigates: MSMQ, ActiveMQ, RabbitMQ, and ZeroMQ. As you can see, there’s ZeroMQ and the others. Its performance is staggering.
  • If you have small amounts of extremely important data, then Doozer is a new consistent, highly-available data store you might want to take a look at. Uses include: name service, database master elections, and configuration data shared between several machines. 
  • Ring-Paxos: A High-Throughput Atomic Broadcast Protocol. Prof. Fernando Pedone talks about an efficient atomic broadcast protocol, derived from Paxos, which can be used to implement state-machine replication. It can deliver nearly 1 Gbps of data to tens of servers in a local-area network.
  • The architecture of REDIS is explored by enjoyTheArchitecture. Looks like the site is trying to write useful quick hits on how various systems work and I think they succeeded.
  • Erlang vs Java memory architecture by Byron at Java Code Geeks. Two very different languages that work in two very different ways. Seeing their approaches compared is enlightening.  In summary, for every day performance I believe the private heap memory model to be a very powerful tool in Erlang's box. It cuts whole classes of locking mechanisms out of the run-time system and that means it will scale better than Java will for the same purpose. Java's hard limits on memory will save your bacon when your system is being flooded or DDoSed.
  • A great series from Deadcode lt teaching about GPUs: How the GPU works - part 1, Part 2, Part 3. In this, and the following posts, I want to analyze the behaviour of a modern GPU (let's say, directX 9 class), going from the broad architecture to the shader execution pipeline to some specific shader optimization guidelines.
  • Curt Monash is exploring a new space in Use cases for low-latency analytics. Analytics used to be large lumbering systems making sense of the past. We are seeing different niches pop up with different technologies to serve them.
  • Looking for a real-life example of map-reduce? Take a look at Josh Patterson of Cloudera's tutorial Simple Moving Average, Secondary Sort, and MapReduce.
  • Ricky Ho with an excellent survey: Scalable System Design techniques. Includes general principles and common techniques.
  • Packet Pushers talking about OpenFlow. Let's program the network layer like we can program everything else. Like it.
  • Scaling a PHP MySQL Web Application, Part 1 
    By Eli White. Here's Part 2. A very comprehensive and nicely written trip through the key issues. Talks about: PHP Tuning, load balancing, master-slave replication, slave lag, multiple slaves, database pooling, and sharding.
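
To make that last item a little more concrete, here is a minimal Python sketch of how sharding and master-slave replication might fit together when routing queries. It is not taken from Eli White's articles. The names Shard, shard_index, and route, the host strings, and the choice of an MD5-modulo hash are all hypothetical, picked only for illustration.

```python
import hashlib
from dataclasses import dataclass, field
from itertools import cycle
from typing import Iterator, List


@dataclass
class Shard:
    """One shard: a master for writes plus zero or more slaves for reads."""
    master: str
    slaves: List[str]
    _next_slave: Iterator[str] = field(init=False, repr=False)

    def __post_init__(self) -> None:
        # Round-robin over the slaves; fall back to the master if there are none.
        self._next_slave = cycle(self.slaves or [self.master])

    def host_for(self, is_write: bool) -> str:
        # Writes must go to the master; reads are spread across the slaves.
        return self.master if is_write else next(self._next_slave)


def shard_index(key: str, shard_count: int) -> int:
    """Map a key to a shard number with a stable hash (assumed scheme)."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % shard_count


def route(shards: List[Shard], key: str, is_write: bool) -> str:
    """Pick the database host that should serve this key."""
    return shards[shard_index(key, len(shards))].host_for(is_write)


if __name__ == "__main__":
    # Hypothetical host names, for illustration only.
    cluster = [
        Shard("db0-master", ["db0-slave-a", "db0-slave-b"]),
        Shard("db1-master", ["db1-slave-a"]),
    ]
    print(route(cluster, "user:42", is_write=True))
    print(route(cluster, "user:42", is_write=False))
```

Two caveats with a sketch like this: reads served by a slave can be stale because of slave lag, so read-your-own-writes paths may need to go to the master, and plain modulo hashing moves most keys when the shard count changes.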