Stuff The Internet Says On Scalability For May 27, 2011

Submitted for your scaling pleasure:

  • Good idea: Open The Index And Speed Up The Internet. SmugMug estimates 50% of their CPU is spent serving crawler robots. Having a common meta-data repository wouldn't prevent search engines from having their own special sauce. Then the problem becomes one of syncing data between repositories and processing change events. A generous soul could even offer a shared MapReduce service over the data. Now that would speed up the internet.
  • Scaling Achievements: YouTube Sees 3 Billion Views per Day; Twitter produces a sustained feed of 35 Mb per second; companies processing billions of API calls (Twitter, Netflix, Amazon, NPR, Google, Facebook, eBay, Bing); Astronomers Identify the Farthest Object Ever Observed, 13.14 Billion Light Years Away
  • Quotes that are Quotably Quotable:
    • eekygeeky: When cloud computing news is slow? Switch to "big data"-100% of the vaguery, none of the used-up, mushy marketing feel!
    • singhns: an API is like diamonds, a huge range of value based upon four c's - clarity, consistency, convenience and channel
    • dotmanish: "The sex appeal of company comes from the scalability" - Rajul at #opengurukul
  • Twitter improves the Ruby runtime to 2.7x the speed of standard Ruby 1.8, following in the tradition of Facebook making another interpreted language, PHP, fast enough.
  • Henry Robinson on Why did Google Megastore use Paxos instead of ZooKeeper's Atomic Broadcast (ZAB)? 
    • ZAB and Paxos have slightly different guarantees - in particular, ZAB requires a 'prefix-complete' network where receiving message i implies that you have already received messages 0 -> i-1. TCP connections have this property. Paxos is robust to ordering errors of the sort that UDP can give, which may have been a requirement for Megastore.
  • Really nice article on how Using Varnish So News Doesn’t Break Your Server. Jacob Harris explains how the NYTimes uses Varnish to soak up a spike of more than 300 requests per second with no ops stress.
  • If ActiveMQ is Not ready for prime time, what is? Home grown; JBossMQ; HornetQ; RabbitMQ. Second Life's Message Queue Evaluation Notes.
  • All the videos are available from TechCrunch Disrupt.
  • Transactional memory - nothing but trouble. Dmitry Dvoinikov tells us that STM won't make parallel programming magically simple. What should we do? 
    • I do believe message passing is a better way to go. The technique goes under many names, but the idea is the same - there are separate processes with separate data and they interact only in a well-defined manner.
  • On HTTP Load Testing. Don't repeat history, says Mark Nottingham, by making the same load testing mistakes everyone else makes. You can learn how to make your own mistakes if you: test the same time, every time; generate load on a different machine than the server under test; check if your network has sufficient capacity; remove artificial OS limits; really test capacity by testing progressively higher loads; run longer tests; run complex loads; get a full statistical panel, not just the average; publish all data; test under different tools.
  • Gnip CEO says everyone is building special tools to deal with real-time BigData feeds like Twitter. Another problem is that the network infrastructure can't handle the volume. We need more tools Captain!
  • Is your network really slow? Network operators get calls from application guys saying the network is slow, but the problem is really dropped packets due to congestion. It's not usually latency, it's usually packet loss. Packet loss causes TCP to back off and retransmit, causing applications to get slow. Packet loss can be a flakey transceiver, but usually the problem is network congestion. Somewhere on the network there's fan-in, a bottleneck develops, the queue builds up to a certain point, and when the queue overflows it drops packets. The first sign of this is application slowness. Queues get deeper and deeper because the network is getting more and more use over time. Get your switch to tell you about queue stats so you can take proactive action. Find out if your network is going to drop packets before it happens. If you are polling stats you can miss a queue threshold crossing. You need alerts to be correct. Polling can't see a microburst. It doesn't take a lot of loss to cause problems. More on PacketPushers: Show 45 – Arista – EOS Network Software Architecture – Webinar
  
  • The Architecture of Open Source Applications. Amy Brown and Greg Wilson explain the architecture behind twenty-five open source applications. Examples include: Berkeley DB, Eclipse, Hadoop, LLVM, NoSQL, SnowFlock, and Riak. Very cool.
  • Theoretical Node.js Real time Performance. Arnout Kazemier finds that, filled up with stabilized connections, the server could reach a total of 1.7 GB / 4.025 KB = 422,000 connections.
  • Public Static Void. Rob Pike explains what made Go go. Lambda the Ultimate comments in the tone of tut tut.
  • Complex technologies improve more slowly. Researchers found that the greater a technology's complexity, the more slowly it changes and improves over time. Is it time to jump ship for a technology on a different growth curve?
  • Rx: Curing Your Asynchronous Programming Blues. Bart De Smet explains the design philosophy behind the reactive framework Rx, the combinators and operators defined by Rx, and the work in progress to integrate it with async. 
  • Google does a deal with Poseidon, uses sea water to cool their new data center. Video at Data Center Knowledge
  • We need modules to keep separate things separate. Erlang inventor disagrees in Why do we need modules at all? You'll start prefixing all names with a string so they don't clash. Then you'll notice all your related code uses these strings, what a waste, so you'll create a "module" in order to be able to drop the string in a defined context. Thus namespaces/modules are reinvented. Function drives form.
  • Making our MongoDB Code Run Faster. Karl Seguin figures out how to improve performance 5x: reduce index memory; rename fields; remove indexes; move from >= to ==.
  • Software Load Balancing using Software Defined Networking. James Hamilton with a good wrap up of Software Defined Networking: separating the networking control plane from the data plane. The goal is a fast and dumb routing engine with the control plane factored out and supporting an open programming platform.
  • Good discussion on StackOverflow and Reddit of Why do relational DBs work for you in practice? In theory, one quickly gets into composing a query that is effectively not answerable (the worst-case complexity of query-answering grows exponentially with the size of the query). What is the practical "maximum" of your RDBMS backend?
  • Java got you down? Thinking Scala will save your sanity? Maybe it will, but according to Michael Fogus in How We Thought We Could Use Scala And Clojure And How We Actually Did, it won't be in the way you think. All the really cool stuff you probably won't use, but the stuff you do use will make all the difference.
  • Aaron Batalion tells how he finally became rich and famous using Ruby on Rails to make LivingSocial. Actually quite funny and helpful: if you have two ideas do both of them; build in flexible infrastructure on which you can build apps quickly; execution matters and ideas don't; culture matters - build it; a little train can do more; know your metrics: acquisition, activation, retention, referral, revenue; be nervous, be scared, try harder; being unhappy is the secret to success.
  • TCP accept performance limits the number of connections a server can accept. Then Why is TCP accept() performance so bad under Xen? This is due to small packet performance in Xen.
  • Dan Rayburn with a CDN Summit Wrap Up: Transparent Caching, Federated CDN, Market Pricing & Sizing
  • Looking for more ways to apply functions to data? Then take a look at Jetty Continuations: Push Your Java Server Beyond Its Scalability Limits
  • GoogleTalk: Near-Optimal Parallel Join Processing in MapReduce. It's not just the size of the data, but the high dimensionality of data that's a problem. Science today is about knowing the domain and being an expert in data mining. The function-join pattern is a way of finding interesting patterns in the data.
  • Scoble interviews SeaMicro, showing us around their test datacenters and their innovative technology.
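
The Node.js connection estimate quoted in the list above (1.7 GB of memory divided by 4.025 KB per connection) is easy to sanity-check. The sketch below is only back-of-the-envelope arithmetic, not anything from the linked article: the function name and the `kb_per_gb` parameter are made up for illustration, and it assumes decimal units (1 GB = 1,000,000 KB), since that is the reading that reproduces the quoted figure.

```python
def max_connections(memory_gb: float, per_connection_kb: float,
                    kb_per_gb: int = 1_000_000) -> int:
    """Upper bound on connections if memory were the only limit.

    Assumption: decimal units by default (1 GB = 1,000,000 KB).
    Pass kb_per_gb=1024 * 1024 to use binary units instead.
    """
    total_kb = memory_gb * kb_per_gb
    return int(total_kb // per_connection_kb)


# Figures taken from the Node.js item above.
decimal_estimate = max_connections(1.7, 4.025)
binary_estimate = max_connections(1.7, 4.025, kb_per_gb=1024 * 1024)

print(f"decimal units: {decimal_estimate:,} connections")
print(f"binary units:  {binary_estimate:,} connections")
```

With decimal units the result lands at about 422,000 connections, matching the quoted number. Binary units give a somewhat higher figure. Either way it is a theoretical ceiling that ignores CPU, file descriptor limits, and kernel socket buffers.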