Stuff The Internet Says On Scalability For October 31st, 2014

Hey, it's HighScalability time:


A CT scanner without its clothes on. Sexy.

  • 255Tbps: all of the internet’s traffic on a single fiber; 864 million: daily Facebook users
  • Quotable Quotes:
    • @chr1sa: "No dominant platform-level software has emerged in the last 10 years in closed-source, proprietary form"
    • @joegravett: Homophobes boycotting Apple because of Tim Cook's brave announcement are going to lose it when they hear about Turing.
    • @CloudOfCaroline: #RICON MySQL revolutionized Scale-out. Why? Because it couldn't Scale-up. Turned a flaw into a benefit - @martenmickos
    • chris dixon: We asked for flying cars and all we got was the entire planet communicating instantly via pocket supercomputers
    • @nitsanw: "In the majority of cases, performance will be programmer bound" - Barker's Law
    • @postwait: @coda @antirez the only thing here worth repeating: we should instead be working with entire distributions (instead of mean or q(0.99))
    • Steve Johnson: inventions didn't come about in a flash of light — the veritable Eureka! moment — but were rather the result of years' worth of innovations happening across vast networks of creative minds.

  • On how Google is its own VC. cromwellian: The ads division is mostly firewalled off from the daily concerns of people developing products at Google. They supply cash to the treasury, people think up cool ideas and try to implement them. It works just like startups, where you don't always know what your business model is going to be. Gmail started as a 20% project, not as a grand plan to create an ad channel. Lots of projects and products at Google have no business model, no revenue model, the company does throw money at projects and "figure it out later" how it'll make money. People like their apps more than the web. Mobile ads are making a lot of money.  

  • Hey mobile, what's for dinner? "The world," says chef Benedict Evans, who has prepared for your pleasure a fine gourmet tasting menu: Presentation: mobile is eating the world. Smart phones are now as powerful as Thor and Hercules combined. Soon everyone will have a smart phone. And when tech is fully adopted, it disappears. 

  • How much bigger is Amazon’s cloud vs. Microsoft and Google?: Amazon’s cloud revenue is estimated at more than $4.7 billion this year. TBR pegs Microsoft’s public cloud IaaS revenue at $156 million and Google’s at $66 million. If those estimates are correct then Amazon’s cloud revenue is 30 times bigger than Microsoft’s.

  • Great discussion on the Accidental Tech Podcast (at about 25 minutes in) on how the time of open APIs has ended. People who made Twitter clients weren't competing with Twitter; they were helping Twitter become who they are today. For Apple, developers add value to their hardware and since Apple makes money off the hardware this is good for Apple, because without apps Apple hardware is way less valuable. Even with their new developer focus, Twitter's and developers' interests are still not aligned, as Twitter is still worried about clients competing with them. Twitter doesn't want to become an infrastructure company because there's no money in it. In the not so distant past services were expected to have an open API; in essence, services were acting as free infrastructure, just hoping they would become popular enough that those dependent on the service could be monetized. New services these days generally don't have full open APIs because it's hard to justify as a business case.

  • We need a Big O notation for costs. GCE is going SSD. You can buy 680,000 4K reads/sec in a single VM for $0.0003/GB/hour. So high performance and low latency. Is it a good value? An interesting development is that I have no idea. I don't ever think of buying SSD reads on a per-hour basis so I have no feeling for how it fits. If gas is $6 a gallon I know that's a bad deal. But this, who knows?

  • Not fluff, useful stuff. How to Build Products Users Love: What a lot of people keep forgetting is that there's almost no difference between a 1% increase in conversion rate and a 1% decrease in churn; they do the exact same thing to your growth. However, the latter is actually much easier to do, and much cheaper to do. And a lot of times we neglect this until way far along, and we usually have our B team work on these projects and services.

  • Is Docker useful only for little projects? Nope. What Iron.io Learned Launching Over 300 Million Containers. The good: Easy to Update and Maintain Images, Resource Allocation and Analysis, Easy Integration With Dockerfiles, A Growing Community, Docker + CoreOS. The challenges: Limited Backwards Compatibility, Limited Tools and Libraries, Long Deletion Times. Excellent real-world experience report.

  • What really matters? The 99th percentile matters: One reason percentiles are a useful statistic is because they give us insight in how outliers impact our design choices. Sometimes our architectural design choices are not sympathetic to latency spikes or other environmental degradations (or sometimes even performance spikes with unpredictable positive gains!). 
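
To make the point about outliers concrete, here is a minimal sketch comparing the mean, median, and 99th percentile of a set of latencies. The sample data and the nearest-rank `percentile` helper are assumptions made up for illustration; they are not taken from the linked article.

```python
import math
import statistics

def percentile(samples, p):
    """Nearest-rank percentile: smallest value with at least p% of samples at or below it."""
    ordered = sorted(samples)
    rank = max(1, math.ceil(p * len(ordered) / 100))
    return ordered[rank - 1]

# Hypothetical latencies in milliseconds: 97 fast requests and 3 slow outliers.
latencies_ms = [10] * 97 + [500, 800, 1200]

mean_ms = statistics.mean(latencies_ms)   # pulled upward by the outliers
p50_ms = percentile(latencies_ms, 50)     # the typical request
p99_ms = percentile(latencies_ms, 99)     # what the unlucky 1% experience

print(f"mean={mean_ms:.1f}ms p50={p50_ms}ms p99={p99_ms}ms")
```

The mean lands between the typical case and the tail and describes neither, which is why the tail percentile is the more useful number when sizing for latency spikes.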

  • Often unseen, algorithm magic pervades the world. A magician explains. Reed–Solomon codes for coders: In this essay, I will attempt to introduce the principles of Reed–Solomon codes from the point of view of a programmer rather than a mathematician. I will provide real-world examples taken from the popular QR code barcode system as well as working code samples. I chose to use Python for the samples.

  • Trust but verify. Even your hardware. Performance Tuning ~ Writing an Essay: Losing faith in hardware profiling being remotely representative of reality makes me a sad panda; I now have to double check perf profiles when hunting for misleading metrics. At least I can tell myself that knowing about this phenomenon helps us make better informed – if less definite – decisions and ferret out more easy wins.

  • A fun and effective way of explaining bufferbloat. A damp discussion of network queuing: Stephen put together a set of demonstrations where a network queue was represented by an inverted plastic bottle. The bottle could hold a fair amount of water (packets), but there are limits on how quickly the water can drain out. So if water arrives more quickly than the bottle can drain, the bottle begins to fill. If the bottle is quite full, a drop of water added at the top will take a long time to reach the opening and exit the bottle — especially if the bottle is large. Bufferbloat, thus, was represented as bottlebloat.
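
The bottle analogy reduces to simple arithmetic: a packet that joins a full queue waits for everything ahead of it to drain. A rough sketch, where the buffer sizes and the 10 Mbit/s link speed are hypothetical numbers chosen for illustration:

```python
def drain_delay_seconds(queued_bytes, link_bits_per_second):
    """Time for a newly arrived packet to reach the head of a FIFO queue."""
    return queued_bytes * 8 / link_bits_per_second

LINK_BPS = 10_000_000  # assumed 10 Mbit/s bottleneck link

# A small bottle versus a large one, both completely full.
small_buffer_delay = drain_delay_seconds(64 * 1024, LINK_BPS)
large_buffer_delay = drain_delay_seconds(4 * 1024 * 1024, LINK_BPS)

print(f"64 KiB buffer: {small_buffer_delay * 1000:.1f} ms of queueing delay")
print(f"4 MiB buffer:  {large_buffer_delay * 1000:.1f} ms of queueing delay")
```

The larger buffer drops fewer packets, but under these assumptions every packet behind a full queue pays seconds of latency instead of tens of milliseconds.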

  • The series continues. Building Carousel, Part III: Drawing Images on Screen: We ran with the last approach and built an image renderer that contains a queue of 256px by 256px rendering jobs. After experimentation we settled on caching the resulting bitmaps, with a configurable cache size, which allows us to hold on to the most recently decoded thumbnails.
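
A bounded cache of recently decoded bitmaps can be sketched in a few lines. The excerpt does not say which eviction policy Carousel uses, so the least-recently-used policy below is an assumption, and `BitmapCache` and its methods are hypothetical names.

```python
from collections import OrderedDict

class BitmapCache:
    """Bounded cache that evicts the least recently used entry (assumed policy)."""

    def __init__(self, max_entries):
        if max_entries < 1:
            raise ValueError("max_entries must be at least 1")
        self.max_entries = max_entries
        self._entries = OrderedDict()  # oldest entry first, newest last

    def get(self, key):
        if key not in self._entries:
            return None
        self._entries.move_to_end(key)  # mark as most recently used
        return self._entries[key]

    def put(self, key, bitmap):
        self._entries[key] = bitmap
        self._entries.move_to_end(key)
        if len(self._entries) > self.max_entries:
            self._entries.popitem(last=False)  # drop the least recently used

cache = BitmapCache(max_entries=2)
cache.put("thumb-a", b"A")
cache.put("thumb-b", b"B")
cache.get("thumb-a")        # touch "thumb-a" so "thumb-b" becomes the oldest
cache.put("thumb-c", b"C")  # exceeds the limit, so "thumb-b" is evicted
```

A production cache would more likely bound total bytes than entry count, since bitmaps vary in size.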

  • Are you as FIT as Netflix? Compare yourself against FIT: Failure Injection Testing.

  • What is pipelining, you ask? Ross Bencina answers: breaking a task up into multiple stages so that the stages can be overlapped and executed in parallel.
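
As a small illustration of that definition, the sketch below runs each stage in its own thread and connects the stages with queues, so stage two can work on item N while stage one is already handling item N+1. `run_pipeline` and the two toy stage functions are hypothetical, not from the linked post.

```python
import queue
import threading

_DONE = object()  # sentinel that tells each stage to shut down

def stage(func, inbox, outbox):
    """Apply func to every item from inbox and pass the result downstream."""
    while True:
        item = inbox.get()
        if item is _DONE:
            outbox.put(_DONE)
            return
        outbox.put(func(item))

def run_pipeline(items, funcs):
    # One queue in front of each stage, plus one for the final results.
    queues = [queue.Queue() for _ in range(len(funcs) + 1)]
    threads = [
        threading.Thread(target=stage, args=(func, queues[i], queues[i + 1]))
        for i, func in enumerate(funcs)
    ]
    for thread in threads:
        thread.start()
    for item in items:
        queues[0].put(item)
    queues[0].put(_DONE)

    results = []
    while True:
        result = queues[-1].get()
        if result is _DONE:
            break
        results.append(result)
    for thread in threads:
        thread.join()
    return results

# Two toy stages: add one, then double.
pipeline_output = run_pipeline(range(5), [lambda x: x + 1, lambda x: x * 2])
print(pipeline_output)
```

Because each stage is a single thread reading from a FIFO queue, output order matches input order. In CPython the overlap mainly helps stages that wait on I/O, since the interpreter lock limits parallel execution of pure Python code.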

  • Seems quite sensible and shows speedups of 2x to 7x. Aggressive Data Skipping for Querying Big Data: The idea is very simple: big data files are partitioned into fairly small blocks of say, 10,000 rows. For each such block we store some metadata, e.g., the min and max of each column. Before scanning each block, a query can first check the metadata and then decide if the block possibly contains records that are relevant to the query. If the metadata indicates that no such records are contained in the block, then the block does not need to be read, i.e., it can be skipped altogether.
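
The min/max check described above can be sketched as follows. This is a simplified illustration over a single integer column, not the paper's implementation; `build_blocks`, `range_query`, and the tiny block size are assumptions.

```python
def build_blocks(rows, block_size):
    """Partition rows into blocks and record each block's min and max."""
    blocks = []
    for start in range(0, len(rows), block_size):
        chunk = rows[start:start + block_size]
        blocks.append({"min": min(chunk), "max": max(chunk), "rows": chunk})
    return blocks

def range_query(blocks, low, high):
    """Return values in [low, high] and the number of blocks actually scanned."""
    matches = []
    scanned = 0
    for block in blocks:
        # The metadata proves no row in this block can match, so skip it.
        if block["max"] < low or block["min"] > high:
            continue
        scanned += 1
        matches.extend(v for v in block["rows"] if low <= v <= high)
    return matches, scanned

values = list(range(100))  # hypothetical column, already sorted
blocks = build_blocks(values, block_size=10)
matches, scanned = range_query(blocks, 42, 57)
print(f"scanned {scanned} of {len(blocks)} blocks, found {len(matches)} rows")
```

Skipping pays off here because the data is sorted, so each block covers a narrow value range. If values are scattered randomly across blocks, most min/max ranges overlap most queries and little can be skipped.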

  • The Network is Reliable (not). Networks fail. We got that. But it's more interesting in a mobile-network-centric world. We aren't considering mobile clients as part of the system. We assume there's sync magic between the client and the server and then the server does all the "real" work. That's not reality. We need a system that works with clients end-to-end over all networks.

  • mrb: Should I read papers? Yes. Was there any doubt? Written in a very entertaining format.

  • Here's something you don't hear talked about much anymore. Little-endian vs. big-endian. Fabian Giesen goes deep dissecting math vs. indexing/sorting/searching, Byte order vs. bit order, and Memory access. To sum up: The cost of having different endianness between machines is not [trivial]. At this point, the dominant CPU and GPU architectures all default to LE, with even POWER recently getting serious about making LE just work and using the opportunity to clean up some ABI issues in the process. I don’t particularly care which endianness I’m on, but given that we seem to be converging on “LE everywhere”, the last thing I want is a new architecture that goes BE and prolongs the confusion by another few decades.
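
For anyone who has not touched byte order in a while, this short sketch uses Python's standard `struct` module to show the same 32-bit value laid out both ways, and what happens when bytes are read with the wrong assumption. The example value is arbitrary.

```python
import struct
import sys

value = 0x11223344

little = struct.pack("<I", value)  # least significant byte first
big = struct.pack(">I", value)     # most significant byte first

print("little-endian bytes:", little.hex())
print("big-endian bytes:   ", big.hex())

# Reading little-endian bytes as if they were big-endian silently
# yields a different number rather than an error.
misread = struct.unpack(">I", little)[0]
print("misread value:", hex(misread))

# Byte order of the machine running this script.
print("host byte order:", sys.byteorder)
```

That silent misread is the practical cost of mixed endianness: every boundary between machines or file formats needs an explicit, agreed byte order.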

  • ExaLink Fusion – Ultra low latency switch (exablaze.com). Excellent discussion on Hacker News of a wide range of related issues. Speed isn't the only thing that matters.

  • Mysterious Statistical Law May Finally Have an Explanation: “The fact that it pops up everywhere is related to the universal character of phase transitions,” Schehr said. “This phase transition is universal in the sense that it does not depend too much on the microscopic details of your system.”

  • Drill and drill some more. The military is a big user of scenario training. If you want to know how you will handle a situation then set it up, run it, evaluate, repeat and improve. The same strategy works great for IT as is well explained in Game Day Exercises at Stripe: Learning from `kill -9`.

  • Does bare metal really need a defense? In Defense of Bare Metal: Mobile advertising platform provider Taptica has shifted almost all of its workload from the cloud to bare metal servers – currently it’s using around 100 and adding around four a month — and says it’s now getting better performance and a better price.

  • Synthetic biology on ordinary paper: a new operating system: “What we have been able to do is to create an in vitro, sterile, abiotic operating system upon which we can rationally design synthetic, biological mechanisms to carry out specific functions,” said Collins, senior author of the first study, “Paper-Based Synthetic Gene Networks.”

  • fastos/fastsocket: a highly scalable socket and its underlying networking implementation of Linux kernel. With the straight linear scalability, Fastsocket can provide extremely good performance in multicore machines. In addition, it is very easy to use and maintain. As a result, it has been deployed in the production environment of SINA.

  • Characterizing Load Imbalance in Real-World Networked Caches: Potential causes of load imbalance include: hashing schemes, skewed access popularity, etc. Using real workload of Facebook’s TAO cache system, this paper tries to answer the following questions:

  • Theia: (Simple and Cheap) Networking for Ultra-Dense Data Centers: This talk focuses on the networking problem that arises from packing a huge number of CPUs into a rack. Theia suggests we rethink the ToR architecture used in many data centers, which doesn't scale to connect thousands of CPUs.

  • Crowdsourcing Access Network Spectrum Allocation Using Smartphones: The main idea proposed in this paper was to use a smartphone “within proximity” of the primary device (laptop or tablet) to collect measurements (channel utilization and WiFi scan results) without disrupting the primary device. The key participants in the PocketSniffer system are: the phone, the laptop, the PocketSniffer AP, the PocketSniffer server. Challenges that need to be addressed include the following

  • Murat says...Clock-SI: Snapshot Isolation for Partitioned Data Stores Using Loosely Synchronized Clocks: Clock-SI, instead, proposes a way to use loosely synchronized clocks to assign snapshot and commit timestamps to transactions. Compared to conventional SI, Clock-SI does not have a single point of failure and a potential performance bottleneck. It saves one round-trip message for a read-only transaction (to obtain the snapshot timestamp), and two round-trip messages for an update transaction (to obtain the snapshot timestamp and the commit timestamp). A transaction's snapshot timestamp is the value of the local clock at the partition where it starts. Similarly, the commit timestamp of a local update transaction is obtained by reading the local clock.