
Stuff The Internet Says On Scalability For May 17, 2013

High Scalability

17 May 2013 — 8 min read

Hey, it's HighScalability time:

(Earth sized solar flare, some more flair)

Google I/O to world: Just try to keep up with us. You can't. But go ahead and try. Nah na na na nah...

17 billion: Google Cloud Messaging messages per day with 60ms latency; 1B page views: 500px; 121 billion: edge graph using Titan; 4 billion hours: hours watched on Netflix per quarter; 4.5 trillion: BigTable transactions per month

Quotable Quotes:
- to3m: As with any time you make plans for the future, sometimes you get it wrong. Ars longa vita brevis, and all that.
- Callaghan’s law: a given row can’t be modified more than once per RTT
- Josh Haberman: I had an epiphany one day when I realized that the kernel is nothing but a library with an expensive calling convention.
- fread2281: Insane speed calls for insane measures.
- Luke Gorrie: hardware really wants to run fast and you only need to avoid getting in the way -- not too hard if you write the whole stack to match your application, but very hard if you depend on abstractions and misunderstand what's going on.
- Francis Stephens: This exposes an important, and to me non-obvious, property of concurrency. That it's not the locking that's really hard, it's how to be sure that every piece of related data is included in the lock (or STM).
- @jamesurquhart: "Complexity is a characteristic of the system, not of the parts in it." -Dekker
- Colin Scott: out of all the datacenter link types, the average downtime was 0.3 days. This translates to roughly three and a half 9’s of reliability, an order of magnitude greater than WAN links.
- @adocortes: GPU vs CPU 40x faster for image processing in clusters
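Callaghan's law in the quotes above comes down to simple arithmetic: if every change to a row has to wait one round trip, the ceiling on serialized updates to that row is 1/RTT. A minimal sketch; the function name and the round-trip times are illustrative assumptions, not measurements from any of the linked posts.

```python
def max_row_updates_per_second(rtt_seconds: float) -> float:
    """Upper bound on serialized updates to one row: one update per round trip."""
    if rtt_seconds <= 0:
        raise ValueError("rtt_seconds must be positive")
    return 1.0 / rtt_seconds


# Assumed, illustrative round-trip times (not taken from the source).
for label, rtt in [("same rack", 0.0005), ("cross region", 0.05)]:
    ceiling = max_row_updates_per_second(rtt)
    print(f"{label}: at most {ceiling:,.0f} updates/s to a single row")
```

The bound applies per row, so spreading a hot counter across several rows raises the total ceiling roughly in proportion.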

Really fast growth really does happen says someone somewhere: Dots game from Betaworks hits 100 million game plays in first 2 weeks.

If you love something you should set it free or lose everything. Fred Wilson observes: This is a classic case of the innovator's dilemma. RIM felt that letting BBM out in the open would make it easier for Blackberry users to leave. So they kept it proprietary. For way too long. Now they no longer have a dominant smartphone franchise or a dominant mobile messenger franchise.

When Big Data ecosystems start merging it's not the end of the world, but building a different world: Amex to tap big data (TripAdvisor) to expose fake reviews.

Google Compute Engine is now openish to all. It supports short lived workloads with sub-hour billing, which is a win over Amazon's Costco style pricing scheme. GCE now brings SDN to the masses. Did not see that coming. Google Cloud Datastore is now open for business. Pricing seems to be on par with DynamoDB. StackOverflow really should answer questions like this. Lee Schlesinger with a good Introduction to the Google Cloud Platform from an Insider. Touches on the themes of reliability and speed and simplicity.

Google has bought a quantum computer and will soon need a spin or two of Quantum mechanics. I smell the Singularity at work.

It's all about the triangles. A Little Graph Theory for the Busy Developer. Jim Webber with an entertaining presentation on stuff you probably didn't know about graphs. It goes beyond the typical discrete math idea of graphs and covers how you use graphs to do things like partitioning graphs by finding strong triangle relationships. It turns out this is nearly optimal. It's not optimal because you need more data about the relationships to make better decisions. Also like the idea of forming categories via patterns in the graph in real-time. Also, Networks, Crowds, and Markets.
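To make the triangle idea concrete, here is a small sketch that finds triangles in an undirected graph and counts how many triangles each edge takes part in. Treating that count as a proxy for the strength of a relationship is an assumption made for illustration; it is not presented as the method from the talk, and the names and sample graph are made up.

```python
from itertools import combinations


def build_adjacency(edges):
    """Build an undirected adjacency map: node -> set of neighbours."""
    adj = {}
    for a, b in edges:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    return adj


def triangles(adj):
    """Return every triangle as a sorted 3-tuple of node names."""
    found = set()
    for node, neighbours in adj.items():
        # Two neighbours of the same node that are also linked close a triangle.
        for u, v in combinations(sorted(neighbours), 2):
            if v in adj[u]:
                found.add(tuple(sorted((node, u, v))))
    return found


def edge_support(adj):
    """Map each edge (a, b) with a < b to the number of triangles containing it."""
    return {
        (a, b): len(adj[a] & adj[b])
        for a in adj
        for b in adj[a]
        if a < b
    }


# Hypothetical sample graph.
edges = [("ann", "bob"), ("bob", "cat"), ("ann", "cat"), ("cat", "dan")]
adj = build_adjacency(edges)
print(triangles(adj))
print(edge_support(adj))
```

Edges with zero support, like the one to "dan" here, are natural candidates for cut points when partitioning.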

Data Center Servers Suck: The utilization rates of about 1,000 Mozilla servers. Here’s what he found: the average CPU utilization rate was 6 percent; memory utilization was 80 percent; network I/O utilization was 42 percent.

MySQL Cluster 7.3 Improvements - Connection Thread Scalability: We have split the transporter mutex and replaced it with mutexes that protect sending to a specific data node, mutexes that protect receiving from a certain data node, mutexes that protect memory buffers and mutexes that protect execution on behalf of a certain NDB API connection. This means a significant improvement of throughput per API node. If we run a benchmark with just one data node using the flexAsynch benchmark that handles around 300-400k transactions per second per API node, this improvement increases throughput by around 50%.
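The lock splitting described above can be sketched in a few lines. This is a hypothetical illustration of the general technique (one send lock per data node instead of one global mutex); it is not NDB code, the class and method names are invented, and because of Python's GIL it shows only the structure, not the throughput gain.

```python
import threading


class SplitLockTransporter:
    """Hypothetical sketch: a send lock per data node rather than one global mutex."""

    def __init__(self, node_ids):
        self._send_locks = {n: threading.Lock() for n in node_ids}
        self._sent = {n: [] for n in node_ids}

    def send(self, node_id, message):
        # Only senders that target the same node contend with each other.
        with self._send_locks[node_id]:
            self._sent[node_id].append(message)

    def sent_count(self, node_id):
        with self._send_locks[node_id]:
            return len(self._sent[node_id])


def run_demo(threads_per_node=4, messages_per_thread=1000):
    """Start several sender threads per node and return the transporter."""
    transporter = SplitLockTransporter(node_ids=[1, 2])

    def worker(node_id):
        for i in range(messages_per_thread):
            transporter.send(node_id, i)

    workers = [
        threading.Thread(target=worker, args=(node_id,))
        for node_id in (1, 2)
        for _ in range(threads_per_node)
    ]
    for w in workers:
        w.start()
    for w in workers:
        w.join()
    return transporter


demo = run_demo()
print(demo.sent_count(1), demo.sent_count(2))
```

With a single global lock, a thread sending to node 1 would block a thread sending to node 2 even though they touch unrelated state; the per-node locks remove that false contention.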

Mark Thompson brings up another performance hit in highly concurrent servers: A big issue for me is when he defined C10M he did not mention the TIME_WAIT issue with closing connections. Creating and destroying 1 million connections per second is a major issue. A protocol like HTTP is very broken in that the server closes the socket and therefore has to retain the TCB until the specified timeout occurs to ensure no older packet is delivered to a new socket connection. Also, Avoiding the TCP TIME_WAIT state at Busy Servers.
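A back-of-the-envelope estimate shows why this matters. In steady state, the number of sockets sitting in TIME_WAIT on the side that closes first is the close rate multiplied by the TIME_WAIT duration. The 60 second figure below is an assumption (it is the usual value on Linux; other stacks differ), and the function name is made up for the example.

```python
def time_wait_sockets(closes_per_second: float, time_wait_seconds: float) -> float:
    """Steady-state count of sockets in TIME_WAIT on the side that closes first."""
    return closes_per_second * time_wait_seconds


# 1 million connections per second comes from the quote; 60 s is an assumption.
pending = time_wait_sockets(closes_per_second=1_000_000, time_wait_seconds=60)
print(f"{pending:,.0f} sockets held in TIME_WAIT")
```

Under those assumptions the server would be tracking tens of millions of dead connections at any moment, which is the retention cost the quote is pointing at.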

Virident vCache vs. FlashCache: Part 2: When the working set outstrips the available buffer pool memory but still fits into the cache device, vCache shines. Compared to a deployment with no SSD cache whatsoever, FlashCache still does quite well, massively outperforming the HDD-only setup, but it doesn’t even really come close to the numbers obtained with vCache

Luke Gorrie gives details on Snabb's architecture, which is an open source virtualized Ethernet networking stack: Linux is developing pretty nice opt-out mechanisms though, despite Linus's best intentions. We take large blocks of RAM (HugeTLB) and whole PCI device access (mmap via sysfs) from Linux and then do everything else ourselves in userspace. The kernel is more like a BIOS in that sense: it gets you up and running and takes care of the details you really don't care about. Design document.

Amdahl's law in reverse: the wimpy core advantage. Yossi Kreinin with a great exploration of the great brawny vs wimpy core debate. Memory latency brings parity. Summary: 1) Faster processors are exactly as slow as slower processors when they're stalled; 2) Many slow processors are actually faster than few fast ones when stalled; 3) All this on top of area & power savings of many wimpy cores compared to few brawny ones.
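A toy model makes the summary easier to see. Assume each work item needs some compute, which scales with core speed, plus a fixed stall (for example, waiting on memory), which does not. The numbers and the function name are illustrative assumptions, not figures from Kreinin's post, and the model ignores contention for memory bandwidth and assumes the work items are independent.

```python
def throughput(cores: int, speed: float, compute_work: float, stall_time: float) -> float:
    """Items per unit time when every item costs compute_work / speed plus a fixed stall."""
    time_per_item = compute_work / speed + stall_time
    return cores / time_per_item


# Same peak compute: one brawny core at 4x speed vs four wimpy cores at 1x.
for stall in (0.0, 1.0):
    brawny = throughput(cores=1, speed=4, compute_work=1, stall_time=stall)
    wimpy = throughput(cores=4, speed=1, compute_work=1, stall_time=stall)
    print(f"stall={stall}: brawny={brawny:.2f} wimpy={wimpy:.2f}")
```

With no stalls the two configurations tie; once stall time dominates, the four slow cores overlap their waiting and pull ahead.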

To be sustainable you must make money. Here's an excellent list of possible money making strategies: The Money Making Techniques - Every Start-up should know: people want to look professional; people want to belong to a group, something bigger; to experience and discover new things; people hate to lose things; people love to complete a set; able to control privacy; to speed things up; let people be first/focus on exclusivity; give users the right to tweak/control; let user personalize their service; people will always pay for convenience; add enough time pressure; people want to limit negative feelings; Some like to brag or show off; people like to get insights; let people communicate/connect; people want to get notified.

Some options on how to store your sensor data: RDBMS, NoSQL, round-robin databases like RRDtool, and time series databases.
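Of those options, the round-robin database is the easiest to sketch: a fixed number of slots where each new reading overwrites the oldest, so storage never grows. The class below is a hypothetical in-memory illustration of that idea only; it is not RRDtool's API or file format and it does no consolidation of older data.

```python
class RoundRobinSeries:
    """Fixed-size store for sensor readings; the newest value overwrites the oldest."""

    def __init__(self, slots: int):
        if slots <= 0:
            raise ValueError("slots must be positive")
        self._values = [None] * slots
        self._next = 0   # index of the slot to write next
        self._count = 0  # number of slots filled so far

    def record(self, value):
        self._values[self._next] = value
        self._next = (self._next + 1) % len(self._values)
        self._count = min(self._count + 1, len(self._values))

    def readings(self):
        """Return stored readings ordered from oldest to newest."""
        if self._count < len(self._values):
            return self._values[: self._count]
        return self._values[self._next:] + self._values[: self._next]


series = RoundRobinSeries(slots=3)
for temperature in [20.5, 20.7, 21.0, 21.4, 21.9]:
    series.record(temperature)
print(series.readings())
```

The trade-off is bounded storage in exchange for losing old detail, which is why this style suits monitoring data more than audit data.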

Kristian Nielsen with interesting Thoughts on Global Transaction ID, parallel slave, and multi-source replication.

Advertising is no doubt a vacuous domain, but it's also inherently challenging and therefore interesting to programmers. To show what's going on behind an ad display, Behind the Banner has come up with an amazing dramatization: The entire ad placement network is one of the most complex computational systems on the planet. Behind The Banner is an attempt to understand the underlying interactions that define this ecosystem, and how they impact our daily use of the web.

Networking has been the bedrock against which open could not pass. No more. Open Compute Project moves into networking: The Open Compute Project (OCP) has kicked off a new effort to build networking hardware for use in data centres. The group said that its new project will be aimed at designing a networking switch that is operating system agnostic and open for all.

There's more than one way to quickly move packets around... netmap: A Novel Framework for Fast Packet I/O: we identified and successfully reduced or removed three main packet processing costs: per-packet dynamic memory allocations, removed by preallocating resources; system call overheads, amortized over large batches; and memory copies, eliminated by sharing buffers and metadata between kernel and userspace.
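Two of those three ideas, preallocation and batching, can be shown with a toy ring of reusable buffers. This is a hypothetical sketch and not the netmap API: the sync call merely stands in for a system call, the class and function names are invented, and the copy into the buffer stands in for the NIC writing into shared memory (the zero-copy part is not modeled).

```python
class PacketRing:
    """Toy ring of preallocated, reusable packet buffers (not the netmap API)."""

    def __init__(self, slots: int, buf_size: int = 2048):
        # All buffers are allocated once, up front, and reused for every batch.
        self.buffers = [bytearray(buf_size) for _ in range(slots)]
        self.lengths = [0] * slots
        self.sync_calls = 0  # stands in for the number of system calls

    def sync(self, payloads):
        """One simulated system call that delivers up to len(self.buffers) packets."""
        self.sync_calls += 1
        n = min(len(payloads), len(self.buffers))
        for i in range(n):
            data = payloads[i]
            if len(data) > len(self.buffers[i]):
                raise ValueError("packet larger than buffer")
            self.buffers[i][: len(data)] = data  # stands in for the NIC filling the slot
            self.lengths[i] = len(data)
        return n


def receive_all(ring: PacketRing, packets) -> int:
    """Drain all packets in ring-sized batches and return the total bytes seen."""
    total = 0
    offset = 0
    while offset < len(packets):
        n = ring.sync(packets[offset: offset + len(ring.buffers)])
        total += sum(ring.lengths[:n])
        offset += n
    return total


ring = PacketRing(slots=256)
total_bytes = receive_all(ring, [b"x" * 64] * 1000)
print(total_bytes, "bytes in", ring.sync_calls, "simulated system calls")
```

A thousand packets cost four simulated calls instead of a thousand, and no buffer is allocated after the ring is built.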

Nassim Nicholas Taleb - Fat Tails and (Anti)fragility - Lectures on Probability, Risk, and Decisions in The Real World. The probability of lots of math is 100%.

If you are into developing hardware Chris Dixon has a great post on Hardware startups. Hardware is getting cheaper to build, but keep in mind that there's now AWS for hardware; hardware generally doesn't have network effects; build-test-iterate does not apply to hardware; there's a lot of opportunity in B2B hardware. Also, Foxconn, Here We Come! Dragon Innovation Teaches Startups How to Get Stuff Made.

A code for concepts. Our brain has a way of storing concepts and building relationships that might be useful in digital systems. Brain Cells for Grandmother spills the secrets. A typical person remembers no more than 10,000 concepts. Our brains may use a small number of concept cells to represent many instances of one thing as a unique concept. A sparse and invariant representation distributed in the medial temporal lobe. Related concepts fire together building associations. Sparse representations are necessary for rapid associations. Concept cells link perception to memory; they give abstract and sparse representation of semantic knowledge--the people, places, objects, all the meaningful concepts that make up our individual worlds. They constitute the building blocks for the memories of facts and events of our lives. Their elegant coding scheme allows our minds to leave aside countless unimportant details and extract meaning that can be used to make new associations and memories.

Josh Haberman on the considerable benefits of userspace: Because this is in user-space, LLVM was able to form and grow alongside GCC. People didn't have to make a big switch just to try out LLVM; it's not intrusive like it would be to switch from Linux to FreeBSD or anything like that. That's why I think the network effects of user-space are better. In the kernel it's "get merged or die." In user-space, similar projects can compete and people can vote with their linker lines...None of these upstart competitors had to ask permission or get buy-in from the incumbents, they just did it.

Drawing Dynamic Visualizations. Bret Victor with an awesome demonstration of magical dynamic drawing capabilities.

Epic Bayesian Programming and Learning for Multi-Player Video Games: This thesis explores the use of Bayesian models in multi-player video games AI, particularly real-time strategy (RTS) games AI. Video games are in-between real world robotics and total simulations, as other players are not simulated, nor do we have control over the simulation.


DConf 2013 Day 1 Talk 3: Distributed Caching Compiler for D: In this presentation, which reflects the content of my master thesis, I present a D compiler written in D that exploits these tremendous hardware advances. To utilize multiprocessors the lexer and parser are run in different communicating threads. On top of that different semantic analyses are spread across multiple threads. To speed up compilation, by use of a cache and work distribution among a network, data structures are created that store results, like the abstract-syntax-tree, in a consecutive chunk of memory. To remove manual labor a lexer- and parser-generator were created that use a custom library that, not only, consists of stl inspired containers.
