Stuff The Internet Says On Scalability For September 12th, 2014

Hey, it's HighScalability time:


Each dot in this image is an entire galaxy containing billions of stars. What's in there?

  • Quotable Quotes:
    • mseepgood: Or "another language that's becoming popular, Node.js"
    • Joe Moreno: What good are billions of cycles of CPU power that make me wait. I shouldn't have to wait longer and longer due to launching, buffering, syncing, I/O and latency.
    • @stevecheney: Apple Pay is the magic that integrated hardware / software produces. No one else in the world can do this.
    • @etherealmind: Next gen Intel Xeon E5 V3 CPU includes packet processor for 40GBE, 30x increase in OpenSSL crypto, 25% increase in DPDK perf. #IDF14
    • @pbailis: There's actually an interesting question in understanding when to break "sharing" -- at core, NUMA domain, server, or cluster level?
    • @fmueller_bln: Just wait some minutes for vagrant to provision a vm with puppet and you’ll know why docker may be better option for dev machines...

  • Encryption will make fighting the spam war much costlier, reveals Mike Hearn in an awesome post: A brief history of the spam war, where he gives insightful color commentary on the punch and counterpunch between World Heavyweight Champion Google and the challenger, Clever Spammer. Mike worked in the Gmail trenches for over four years and recommends: make sending email cost money; use money to create deposits using bitcoin.

  • jeswin: No other browser can practically implement or support Dart. If they do their implementation will be slower than Google's, and will get classified as inferior. < Ignoring the merits of Dart, this is an interesting ecosystem effect. By rating sites for reasons other than content quality, Google can in effect select for characteristics over which it has a comparative advantage. It's not an arm's-length transaction.

  • Dateline Seattle. Social media users execute a coordinated denial of service attack on cell networks, preventing those in need from accessing emergency services. Who are these terrorists? Football fans. City of Seattle asks people to stop streaming videos, posting photos because of football. Tweets, Instagram, YouTube, and Snapchat are overloading the cell networks so calls can't get through. Should the cell network expand capacity? Should there be an app tax to constrain demand? Should users pay per packet? As a 49ers fan I have another suggestion...move games to a different venue, perhaps the moon. That will help.

  • Are you a militant cable cutter who thinks the future of TV is the Internet? Not so fast, says Dan Rayburn in Internet Traffic Records Could Be Broken This Week Thanks To Apple, NFL, Sony, Xbox, EA and Others: Delivering video over the Internet at the same scale and quality that you do over a cable network isn’t possible. The Internet is not a cable network and if you think otherwise, you will be proven wrong this week. We’re going to see long download times, more buffering of streams, more QoS issues and ISPs that will take steps to deal with the traffic.

  • Ted Nelson takes on the impossible in How Bitcoin Actually Works (Computers for Cynics #7). And he does an excellent job, sharing his usual insight with a twist. The title is misleading, however. There's hardly any cynicism. How disappointing! Ted is clearly impressed with the design and implementation of bitcoin. For good reason. No matter what you think of bitcoin and its potential role in society, it is a very well thought out and impressive piece of technology. On par with Newton, Mr. Nelson suggests. If you watch this you'll probably realize that you don't actually understand bitcoin, even if you think you do, and that's a good thing.

  • Here's a great visualization of how a consensus protocol works, Raft in particular. With great clarity it takes you through complicated ideas like leader election and what happens when there's split brain. If you are interested in how the presentation is made then take a look at Playback.js. And of course here's the excellent paper describing Raft. Don't be fooled though, making a working system using Raft is still something that takes a lot of work and skill. 

  • Writing a High Performance Database in Go. Why? Go is really fast. Easy deployment. Good API. Simple debugging. Use channels to batch operations. JSON encoding is slow. I like the hierarchy of speed: network IO, disk IO, memory allocation, mutexes, memory access.

  • Ever fear that there's more to the world than we understand? 'Solid' light could compute previously unsolvable problems: researchers are not shining light through crystal -- they are transforming light into crystal, as part of an effort to develop exotic materials such as room-temperature superconductors.

  • Cinematic Cuts Exploit How Your Brain Edits What You See: Our brains evolved to take this firehose of sensory information and boil it down to something we can use effectively to survive, and movies leverage that feature of our biology to shape our experience.

  • Like peanut butter and chocolate. CoreOS Image Now Available On DigitalOcean. Wonderfully summarized by drcode: "Docker is a technology that lets a VM use resources of the host (threads, etc) directly, so you get VMs that are super super light weight. With Docker, even your laptop can run 100 VMs, all with different OSes and customizations (though linux only) without breaking a sweat. CoreOS is a special linux flavor designed to be used specifically in this role, making the VMs even MORE light weight (and has features to coordinate multiple VMs working together.) DigitalOcean allows you to host VMs in the cloud. Now you have the full shebang: A hosting provider designed to run an OS designed to run efficiently as a VM."

  • Kevin Marks with some notes on IndieWebCamp UK 2014-09-06. Like: "the idea is that we can replace all forms of contact - phone, chat, whatever with URLs"; "I got tired of hosting my own server so I move to Jekyll on github pages for everything at http://voxpelli.com." Generally I'm reminded of the back to the land movement. A highly principled, spirited, and passionate group, but most people still like living in civilization.  

  • Lots of wisdom here. Stack Overflow – performance lessons (part 1, part 2): take performance seriously; remove all the work the .NET Garbage Collector has to do, thus eliminating the pause; do a lot of crazy things to improve JSON serialization/deserialization. 

  • Non-Digital Computers: not all computers need to be digital, or even electronic! A computer can be mechanical, made of dominoes, or even just a rules system in a card game. < Or even a phone.

  • Stanford engineer aims to connect the world with ant-sized radios: A Stanford engineering team has built a radio the size of an ant, a device so energy efficient that it gathers all the power it needs from the same electromagnetic waves that carry signals to its receiving antenna – no batteries required. < Issues with high frequencies, low power, and needing a base station, but it's at the right cost point to become ubiquitous.

  • Joe Landman:  Imagine 60x 8TB drives (480TB about 1/2 PB) in a 4U unit or 4.8PB in a rack. Now make those 10TB drives. 600TB in 4U. 6PB in a full rack...Just remember though that the larger the single storage element, the higher the storage bandwidth wall … the time to read/write the entire element. The higher this wall is, the colder the data is.
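
    The arithmetic in that quote can be checked with a short sketch. Everything beyond the drive counts and sizes is an assumption made here for illustration, not something Joe Landman states: ten 4U units per rack (which is what the 4.8PB figure implies), decimal units, and a hypothetical sustained throughput of 150 MB/s per drive for estimating the "bandwidth wall".

```python
# Back-of-the-envelope check of the capacity and "bandwidth wall" numbers.
# ASSUMPTIONS (not from the quote): 10 x 4U units per rack, decimal units
# (1 TB = 10**12 bytes), and a hypothetical 150 MB/s sustained per-drive rate.

TB = 10**12  # bytes in a decimal terabyte
MB = 10**6   # bytes in a decimal megabyte

DRIVES_PER_4U = 60          # from the quote
UNITS_PER_RACK = 10         # assumed; implied by 480TB -> 4.8PB
ASSUMED_DRIVE_MB_PER_S = 150  # hypothetical sustained throughput


def unit_capacity_tb(drive_tb):
    """Raw capacity of one 4U unit, in TB."""
    return DRIVES_PER_4U * drive_tb


def rack_capacity_pb(drive_tb):
    """Raw capacity of one rack, in PB."""
    return unit_capacity_tb(drive_tb) * UNITS_PER_RACK / 1000


def bandwidth_wall_hours(drive_tb, mb_per_s=ASSUMED_DRIVE_MB_PER_S):
    """Hours needed to read or write one whole drive end to end."""
    seconds = drive_tb * TB / (mb_per_s * MB)
    return seconds / 3600


for size in (8, 10):
    print(size, "TB drives:",
          unit_capacity_tb(size), "TB per 4U,",
          rack_capacity_pb(size), "PB per rack,",
          round(bandwidth_wall_hours(size), 1), "hours to read one drive")
```

    Under these assumptions the wall grows linearly with drive size while the per-drive throughput stays flat, which is the point of the quote: bigger elements take longer to sweep, so the data on them is effectively colder.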

  • Nice looking big view of the system. Overview poster of Azure features, services, and common uses. Useful to see how all the parts fit together.

  • Performance benchmarks: KVM vs. Xen: KVM is almost always within 2% of bare metal performance. Xen fell within 2.5% of bare metal performance in three out of ten tests but often had a variance of up to 5-7%. 

  • Array of Things. Chicago isn't just talking about the future, they are building it with a "network of interactive, modular sensor boxes around Chicago collecting real-time data on the city’s environment, infrastructure, and activity for research and public use." Here's a diagram of what each sensor complex looks like. It collects all sorts of data: light, sound, CO, NO2, temp, humidity, wind, rain, and more. Data will be published every minute at no cost. Look for some cool stuff coming out of Chicago soon.

  • What kind of network are you? What Can We Learn From the Wealth of Virtual Nations?: the wealthy organize their networks so they are the "hub of a star-like network." Wealthy players trade with many others, while their trade partners trade with fewer others, and very little among each other. In the friendship and enmity networks, the wealthy are well-liked and withhold animosity, except toward public enemies.

  • Matt Bell with a clear and simple Distributed Rate Limiting With Redis, with code.
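
    For readers who want the general shape of the idea before clicking through, here is a minimal sliding-window rate limiter. This is not Matt Bell's code and makes no claim about how his post does it. It is a single-process sketch in which an in-memory dict stands in for the shared store; in a distributed setup the per-key timestamps would have to live somewhere all servers can reach, such as Redis. The class name, parameters, and injectable clock are all hypothetical choices made for this example.

```python
import time
from collections import defaultdict, deque


class SlidingWindowLimiter:
    """Allow at most `limit` actions per `window_seconds` for each key.

    Single-process sketch: the dict below is a stand-in for a shared store.
    """

    def __init__(self, limit, window_seconds, clock=time.monotonic):
        self.limit = limit
        self.window = window_seconds
        self.clock = clock  # injectable so the behavior can be tested
        self._hits = defaultdict(deque)  # key -> timestamps of allowed actions

    def allow(self, key):
        now = self.clock()
        hits = self._hits[key]
        # Discard timestamps that have aged out of the window.
        while hits and hits[0] <= now - self.window:
            hits.popleft()
        if len(hits) >= self.limit:
            return False  # over the limit; rejected attempts are not recorded
        hits.append(now)
        return True


# Usage: 2 requests per 10 seconds per user, driven by a fake clock.
fake_now = [0.0]
limiter = SlidingWindowLimiter(limit=2, window_seconds=10,
                               clock=lambda: fake_now[0])
results = [limiter.allow("alice") for _ in range(3)]
fake_now[0] = 10.0
results.append(limiter.allow("alice"))
print(results)
```

    The hard part in the distributed case, which this sketch sidesteps, is making the trim, count, and append steps atomic across many clients.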

  • Papers from Collective Intelligence 2014 with compelling titles like "Information Spread in a Connected World", "Wisdom of crowds in practice", and "Nowcasting the Bitcoin Market with Twitter Signals."

  • zimg: a light image storage and processing system. It's written in C and delivers high performance for image workloads. zimg is designed to be a high-concurrency image server. It supports many features for storing and processing images.

  • DimmWitted: A Study of Main-Memory Statistical Analytics: on NUMA-aware machines, we studied tradeoffs in access methods, model replication, and data replication. We found that using novel points in this tradeoff space can have a substantial benefit: our DimmWitted prototype engine can run at least one popular task at least 100× faster than other competitor systems. This comparison demonstrates that this tradeoff space may be interesting for current and next-generation statistical analytics systems.

  • Optimizing Google’s Warehouse Scale Computers: The NUMA Experience: this paper investigates the impact of non-uniform memory access (NUMA) for several of Google’s key web-service workloads in large-scale production WSCs. Leveraging a newly designed metric and continuous large-scale profiling in live datacenters, our production analysis demonstrates that NUMA has a significant impact (10-20%) on two important web-services: Gmail backend and web-search frontend. Our carefully designed load-test further reveals surprising tradeoffs between optimizing for NUMA performance and reducing cache contention.