Stuff The Internet Says On Scalability For October 16th, 2015

High Scalability

16 Oct 2015 — 11 min read

Hey, it's HighScalability time:

The otherworldly beauty of the world's largest underground neutrino detector. Yes, this is a real thing.
If you like Stuff The Internet Says On Scalability then please consider supporting me on Patreon.

170,000: depression era photos; $465m: amount lost due to a software bug; 368,778: likes in 4 hours as a reaction to Mark Zuckerberg's post on Reactions; 1.8 billion: pictures uploaded every day; 158: # of families generously volunteering to privately fund US elections.

Quotable Quotes:
- @PreetamJinka: I want to run a 2 TB #golang program with 100 vCPUs on an AWS X1 instance.
- Richard Stallman: The computer industry is the only industry that is more fashion-driven than women's fashion.
- The evolution of bottlenecks in the Big Data ecosystem: Seeing all these efforts to bypass the garbage collector, we are entitled to wonder why we use a platform whose main asset is to offer a managed memory, if it is to avoid using it?
- James Hamilton: Services like Lambda that abstract away servers entirely make it even easier to run alternative instruction set architectures.
- @adrianfcole: Q: Are we losing money? A: Can't answer that, but I can tell you what average CPU usage was 5ish mins ago..
- h4waii: Because you can't buy trust through an acquisition. You build trust, you don't transfer it through a merger.
- @mathiasverraes: TIL Ada Lovelace was not only the world's first programmer, she was also the first debugger, fixing a flaw in an algorithm by Babbage.
- @BenedictEvans: Ways to think about scale: iOS is as big as BMW, Mercedes, Lexus & Audi combined
- @caitie: Really enjoyed The Martian, also began thinking about how space is the true test of any distributed system
- Bits or Pieces?: This is the point, there are two very distinct forms of disruption. It's not all the same despite everyone treating it as such. Alas, people ignore this.
- Julien CROUZET: So when you have a function or callback that’ll be called repeatedly, try to make it under 600 characters (or your tweaked value), you’ll have a quick win!
- exelius: They're the walking dead because they pursued scale over innovation. Once they had achieved scale, they found themselves with too much momentum to innovate. So because they couldn't innovate, they built an army of consultants to hawk their wares to customers who also valued scale.

Will Amazon automatically win the IoT space with their recent announcement? Not so fast, says Greg Ferro: AWS IoT vs Cisco Fog Computing – Cloud vs Network IoT: AWS is popular with capital poor, low ARPU and fast moving companies in the consumer market. Cisco et al is popular with high net worth conglomerates who build high value, high profit solutions that are slow moving and built on incumbent positions with known and trustable technology partners. There is a market for both types of approaches. One does not “kill” the other, nor is one better or worse, but does limit possible growth and ability to dominate the market.

Conway's Law is being used less descriptively these days and more prescriptively. Projects are choosing the organizational structure that creates the software they want to make. From dystopia to utopia.

In a single day Riot chat servers can route a billion events (presences, messages, and IQ stanzas) and process millions of REST queries. Here are lots of lovely details on the League of Legends chat service architecture and how it works. It's based on Erlang and XMPP, leveraging the OTP framework, concurrency model, and fault-tolerance semantics. For the heaviest string manipulation parts they dropped into C, which saved 60% on CPU and lots of per-session memory. Chat clusters are independent, but they do share a few tables that are replicated and reside in memory. Riak is used for the database. Also, JENKINS, DOCKER, PROXIES, AND COMPOSE.

Brother, can you spare a dime so I can scale my website? That's all it took to handle 60K unique visitors on Amazon's Lambda + S3, less than a dime. No doubt the original architecture could have worked with a few tweaks, but the point here is using JAWS did work. Though a problem I've had with Lambda is the lack of the idea of a session when you have a single page app.

Another story of heroic scaling. MattBlumTheNuProject: We hit the front page of Reddit with our site thenuproject.com and had 200,000 uniques in two hours. Our site is built in Angular with a Laravel backend and serves 500-600 images in high resolution and is behind Cloudflare. The load on our server never went above 2 with an eight core machine.

It's still a disk based world. Sorry flash. David Rosenthal: Unless the investment to make a petabyte of flash per year is much less than the investment to make a petabyte of disk, disk will remain the medium of choice for bulk storage.

How This Battery Cut Microsoft Datacenter Costs By A Quarter. Microsoft embedded a rechargeable battery in each server rather than the usual centralized APC system used in datacenters. They took advantage of the same battery-operated, rechargeable units used in hand tools.

Benchmark alert. All benchmark caveats apply. Benchmark: PostgreSQL, MongoDB, Neo4j, OrientDB and ArangoDB. This is a test by ArangoDB that found ArangoDB benchmarked very well in the benchmarks they chose to run. But everything is public so you could learn something.

ARM Server Market. James Hamilton thinks the long-awaited microserver market may finally be on the horizon. Qualcomm is sampling a new 24-core ARM-based server CPU. We'll see if the dream of low cost, low power servers for high-scale internet services can become a reality.

Datanauts 011: Understanding Leaf-Spine Networks. Great explanation of the old three tier north-south model and the transition to the new east-west model.

A good overview of the Best of Strange Loop 2015.

You may find this uncomfortably close to your typical meeting experience. How to Make Sure Nothing Gets Done at Work. These are techniques from a CIA field manual on how to disrupt an organization from within: Insist on doing everything through channels. Never permit short-cuts to be taken to expedite decisions; Make speeches. Talk as frequently as possible and at great length; Refer all matters to committees; Bring up irrelevant issues as frequently as possible; Haggle over precise wordings; Refer back to a matter decided upon at the last meeting and attempt to re-open issues; Advocate ‘caution’; Be worried about the propriety of any decision.

AWS or Digital Ocean? rubiquity: It's certainly easier to launch a droplet on Digital Ocean but Security Groups and VPCs save so much time for just about any application I've ever put in production. I badly want Digital Ocean to have a service similar to Security Groups and VPCs so I don't have to munge my Ansible scripts to setup iptables rules and PKI for encryption between droplets.

Presentations from vBSDcon 2015 are now available.

Yah, I guess everything is basically a variant of fire, rock, stick, or a wheel. You Call This Progress?: The big deals are: the computer revolution, the internet, mobile phones, GPS navigation, and surely some medical innovations. But I would characterize these as substantial refinements in pre-existing gizmos.

Datacenter exodus is the winning meme of the week. devonkim: I'm working directly on one of these datacenter exodus projects named in this article and the cost savings early on are already monstrous compared to the charge-back that's done internally, and our internal cost to support cloud is projected to scale logarithmically as opposed to super-linearly while maintaining our in-house IT.

The Rise and Fall of the Operating System: there is no reason to port and cram an operating system into every problem space. Instead, we can split the operating system into the “orchestrating system” (which also has the catchy OS acronym going for it) and the drivers. Both have separate roles. The drivers define what is possible. The orchestrating system defines how the drivers should work and, especially, how they are not allowed to work. The two paths should be investigated relatively independently as opposed to classic systems development where they are deeply intertwined.

A good guide to using Chromium’s profiling. Ludicrously Fast Page Loads - A Guide for Full-Stack Devs.

Taming Consensus in the wild: using Paxos for queueing or messaging service is a bad idea. When the number of messages increase, performance doesn't scale...What is the right way of approaching this then? Use chain replication...Apache Kafka and Bookkeeper work based on this principle and are the correct ways to address the above two scenarios.
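
The chain replication idea mentioned above can be pictured with a minimal in-process sketch. It assumes a fixed chain with no failures or reconfiguration, and the names `ChainNode` and `build_chain` are hypothetical, made up for illustration. It is not how Kafka or BookKeeper are implemented; it only shows the basic flow: writes enter at the head, are forwarded replica by replica, and are acknowledged by the tail, which also serves reads.

```python
from typing import List, Optional


class ChainNode:
    """One replica in the chain; holds a full copy of the message log."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.log: List[str] = []
        self.successor: Optional["ChainNode"] = None

    def append(self, message: str) -> int:
        # Apply locally, then forward down the chain before acknowledging.
        self.log.append(message)
        if self.successor is not None:
            return self.successor.append(message)
        # Only the tail acknowledges, so an ack means every replica has the write.
        return len(self.log) - 1


def build_chain(names: List[str]) -> List[ChainNode]:
    """Link the nodes head -> ... -> tail in the order given (hypothetical helper)."""
    nodes = [ChainNode(name) for name in names]
    for node, successor in zip(nodes, nodes[1:]):
        node.successor = successor
    return nodes


chain = build_chain(["head", "middle", "tail"])
head, tail = chain[0], chain[-1]

# Writes always enter at the head; the returned offset comes from the tail.
offsets = [head.append(m) for m in ("m1", "m2", "m3")]

# Reads go to the tail, which only ever holds fully replicated writes.
committed = tail.log
```

Each write costs one hop per replica rather than a round of voting, which is the property the quoted post is pointing at for high-volume message streams.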

More Lambda love. Skopenow: People Search Made 8x Faster with AWS Lambda. Previously a concurrent search required 16 EC2 instances with a 90 second spin up time per instance. So 10 searches required 160 instances at a cost of $3500 per month. Switching to Lambda: we can now process all 1,000 images within 7 seconds. Compared to our previous stack, where we were processing 500 images within 60 seconds, we now generate our images nearly 8x faster.

Vector Clocks Revisited: Since we released Riak 1.0, we no longer need to talk about “vclocks”, but instead a “causal context” or just “context”. Vnode Version Vectors solved some difficult issues for Riak users around availability and client process management, but they came with some costs.
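
For readers who have not met version vectors before, here is a rough sketch of the comparison a "causal context" makes possible. It is a generic illustration, not Riak's implementation; the function names and the actor ids `x` and `y` are hypothetical.

```python
from typing import Dict

VersionVector = Dict[str, int]  # actor id -> number of updates seen from that actor


def descends(a: VersionVector, b: VersionVector) -> bool:
    """True if `a` has seen every update that `b` has seen."""
    return all(a.get(actor, 0) >= count for actor, count in b.items())


def compare(a: VersionVector, b: VersionVector) -> str:
    """Classify the causal relationship between two versions."""
    if descends(a, b) and descends(b, a):
        return "equal"
    if descends(a, b):
        return "a_newer"
    if descends(b, a):
        return "b_newer"
    return "concurrent"  # neither saw the other: keep both as siblings


def merge(a: VersionVector, b: VersionVector) -> VersionVector:
    """Pairwise maximum: the context that covers both inputs."""
    return {actor: max(a.get(actor, 0), b.get(actor, 0)) for actor in a.keys() | b.keys()}


newer = compare({"x": 2, "y": 1}, {"x": 1, "y": 1})
conflict = compare({"x": 2}, {"y": 1})
merged = merge({"x": 2}, {"y": 1})
```

The "concurrent" case is the interesting one: neither write can safely overwrite the other, so the store has to keep both values until something resolves them.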

Are you feeling artistic? Do you love drones? Then combine them and make drone art! Watch Flyability's Flashy Drones Dance Around a Forest at Night. Just stunning.

Lichess is an open source chess game that has doubled in just ten months. 100,000,000 games have been played, 78K unique visitors per day, 260,000 completed games per day, 14,688,000 chess moves played every day. Impressive. Even more impressive: this is all done on $416 of operational costs per month all covered by user donations.

Nice introduction to queueing theory from VividCortex.

Excellent overview of The rise of immutable data stores. The advantages of immutable databases: Fewer dependencies; Higher-volume data handling and improved site-response capabilities; More flexible reads and faster writes; Compatibility with microservices architecture, log-based messaging protocols, and Hadoop; Suitability for auditability and forensics, especially in data-driven, fully instrumented online environments.

Videos from the a16z Academic Roundtable are available. There are sessions on: security, biotech, machine learning, VR/AR, big data.

Eve online is getting some upgrades: a full SAN mirror so that we can both maintain TQ and failover live, replicating a copy of the TQ Database across the ocean to Iceland, land of fire and ice; next generation of IBM servers called IBM FLEX; With TQ Tech III there will be 6x Everest Nodes; The 4x Microsoft SQL Database machines will have a whopping 768GB of RAM each running on 1866MHz. They have 2 Intel E7-8893 v3 - 3.2GHz CPU's with 4 cores (8 hyper-threaded) and 45MB cache; we are increasing throughput from 16Gbps to 30Gbps and maximum concurrent connections from 4 million to 24 million; we will be deploying a new Intelligent Routing Platform to optimize BGP routing.

You might be interested in THE DATABASEOLOGY LECTURES - FALL 2015: Embedded databases: They're the boxer briefs of the database world in that they are underneath a wide variety of applications, including mobile devices, high performance OLTP systems, and large distributed systems.

An epic post on Visual Information Theory: Information gives us a powerful new framework for thinking about the world. Sometimes it perfectly fits the problem at hand; other times it’s not an exact fit, but still extremely useful. This essay has only scratched the surface of information theory – there are major topics, like error-correcting codes, that we haven’t touched at all – but I hope I’ve shown that information theory is a beautiful subject that doesn’t need to be intimidating. < OK, I admit that it was still very intimidating.

Storing a lot of objects in RAM is much harder than you might think, especially in managed environments like the JVM and CLR. Here's a good description of the problems: Big Memory .NET Part 1 – The Challenges in Handling 1 Billion Resident Business Objects. Here's a good description of a solution: Big Memory .NET Part 2 - Pile, Our Big Memory Solution for .NET.

How both TCP and Ethernet checksums fail. Always use application level CRCs. This is a lesson from Twitter that found data corrupted in memcache because TCP checksums are weak: Ethernet CRC is strong, so how could a corrupt packet pass both checks? The answer is that the Ethernet CRC is recalculated by switches. If the switch corrupts the packet and it has the same TCP checksum, the hardware blindly recalculates a new, valid Ethernet CRC when it goes out.
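
To see why the TCP checksum is considered weak, here is a small sketch comparing the 16-bit ones'-complement sum (the RFC 1071 style checksum TCP uses) with an application-level CRC32. The payload and the `swap_words` helper are made up for illustration, and the sketch covers only the payload bytes, not the TCP pseudo-header. Because the checksum is a plain sum, reordering 16-bit words leaves it unchanged, while CRC32 catches the change.

```python
import zlib


def internet_checksum(data: bytes) -> int:
    """16-bit ones'-complement checksum in the style of RFC 1071."""
    if len(data) % 2:
        data += b"\x00"  # pad to a whole number of 16-bit words
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
    while total >> 16:  # fold the carries back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def swap_words(data: bytes, i: int, j: int) -> bytes:
    """Swap the 16-bit words at word indexes i and j (hypothetical corruption)."""
    buf = bytearray(data)
    buf[2 * i:2 * i + 2], buf[2 * j:2 * j + 2] = data[2 * j:2 * j + 2], data[2 * i:2 * i + 2]
    return bytes(buf)


original = b"balance=0100;id=0042"
corrupted = swap_words(original, 4, 5)  # "0100" becomes "0001"

same_tcp_checksum = internet_checksum(original) == internet_checksum(corrupted)
same_crc32 = zlib.crc32(original) == zlib.crc32(corrupted)
```

Here `same_tcp_checksum` is true and `same_crc32` is false: the corrupted payload would sail through the TCP check, and only the application-level CRC notices.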

This is clever. And truthy. Amdahl to Zipf: Ten Laws of the Physics of People: Everything we do has an economic motive; The bigger the team, the more force it needs; When you push an organization, it pushes back; Falling is indistinguishable from making progress; If it can break, it will break; The more you know about one topic, the stupider you become; 20% of any system always has 80% of the power; Cool stuff gets 50% cheaper every 18-24 months; The more you need consensus, the less work you can do; The software you make looks like your organization.

Let the incomparable Ilya Grigorik be your Virgil as you travel through the many layers of HTTP/2.

Videos from AWS re:Invent 2015 are now available. As is typical with Amazon there were very few announcements and only a few topics were covered.

Well-known Databases Use Different Approaches for MVCC: The first approach is to store multiple versions of records in the database, and garbage collect records when they are no longer required; The second approach is to keep only the latest version of data in the database, but reconstruct older versions of data dynamically as required by using undo.
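
The two MVCC approaches can be contrasted with a toy sketch. Both classes below are hypothetical, in-memory illustrations that assume writes arrive with increasing commit timestamps. They are not modeled on any particular database's storage format.

```python
class MultiVersionStore:
    """Approach 1: keep every version in the store, garbage collect old ones."""

    def __init__(self):
        self.versions = {}  # key -> list of (commit_ts, value), oldest first

    def write(self, key, value, ts):
        self.versions.setdefault(key, []).append((ts, value))

    def read(self, key, snapshot_ts):
        result = None
        for ts, value in self.versions.get(key, []):
            if ts <= snapshot_ts:
                result = value  # latest version visible to this snapshot
        return result

    def gc(self, oldest_snapshot_ts):
        # Keep the newest version the oldest snapshot can see, plus anything newer.
        for key, chain in list(self.versions.items()):
            keep_from = 0
            for i, (ts, _) in enumerate(chain):
                if ts <= oldest_snapshot_ts:
                    keep_from = i
            self.versions[key] = chain[keep_from:]


class UndoStore:
    """Approach 2: keep only the latest version, rebuild older ones from undo."""

    def __init__(self):
        self.current = {}  # key -> (commit_ts, value)
        self.undo = {}     # key -> overwritten (commit_ts, value) pairs, oldest first

    def write(self, key, value, ts):
        if key in self.current:
            self.undo.setdefault(key, []).append(self.current[key])
        self.current[key] = (ts, value)

    def read(self, key, snapshot_ts):
        if key not in self.current:
            return None
        ts, value = self.current[key]
        if ts <= snapshot_ts:
            return value
        # Walk the undo records backwards until one is old enough to be visible.
        for ts, value in reversed(self.undo.get(key, [])):
            if ts <= snapshot_ts:
                return value
        return None


stores = [MultiVersionStore(), UndoStore()]
for store in stores:
    store.write("k", "a", ts=1)
    store.write("k", "b", ts=5)
    store.write("k", "c", ts=9)

reads_at_6 = [store.read("k", 6) for store in stores]
```

Both stores answer a snapshot read at timestamp 6 with the same value; they differ in where the cost lands. The first pays with storage and garbage collection, the second with extra work on reads of old snapshots.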

Today’s economy is unevenly innovative. Beyond the Internet, Innovation Struggles: Screening job postings on Indeed, a job website, Mr. Mandel finds that the proportion mentioning “Android” (Google’s mobile operating system), “fracking” and “robotics” has risen notably in the past four to six years. But the proportion mentioning “composite materials,” “biologist,” “gene” or “nanotechnology” has trended down.

From Radio to Porn, British Spies Track Web Users’ Online Identities: As of 2012, GCHQ was storing about 50 billion metadata records about online communications and Web browsing activity every day, with plans in place to boost capacity to 100 billion daily by the end of that year. The agency, under cover of secrecy, was working to create what it said would soon be the biggest government surveillance system anywhere in the world.

Perhaps programming is best characterized as a folk art? Programming exists at the interplay of individuality and creativity.

Single RX queue kernel bypass in Netmap for high packet rate networking: Since the Linux Kernel can't really handle a large volume of packets, we need to work around it. During packet floods we offload selected network flows (belonging to a flood) to a user space application. This application filters the packets at very high speed. Most of the packets are dropped, as they belong to a flood. The small number of "valid" packets are injected back to the kernel and handled in the same way as usual traffic.

You have to love this one. Frogs resolve computing issues: Researchers from the University of the Basque Country (UPV/EHU) and the Technical University of Catalonia have used the Japanese tree frog's mating rituals to develop new computational algorithms. The males of this species have learned to desynchronize their singing patterns so females can tell them apart. "This process is a great example of self-organization in nature, which has allowed us to develop bio-inspired algorithms"

prometheus.io: An open-source service monitoring system and time series database.

FiloDB: a new open-source distributed columnar database that is designed to ingest streaming data of various types, including machine, event, and time-series data, and run very fast analytical queries over them. In four-letter acronyms, it is an OLAP solution, not OLTP.

nomad: Nomad is a cluster manager, designed for both long lived services and short lived batch processing workloads.

Charon: Declarative Provisioning and Deployment: a tool for automated provisioning and deployment of networks of machines from declarative specifications. Building upon NixOS, a Linux distribution with a purely functional configuration management model, Charon specifications completely describe the desired configuration of sets of “logical” machines, including all software packages and services that need to be present on those machines, as well as their desired “physical” characteristics.

SpaceCurve: a real-time, continuously updated, perpetually logged data view of everything that happens in the physical world for analysis at extreme scales. The term “database” is used loosely.
