
Stuff The Internet Says On Scalability For August 12th, 2016

High Scalability

10 Aug 2016 — 9 min read

Hey, it's HighScalability time:

The big middle finger to the Olympic Committee. They pulled this video of the incredibly beautiful Olympic cauldron at Rio.

If you like this sort of Stuff then please support me on Patreon.

25 years ago: the first website went online; $236M: Pokemon Go revenue in 5 weeks in 3 countries; Several thousand: work on Apple maps; 2500 Nimitz Carriers: weight of iPhone if implemented using tube transistors; $50 trillion: cost of iPhone in 1950, economic output of the world in your hand; 1000x: faster phase-change RAM; 15lbs: Americans heavier than 20 years ago; 2 years: for hacking the IRS; 3.6PB: hypothetical storage pod based on 60 TB SSD; 330,000: cash registers hacked; 162%: increased love for electric cars in China;

Quotable Quotes:
- @carllerche: it is hard to imagine how a node app could get closer to the metal with only 20MM LOC between the app and the hardware.
- David Heinemeier Hansson (RoR)~ Lots and lots of huge systems that are running the gosh darn Internet are built by remote people operating asynchronously. You don't think that's good enough for your little shop?
- Cesarini: Some frameworks that try to automate activities end up failing to hide complexity. They limit the trade-offs you can make, so they cater only to a subset of systems, often with very detailed requirements.
- "Uncle" Bob Martin: I have lived through 22 orders of magnitude of growth in hardware.
- Jovanovic: To use Bitcoin for real-time trades, we need to eliminate its lazy fork-resolution mechanism and adopt strong consistency, a more proactive approach that guarantees transaction persistence.
- Pedro Ramalhete: one latency distribution plot is worth a thousand throughput measurements
- @n1ko_w1ll: Impressive numbers: - 80% cut code with #scala - responsive at 90% load with #akka
- @samkroon: So Aussie government is asking 20 million ppl to login to one web site on the same night... Fail. Should have gone #serverless. #census2016
- @caitie: "My contribution to RPC is not to make another system based on RPC" @cmeik #NikeTechTalks
- @krisajenkins: This is your return type: Int / This is your return type on microservices: IO / (Logger (Either HttpError Int)) Microservices: Know the risks.
- @nosqlonsql: Latency drives throughput if you cannot achieve enough concurrency. Kafka vs Chronicle. Must read by @PeterLawrey
- reddit: Today's date is 100/1000/10000 in binary
- @caitie: "The languages we associate with distributed programming are really concurrent languages" @cmeik #NikeTechTalks
- @goserverless: Lambda down :( #aws #serverless
- @pkanavos: @goserverless I think I'll PaaS
- Jan Wedel: So if you plan to build an application from scratch and it is only meant to be used in on-premise scenarios as described, you probably shouldn't go for a microservice architecture.
- @bmoesta: Any industry that solely focuses on efficiency innovation is on the verge of death. Disruptive innovations that drive progress drive growth
- flak: It’s quite likely that your crypto will explode sooner or later, and it’s possible that random numbers will be implicated, but it’s very unlikely that some USB gizmo promising “true random” at kilobits per second will save you. Save your money instead.

Imagine how much the world has changed in those 25 years. The world's first website went online 25 years ago today. Without the Web the Internet would probably still be a backwater for researchers. The Web was the Internet's killer app. It's hard to imagine Pokemon is Augmented Reality's killer app. AR needs its "let the people make it bigger and better" technology. Given the balkanization of AR into proprietary silos, AR may never have its Web moment. Will there be an HTTP for AR?

The phrase "small, reprogrammable quantum computer" doesn't sound remotely present-tense, but it is: Shantanu Debnath and colleagues at the University of Maryland reveal their new device can solve three algorithms using quantum effects to perform calculations in a single step, where a normal computer would require several operations. Although the new device consists of just five bits of quantum information (qubits), the team said it had the potential to be scaled up to a larger computer...the key to the new device was a system of laser pulses that drove the quantum logic gates, which operate like the switches and transistors that power ordinary computers.

Turning programmers into a proper profession, like doctors, is not the way to go. How much do doctors innovate? Very little. Doctors as a profession have been pounded into their current shape by two oppressors: fear of lawsuits and educational debt. Doctors are bound by best practices and oaths to do nothing interesting. What must programmers do constantly? Innovate and do the interesting. By not being a profession we are free to do harm, yes, but we are also able to create. Creation is a better failure mode than ossification. "Uncle" Bob Martin - "The Future of Programming". Nice gloss by Eric Fleming: Long story short this was really two talks in one. The first speech was about progress in hardware and software from 1945 to 2015. The second talk is about how there is so much growth in the programming field that there are too many young inexperienced people to do it right which necessitates some self regulatory body to bring young professionals into the flock. Ironically the talk he didn't intend to give, the first one, is far more interesting than the talk he did give about how to fix the growing inexperience in industry.

Don't let what happened in Turkey happen to your coup attempt. Learn from experience. Here's your step-by-step guide on How to Overthrow a Government. Presented at, you may be surprised to hear, DefCon. First select from a menu of three regime change methods: elections, coups, and revolution. Next select a crack insurgency team from a handy wizard interface. Then there's a drop down list of intelligence gathering resources and funding options. After a few more clicks just press Go and you have your revolution (you'll certainly choose revolution, you get so many more points that way).

Tech Stack at Shots Quick Post: RedHat Enterprise 6 on the front ends and DBs; Amazon Linux (CentOS) on Elasticsearch and Go servers; Apache 2+; Percona 5.6 XtraDB with some minor custom stuff (sharded); PHP; a little bit of Python and Java; a lot of Go; poor man's AMP with the client reading from the clipboard, makes a call home where the link is sent to a distributed worker system which fetches the content of the HTML page, finds the media, manipulates the media and then distributes the media on our CDN.

A continuous release stream makes rolling back on failure much harder. And a possible insight into why time flows forward. Google Compute Engine Incident #16015. As you may recall postmortems are something of an art with Google. Interesting implication from asuffield: As soon as I or whoever is oncall has figured out what change was responsible, we can usually revert it quickly and easily. Usually, if I'm oncall and I have reason to even suspect a recent change might be the cause, I'll revert it and see if the problem goes away. The difficulty becomes more apparent when you realize the sheer number of infrastructure changes being made every hour, some of which will be fixes to other outages, and some of which will be things you can't revert because they are of the form "that location has fallen offline; probably lost networking" or "we are now at peak time and there are more users online". So if your question is "can we just roll the whole world back one day" - no, too much has changed in that time.

The advantage of running on a DC/OS. Mesosphere’s ‘Container 2.0’ Unites Stateless and Stateful Workloads: Now, adding a [Kafka] node is a matter of installing DC/OS on that node (physical or virtual), and enabling the system to absorb it into the existing framework without manual reconfiguration...So stream workloads may be redistributed as part of the node addition process, also without downtime...Mesosphere is working toward an orchestration framework where the format of the container is less important to the management of the system.

Gestalt Framework: executes “lambda functions” whenever an event triggers them: spinning up containers, running the task, and then killing the container automatically...The full Gestalt Framework is a series of microservices that focus on getting the policy environment right for enterprise seeking to move their applications to a cloud native platform...I would go so far as to say that, ultimately, serverless environments may be the way that microservices are implemented by most organizations.

If you had any notion financial markets were anything but gambling then here you go. High-Frequency Trading Is Nearing the Ultimate Speed Limit: A network switch made by the firm Metamako allows a trade order to be placed in the time it takes a photon to travel about 90 feet...A small black device about the size of a pizza box could be the future of financial trading.

The future looks bright for crooks. The pathetic security in our IoT devices and even in very expensive cars means there will be exciting new money making opportunities in our future. High-Tech Car Crime Is Becoming a Big Problem: the team found that just four different cryptographic keys are used for as many as 100 million vehicles. After capturing another cryptographic key from the signals sent as a driver unlocks the car door, the researchers can combine the two numbers to unlock the target vehicle themselves.

You have to appreciate the simplicity. How Audi worked the V6 diesel emissions cheat: a 22-minute timer. Volkswagen’s Audi unit managed to make its 3.0-liter V6 turbodiesel engines run clean on tests, a report says, with the simple expedient of keeping emissions controls active for about 22 minutes, given that most emissions tests run no more than 20 minutes. Ergo, a clean car when it mattered most: when authorities were watching.

Service Discovery and Load balancing Internals in Docker 1.12. A great article on making this very confusing topic much less so. It covers the internals of Service Discovery and Load balancing in Docker release 1.12, DNS based load balancing, VIP based load balancing and Routing mesh. With diagrams, explanations, and what you actually type in on the command line. Nice.

Throughput vs Latency and Lock-Free vs Wait-Free: perhaps algorithm X provides better throughput than algorithm Y, but Y has a better tail latency than X. Which one is "better"? There is no right answer, it depends on how you define better: If you just want raw throughput and you don't care that 1% of the cases will take an enormous amount of time to complete, then X is better. If you're willing to sacrifice throughput to have more fairness and better (lower) latency at the tail of the latency distribution, then algorithm Y is better.
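To make the trade-off concrete, here is a minimal Python sketch. The latency samples are invented for illustration, not measurements from the linked article, and throughput is computed under the simplifying assumption of a single thread running operations back to back. The names `percentile`, `summarize`, `algo_x` and `algo_y` are hypothetical.

```python
import math


def percentile(samples, p):
    """Nearest-rank percentile of samples, for an integer p in 1..100."""
    ordered = sorted(samples)
    rank = max(1, math.ceil(p * len(ordered) / 100))
    return ordered[rank - 1]


def summarize(latencies_ms):
    """Throughput (sequential, single-thread assumption) plus latency percentiles."""
    total_seconds = sum(latencies_ms) / 1000.0
    return {
        "throughput_ops_per_s": len(latencies_ms) / total_seconds,
        "p50_ms": percentile(latencies_ms, 50),
        "p99_ms": percentile(latencies_ms, 99),
        "max_ms": max(latencies_ms),
    }


# Made-up data: X is usually very fast but stalls on 2% of operations;
# Y is slower on every operation but perfectly steady.
algo_x = [1.0] * 980 + [200.0] * 20
algo_y = [6.0] * 1000

if __name__ == "__main__":
    for name, samples in (("X", algo_x), ("Y", algo_y)):
        print(name, summarize(samples))
```

With this data X comes out ahead on throughput while Y comes out ahead at the 99th percentile, so a benchmark that reports only operations per second would hide exactly the difference the quote is about.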

Redis and flash are a good mix. Redis for Very Large Datasets. 3 million database operations/second at under 1 millisecond of latency, while generating over 1GB NVMe throughput, on a single server with Redis on Flash and Intel NVMe-based SSDs.

This sucks. IaaS Pricing Patterns and Trends: the trend of ever downward price pressures seems to be alleviating...One interpretation of this development is that infrastructure is reaching a commodity status. Google in particular is still attempting to differentiate itself via price offerings, but generally speaking this analysis showed increased clustering of many of the providers’ offerings.

If you spent many bleary eyed hours in the airport because of Delta's computer outage here's what happened: Delta Airlines: On Second Thought, the Computer Crash Was Our Fault. Something failed and as happens so often switching to the backup also failed. That's after hundreds of millions of dollars in technology infrastructure upgrades and systems. This stuff never gets tested, so it almost always fails when put under stress. How does the public cloud look now?

Riot Games with an excellent series of articles (and video) on using Docker and Jenkins to containerize a build farm. THINKING INSIDE THE CONTAINER: DOCKERCON TALK AND THE STORY SO FAR.

Why do CPUs have multiple cache levels? What joebaf said: This is really one of the best explanations of CPU cache I’ve seen. Thanks!

Glenn Fiedler continues his excellent Building a Game Network Protocol for action games series with Reliable Ordered Messages. Great explanation along with working code. TCP isn't as good a solution because of the particular use case, which is a steady stream of 20 or 30 packets per-second sent in both directions. For 90% of the packet data it’s best to just drop the data and never resend it. Only 10% of the packets require reliability. If a packet with state for time t is lost, resending that packet isn’t particularly useful. By the time the resent packet arrives, the time t has already passed.
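The "just drop it" half of that argument can be sketched in a few lines of Python. This is not code from Glenn Fiedler's series; it is a simplified illustration that assumes each state packet carries a monotonically increasing sequence number and ignores sequence wraparound. The class name `LatestStateReceiver` is hypothetical.

```python
class LatestStateReceiver:
    """Keeps only the newest time-series state; stale packets are discarded."""

    def __init__(self):
        self.latest_sequence = -1   # no packet seen yet
        self.latest_state = None
        self.dropped = 0            # count of stale or duplicate packets ignored

    def receive(self, sequence, state):
        """Apply a state packet; return True if it was newer than what we had."""
        if sequence <= self.latest_sequence:
            # A late or resent packet describes a moment that has already
            # passed, so applying it would only move the simulation backwards.
            self.dropped += 1
            return False
        self.latest_sequence = sequence
        self.latest_state = state
        return True


if __name__ == "__main__":
    receiver = LatestStateReceiver()
    # Packet 2 arrives after packet 3, e.g. because of reordering or a resend.
    for seq, state in [(1, "pos=10"), (3, "pos=30"), (2, "pos=20")]:
        receiver.receive(seq, state)
    print(receiver.latest_state, receiver.dropped)
```

The minority of messages that do need reliability would travel through a separate acknowledged channel layered on the same packets, which is the part the article itself covers.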

JVM Language Summit videos are now available.

Interesting way to "rollback" events that should no longer be sent. The event queue design pattern: Throughout the execution of our business code, we maintain a queue of events to be published. Upon the end of the execution flow, when we persist state, we publish all events in the queue. If during the execution of the business logic an error occurs, all we have to do is throw away the queue instance and we are done. There is no need for a roll-back of the events.
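The pattern described above could look roughly like this in Python. It is a sketch under assumptions, not the linked article's implementation: `EventQueue`, `run_unit_of_work`, and the `persist` and `publish` callables are hypothetical names, and a real system would also have to decide what happens if publishing fails after state is persisted.

```python
class EventQueue:
    """Buffers events in memory until the execution flow has succeeded."""

    def __init__(self):
        self._pending = []

    def enqueue(self, event):
        self._pending.append(event)

    def publish_all(self, publish):
        # Hand every buffered event to the publisher, in order, then reset.
        for event in self._pending:
            publish(event)
        self._pending.clear()


def run_unit_of_work(business_logic, persist, publish):
    """Run business logic with a fresh queue; publish only after persisting."""
    queue = EventQueue()  # one queue instance per execution flow
    try:
        state = business_logic(queue)
    except Exception:
        # Nothing has been published yet, so discarding the queue
        # is the whole "rollback".
        return False
    persist(state)
    queue.publish_all(publish)
    return True
```

Because events never leave the process until the flow completes, a failure part-way through needs no compensating "undo" events.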

JCrete 2016 videos are now available.

Lots of details and beautiful graphs. More Hash Function Tests: General cross-platform use: CityHash64 on a 64 bit system, xxHash32 on a 32 bit system; Best throughput on large data depends on platform; Best for short strings: FNV-1a.
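For a sense of why FNV-1a does well on short strings, here is a minimal 32-bit version in Python: the whole algorithm is one XOR and one multiply per byte, with no setup or finalization cost. The constants are the standard published FNV 32-bit offset basis and prime; the function name `fnv1a_32` is just a label for this sketch, which is not taken from the linked benchmark.

```python
FNV32_OFFSET_BASIS = 0x811C9DC5  # standard 32-bit FNV offset basis
FNV32_PRIME = 0x01000193         # standard 32-bit FNV prime


def fnv1a_32(data: bytes) -> int:
    """32-bit FNV-1a: XOR each byte in, then multiply by the prime."""
    h = FNV32_OFFSET_BASIS
    for byte in data:
        h ^= byte
        h = (h * FNV32_PRIME) & 0xFFFFFFFF  # keep the result in 32 bits
    return h


if __name__ == "__main__":
    print(hex(fnv1a_32(b"")))   # 0x811c9dc5 (empty input returns the offset basis)
    print(hex(fnv1a_32(b"a")))  # 0xe40c292c
```

The byte-at-a-time loop is also why it falls behind on large inputs, where hashes that consume several bytes per step have the advantage.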

Ligra (GitHub): a lightweight graph processing framework for shared memory. It is particularly suited for implementing parallel graph traversal algorithms where only a subset of the vertices are processed in an iteration.

The Lynx Queue: new single producer / single consumer (SP/SC) software queue that we developed for frequent inter-core communication. It’s faster than existing implementations and we call it Lynx...Lynx modifies this state-of-the-art multi-section queue by adding in what we call synchronisation red zones...When a thread tries to access one of them, the TLB catches the access and traps into the operating system, which raises a segmentation fault. We define our own signal handler that is then called and it identifies which red zone has been accessed and deals with it appropriately, including all of the synchronisation required for allowing threads to move between queue sections and preventing them from falling off the end of the queue by rotating their pointer back to the start.

fabian-z/distkv: a distributed K/V store library for Go powered by the raft consensus algorithm.

An Optimal Bloom Filter Replacement: Our main result is a new RAM data structure that improves Bloom filters in several ways.
