Stuff The Internet Says On Scalability For January 8th, 2016

High Scalability

08 Jan 2016 — 11 min read

Hey, it's HighScalability time:

Finally, a clear diagram of Amazon's industry impact. (MARK A. GARLICK)

If you like this Stuff then please consider supporting me on Patreon.

150: # of globular clusters in the Milky Way; 800 million: Facebook Messenger users; 180,000: high-res images of the past; 1 exaflops: 1 million trillion floating-point operations per second; 10%: of Google's traffic is now IPv6; 100 milliseconds: time it takes to remember; 35: percent of all US Internet traffic used by Netflix; 125 million: hours of content delivered each day by Netflix's CDN;

Quotable Quotes:
- Erik DeBenedictis: We could build an exascale computer today, but we might need a nuclear reactor to power it
- wstrange: What I really wish the cloud providers would do is reduce network egress costs. They seem insanely expensive when compared to dedicated servers.
- rachellaw: What's fascinating is the bot-bandwagon is mirroring the early app market.
  With apps, you downloaded things to do things. With bots, you integrate them into things, so they'll do it for you.
- erichocean: The situation we're in today with RAM is pretty much the identical situation with the disks of yore.
- @bernardgolden: @Netflix will spend 2X what HBO does on programming in 2016? That's an amazing stat.
- @saschasegan: Huawei's new LTE modem has 18 LTE bands. Qualcomm's dominance of LTE is really ending this year.
- Unruly Places: The rise of placelessness, on top of the sense that the whole planet is now minutely known and surveilled, has given this dissatisfaction a radical edge, creating an appetite to find places that are off the map and that are somehow secret, or at least have the power to surprise us.
- @mjpt777: Queues are everywhere. Recognise them, make them first class, model and monitor them for telemetry.
- Guido de Croon: the robot exploits the impending instability of its control system to perceive distances. This could be used to determine when to switch off its propellers during landing, for instance.
- @gaberivera: In the future, all major policy questions will be settled by Twitter debates between venture capitalists
- Craig McLuckie: It’s not obvious until you start to actually try to run massive numbers of services that you experience an incredible productivity that containers bring
- Brian Kirsch: One of the biggest things when you look at the benefits of container-based virtualization is its ability to squeeze more and more things onto a single piece of hardware for cost savings. While that is good for budgets, it is excessively horrible when things go bad.
- @RichardWarburto: It still surprises me that configuration is most popular user of strong consistency models atm. Is config more important than data
- @jamesurquhart: Five years ago I predicted CFO would stop complaining about up front cost, and start asking to reduce monthly bill. Seeing that happen now.
- @martinkl: Communities in a nutshell… • Databases research: “In fsync we trust” • Distributed systems research: “In majority vote we trust”
- @BoingBoing: Tax havens hold $7.6 trillion; 8% of world's total wealth
- @DrQz: Amazon's actual profits are still tiny, relying heavily on its AWS cloud business.
- hadagribble: we need to view fast storage as something other than disk behind a block interface and slow memory, especially with all the different flavours of fast persistent storage that seem to be on the horizon. For the ones that attach to the memory bus, the PMFS-style [1] approach of treating them like a file-system for discoverability and then mmapping to allow them to be accessed as memory is pretty attractive.

EC2 with a 5% price reduction on certain things in certain places. Not exactly the race to the bottom one would hope for in a commodity market, which means the cloud is not a commodity. Happy New Year – EC2 Price Reduction (C4, M4, and R3 Instances).

Since the locus of the Internet is centering on a command line interface in the form of messaging, chatbot integrations may be giving APIs a second life, assuming they are let inside the walled garden. The next big thing in computing is called 'ChatOps,' and it's already happening inside Slack. The advantage chatops has over the old Web + API mashup dream is that messaging platforms come built-in with a business model/app store, large and growing user base, and network effects. Facebook’s Secret Chat SDK Lets Developers Build Messenger Bots. Slack apps. WeChat API. Telegram API. Alexa API. Google's Voice Actions. How about Siri or iMessage? Nope. njovin likes it: I've worked with the new Chat SDK and our customers' use cases aren't geared toward forcing (or even encouraging) users into using Facebook Messenger. Most of them are just trying to meet demand from their customers. In our particular case, we have customers with a lot of international travelers who have access to data while abroad but not necessarily SMS. IMO it's a lot better than having a dedicated app you have to download to interact with a specific brand.

The world watched a lot of porn this year. If you like analytics you'll love Pornhub’s 2015 Year in Review: In 2015 alone, we streamed 75GB of data a second; bandwidth used is 1,892 petabytes; 4,392,486,580 hours of video were watched; 21.2 billion visits.

A very interesting way to frame the issue. On the dangers of a blockchain monoculture: The Bitcoin blockchain: the world’s worst database. Would you use a database with these features? Uses approximately the same amount of electricity as could power an average American household for a day per transaction. Supports 3 transactions / second across a global network with millions of CPUs/purpose-built ASICs. Takes over 10 minutes to “commit” a transaction. Doesn’t acknowledge accepted writes: requires you read your writes, but at any given time you may be on a blockchain fork, meaning your write might not actually make it into the “winning” fork of the blockchain (and no, just making it into the mempool doesn’t count). In other words: “blockchain technology” cannot by definition tell you if a given write is ever accepted/committed except by reading it out of the blockchain itself (and even then). Can only be used as a transaction ledger denominated in a single currency, or to store/timestamp a maximum of 80 bytes per transaction. But it’s decentralized!

If you are looking for a well informed discussion on distributed architectures for the future then the folks at IPFS (InterPlanetary File System), a new hypermedia distribution protocol, are your people. This is great: Aggregation --> CRDTs discussion.

Non-volatile Storage Implications of the Datacenter's Shifting Center: The arrival of high-speed, non-volatile storage devices, typically referred to as Storage Class Memories (SCM), is likely the most significant architectural change that datacenter and software designers will face in the foreseeable future... The age-old assumption that I/O is slow and computation is fast is no longer true... The relative performance of layers in systems has changed by a factor of a thousand times over a very short time... Piles of existing enterprise datacenter infrastructure—hardware and software—are about to become useless... Our sense is that this emerging set of nonvolatile memories is initially resulting in software systems that are far less efficient than the disk-based systems that they are replacing.

Leo Laporte recounts the awesome internet connectivity he had on his cruise ship. What they did is dedicate a satellite to track the cruise ship wherever it went. o3bnetworks says they have latencies of less than 150 milliseconds from medium earth orbit satellites. So how about each of us having our own private satellite up in space? The satellites can route calls between themselves with no outside interface at all. With these new tiny satellites it might be possible? Maybe it could even use a Quantum Internet.

You might be interested in the Microservices Practitioner Summit in San Francisco on 1/27. The event will feature microservices devs from Uber, Netflix, Yelp, and a few others.

Some helpful advice from bostik: If you have a workload that is light on CPU but needier on RAM, m4.large is cheaper than m3.large; m3.medium may not do as it has less than 4GB. Also, in our experience t2.* instances can't sustain any reasonable network traffic, but they do work wonderfully for lightweight RESTful systems. So any workload which serves mostly cached data and needs 6-7 GB per node is best off with m4.large. At least for Ireland, in our experience a single m4.large can keep up with bursts of ~120Mbps and sustain around 65Mbps. We have two as edge nodes for one of our public services, and will probably add a third one soon. Cutoff point for sustained bandwidth is slightly above 70Mbps, after that it starts to stutter. The t2.* instances choke and throttle bandwidth way earlier.

There's not much available on how Slack works. Here's a high-level diagram of their architecture on AWS. And a tweet: Java messaging server + LAMP for core app/APIs + JS client + MacGap + native mobile (Objective-C for iOS/Java for Android). And for chat: Java messaging server and LAMP for core app/APIs. The Java messaging server is custom built.

Ben Stopford with a bit of contrarian wisdom. Does In-Memory Really Make Sense? So memory optimised is good. Memory optimised is fast. But the downsides of the hard limit imposed by pure in-memory solutions is often not worth the operational burden, especially when disk backed solutions, provided ample memory to use for caching, perform equally well for all but the most specialised, data intensive use cases.

Videos for LambdaConf 2015 are now available.

Isn't this how flocks of birds work? How Drones May Avoid Collisions by Sharing Knowledge: The Stanford researchers found that drones could make the quickest decisions when they were paired with the closest other drone, and the two solely considered the other’s behavior. The slowest response occurred when drones considered their own surroundings and then fed their results into a central system that sent decisions back to the entire group. Decision time always increased as more drones entered the simulation, but the system was always able to make a decision on rerouting a drone within 50 milliseconds.

Good explanation of Using CTEs and Unions to Compute Running Totals.
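For a feel of the technique, here is a minimal sketch of a running total built from a recursive CTE with UNION ALL. It is not the linked article's query: the `payments` table, its columns, and the `running_totals` helper are hypothetical names, and it assumes ids are contiguous starting at 1. SQLite is used only because it ships with Python.

```python
import sqlite3


def running_totals(amounts):
    """Return (id, amount, running_total) rows computed by a recursive CTE."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical table: ids are assigned 1, 2, 3, ... in insertion order.
    conn.execute("CREATE TABLE payments (id INTEGER PRIMARY KEY, amount INTEGER)")
    conn.executemany(
        "INSERT INTO payments (id, amount) VALUES (?, ?)",
        list(enumerate(amounts, start=1)),
    )
    rows = conn.execute(
        """
        WITH RECURSIVE totals(id, amount, running_total) AS (
            -- Anchor member: the first row starts the total.
            SELECT id, amount, amount FROM payments WHERE id = 1
            UNION ALL
            -- Recursive member: add the next row to the previous total.
            SELECT p.id, p.amount, t.running_total + p.amount
            FROM payments AS p
            JOIN totals AS t ON p.id = t.id + 1
        )
        SELECT id, amount, running_total FROM totals ORDER BY id
        """
    ).fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    for row in running_totals([10, 20, 5]):
        print(row)
```

On databases with window functions, `SUM(amount) OVER (ORDER BY id)` is usually the simpler route; the CTE form is mainly useful where those are unavailable.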

How long does it take to make a context switch?: Context switching is expensive. My rule of thumb is that it'll cost you about 30µs of CPU overhead. This seems to be a good worst-case approximation. Applications that create too many threads that are constantly fighting for CPU time (such as Apache's HTTPd or many Java applications) can waste considerable amounts of CPU cycles just to switch back and forth between different threads. I think the sweet spot for optimal CPU use is to have the same number of worker threads as there are hardware threads, and write code in an asynchronous / non-blocking fashion.
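To put the 30µs rule of thumb in perspective, here is a rough back-of-the-envelope sketch. The function names are made up for illustration, and the switch rates are hypothetical inputs rather than measurements; only the 30µs figure comes from the quote.

```python
import os

CONTEXT_SWITCH_COST_US = 30.0  # rule-of-thumb worst case from the quote above


def switch_overhead_fraction(switches_per_second, cost_us=CONTEXT_SWITCH_COST_US):
    """Fraction of one core's time spent purely on context switches."""
    return switches_per_second * cost_us / 1_000_000


def suggested_worker_count():
    """One worker per hardware thread, falling back to 1 if the count is unknown."""
    return os.cpu_count() or 1


if __name__ == "__main__":
    for rate in (1_000, 10_000, 30_000):
        share = switch_overhead_fraction(rate)
        print(f"{rate} switches/s costs about {share:.0%} of one core")
    print("suggested worker threads:", suggested_worker_count())
```

At 10,000 switches per second the rule of thumb works out to 0.3 seconds of every second, or roughly 30% of a core, which is why matching worker threads to hardware threads is attractive.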

The Oculus Rift was supposed to be cheaper — so what happened? They made it better instead of cheaper. Best reason ever.

For an efficient means of monitoring an application without generating a lot of garbage or costing much to record, take a look at the system counters in Aeron and AeronStat.

802.11ah is a Bluetooth competitor that could be big for IoT: large scale sensor networks, extended range hotspot, and outdoor Wi-Fi for cellular traffic offloading. What's different is it is "a new PHY and MAC design that operates in the sub-one-gigahertz (900MHz) band... optimized from the ground up for extended range, power efficiency, and scalable operation...extends the range of Wi-Fi beyond the limited range of 2.4 and 5 GHz by leveraging the improved propagation and penetration of 900MHz radio waves through walls and obstructions... A single 11ah AP can provide whole home coverage. It can also support low cost battery powered sensors operating without a power amplifier... A 150 Kbps minimum data rate results in short on-time for sensors with short bursty data packets thus lowering their power consumption... MAC is also optimized to scale to thousands of nodes by using efficient paging and scheduled transmission." Given cost will have a big impact on adoption I was surprised to see no mention of it.

Great interview with Heroku co-founder Adam Wiggins on Building SaaS apps that scale, and building great teams: The Twelve-Factor App is a set of first principles for application development specifically designed to make it maintainable and especially deployable to a lot of different targets, especially cloud targets.

The Last Fighter Pilot: Hierlmeier, flanked by a pair of Lockheed Martin contractors and an Air Force PR person tapping her smartphone, leans on the cockpit and considers that future. “I don’t want to be the horse cavalry guy at the start of World War I,” he says. “I’m hoping we’ll see a day when man is not in the machine, in the jet, but man is in the loop. We’ve got to embrace that. I see a day when you’re driving into this dome, and you’re fighting the fight from right here.”

Getting rid of state machines. Not sure about this: "Why state machines? Why not processes or threads?" State machines are orthogonal to threads. Code executing in the context of a thread is governed by an implicit or explicit state machine. And a state machine can easily control work done on other threads using events.
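As a small illustration of that last point, here is a sketch of an explicit, table-driven state machine whose transitions are triggered by events posted from another thread. The states, events, and names (`TRANSITIONS`, `StateMachine`, `run_demo`) are all hypothetical and not taken from the linked article.

```python
import queue
import threading

# Hypothetical connection lifecycle: (current state, event) -> next state.
TRANSITIONS = {
    ("idle", "connect"): "connecting",
    ("connecting", "connected"): "open",
    ("connecting", "error"): "idle",
    ("open", "close"): "idle",
}


class StateMachine:
    def __init__(self, initial="idle"):
        self.state = initial

    def handle(self, event):
        """Apply one event, rejecting anything the table does not allow."""
        key = (self.state, event)
        if key not in TRANSITIONS:
            raise ValueError(f"event {event!r} not allowed in state {self.state!r}")
        self.state = TRANSITIONS[key]
        return self.state


def run_demo():
    events = queue.Queue()
    machine = StateMachine()
    history = [machine.state]

    def worker():
        # Work on another thread reports back only by posting an event.
        events.put("connected")

    history.append(machine.handle("connect"))
    thread = threading.Thread(target=worker)
    thread.start()
    history.append(machine.handle(events.get(timeout=5)))
    thread.join()
    history.append(machine.handle("close"))
    return history


if __name__ == "__main__":
    print(run_demo())
```

The machine itself knows nothing about threads; the queue is the only coupling, which is the sense in which the two ideas are independent.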

The future is distributing itself. Bloomberg Trades Static Clusters For Homegrown Mesos: The use of Mesos underneath the BVault service is still fairly new, being only a few months old, but Bloomberg is already seeing that allowing for the complex event processing, text search, and analytics and reporting workloads to share the cluster is cutting back on the amount of servers it needs to buy to support the expanding customer base for BVault; over time, Gupta hopes the number of machines needed to run BVault “will be significantly reduced.” How much, he cannot yet say. The static BVault cluster, on average, ran at 65 percent to 70 percent utilization, and Bloomberg thinks it will easily beat this on the Mesos setup.

This discussion of Disque vs RabbitMQ is a great example of what happens when people keep their cool and talk the tech. We all learn something. Nice job.

The pros and cons of Mesos and Kubernetes: There's a lot more to say about the two and there's no clear winner in the fight for the best cluster manager.

Good survey of Building Simple Recommender Systems for Elasticsearch. You can build a User-Item Recommender, an Item-Item Recommender, and integrate with a scalable machine learning framework like Mahout.

Transcend announces ‘SuperMLC’ as an SLC NAND alternative: What Transcend is doing is binning high-quality MLC, then treating it like SLC. In theory, this provides 4x the write performance and up to 30,000 program/erase cycles. That’s not identical to what the graph above illustrates, but it’s close enough for our purposes — specialized MLC-as-SLC can offer better performance and endurance than standard MLC, but at lower costs.

He liked it, he really liked it. Jepsen: RethinkDB 2.1.5: As far as I can ascertain, RethinkDB’s safety claims are accurate. You can lose updates if you write with anything less than majority, and see assorted read anomalies with single or outdated reads, but majority/majority appears linearizable.

Don't say the guberment never done nothin' for ya. DDoS Quick Guide.

It's not enough to have a lot of users; your interaction with the users must also be natively monetizable or having users doesn't pay off. Dare Obasanjo lays it out: Snapchat’s valuation is based on a single flawed assumption: What Reddit has found out the hard way is that their advertising doesn’t fit natively into their platform. Their ads often don’t match the form of the content and when it does, it doesn’t match user intent for what they want out of Reddit.

Some good answers to this question on Quora: I need create a database containing a log of operations in my system. What is the best option to storage 30 million rows and the select to be fast? The ELK stack seems to be the winner: Elasticsearch, Logstash, Kibana.

Zopfli Optimization: Literally Free Bandwidth: If you work on a project that serves compressed assets, take a close look at Zopfli. It's not a silver bullet – as with all advice, run the tests on your files and see – but it's about as close as it gets to literally free bandwidth in our line of work.
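If you want to run that test on your own files, a small harness along these lines may help. It is only a sketch: the defaults are standard zlib levels acting as stand-ins, not Zopfli, and the `compressors` parameter is a hypothetical hook where you would plug in whichever Zopfli binding or command-line wrapper you actually use.

```python
import zlib


def compressed_sizes(data, compressors=None):
    """Return {name: compressed size in bytes} for each compressor callable."""
    if compressors is None:
        # Stand-ins only: standard zlib at three levels, not Zopfli.
        compressors = {
            "zlib-1": lambda raw: zlib.compress(raw, 1),
            "zlib-6": lambda raw: zlib.compress(raw, 6),
            "zlib-9": lambda raw: zlib.compress(raw, 9),
        }
    return {name: len(compress(data)) for name, compress in compressors.items()}


def savings_percent(baseline_size, candidate_size):
    """Percentage of bytes saved by the candidate relative to the baseline."""
    return 100.0 * (baseline_size - candidate_size) / baseline_size


if __name__ == "__main__":
    sample = b"body { margin: 0; padding: 0; }\n" * 2000  # replace with a real asset
    sizes = compressed_sizes(sample)
    baseline = sizes["zlib-6"]
    for name, size in sizes.items():
        print(f"{name}: {size} bytes, {savings_percent(baseline, size):+.1f}% vs zlib-6")
```

Feed it real assets rather than synthetic samples, since the savings depend heavily on the content being compressed.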

Interesting, this is also how you can find bugs using a source code control system, revert changes and see what happens. The 6 Billion Letters Of Our Genome: Now we’ve recently seen how we can systematically delete genes to find out which are essential for life. From that we learned that only about 1600 (8%) of the nearly 19,000 human genes are truly essential.

Versionable, Branchable, and Mergeable Application State: We describe the design of a VC system named VERCAST that provides fine-grained control over the consistency model used in maintaining application state.
