Stuff The Internet Says On Scalability For January 29th, 2016

Hey, it's HighScalability time:


This is a trace of a Google search query. A single query might touch a couple thousand machines. If you like this Stuff then please consider supporting me on Patreon.

  • 88: the too short life of Marvin Minsky; $18.4 billion: profit made by Apple in 3 months; 100M: hours of video watched on Facebook each day; 1.59 billion: Facebook users; $115B: size of game market by 2020; 12 years: Mars rover still going strong; 96.3m: barrels of oil produced per day; 570 Billion: object brighter than the Sun; 134 pounds: carried by drones;  $2.4 billion: AWS Q4 sales; 2.5 million: advertisers on the Facebook;

  • Quotable Quotes:
    • @ptaoussanis: Real-world scaling 101: be in the habit of routinely, objectively asking what parts of your system could stand to be simplified or removed
    • @Carnage4Life: Azure revenue up 140%. Search revenue from #BingAds up 21%. Microsoft is killing it in the cloud
    • @gabriel_boya: Scaling up a Cloud Service on @azure takes so many hours that your customers may be gone by the time your instances are allocated...
    • AJ007: Facebook is the only platform that lets advertisers target a mass audience with very fine demographic precision. Google you lose the demographics. Television, you lose the precision.
    • Junaid Anwar: It is to be noted that clustering [node.js] yielded two times the performance as compared to the non-clustering case which shows that performance linearly increases with processing cores when clustering is used.
    • crash41301: Our company has been slowly shrinking the hundreds of services we have down to a handful of larger, automated tested services and the dev team (about 50) likes it much more.
    • @swardley: Compute is the activity, Architecture is the practice
    • van lessen: Self-Contained Systems (SCS) describe an architectural approach to build software systems, e.g. to split monoliths into multiple functionally separated, yet largely autonomous web applications. 
    • R. P. Feynman: What is the cause of management's fantastic faith in the machinery?
    • Steven Max Patterson: Facebook filters much from the raw newstream and gives me what it thinks I want with about 20% accuracy.
    • Brandon Butterworth~ a single mega data centre might simply represent a single, large potential point of failure
    • boggzPit: Damn it Facebook. Why did I ever believe you could handle being cool to developers?
    • Vadim Tkachenko: To recap an interesting point in that post: when using 48 cores with the server, the result was worse than with 12 cores. I wanted to understand the reason it was true, so I started digging. My primary suspicion was that Java (I never trust Java) was not good dealing with 100GB of memory.
    • Seth Lloyd: Our algorithm shows that you don't need a big quantum computer to kick some serious topological butt...You could find the topology of simple structures on a very simple quantum computer. 
    • Robert Scoble: When he was doing his thesis 20 years ago, it took him two years to analyze just 24 hours of data from farms (he pulls in data from satellites, Doppler radar and even drones). Today, his company does the same thing in seconds.
    • @jgrahamc: Devotees of microservices use 'monolith' as a derogatory term; wait 10 years and we'll be using 'spider's web' as a derogatory term.
    • @mweagle: I see your femtoservice, and pivot with a single source code point: “yoctoservice” :) #disrupt #unicorn #M&A
    • milesrout: The entire point of Docker is that you use it for everything. It's a universal application image format. That is the point. It's contained, secure, and childproof. That is the point. It's not just about scalability. If I could use a desktop operating system where all programs ran as docker containers, I'd do that too. That's what they're for.
    • Bill Wash: I will never pass up an opportunity to help out a colleague, and I’ll remember the days before I knew everything.
    • @CarlHasselskog:  my startup handles ~10 million uploaded files/day with two employees in total (entire company). That's largely thanks to you guys.
    • AJ Kohn: December saw more negative numbers with a 6.96% decrease, year over year, in desktop search volume. Every month in 2015 had lower desktop query volume than the same month in 2014. Every. Month.
    • Jerry Chen: Every startup has a different size unit of value. Bigger is not better, smaller is not better.
    • sacundim: No, the goal of normalization is to eliminate logical inconsistencies—data sets that entail two or more different answers to the same question. 
    • Jake Archibald: Streams can be used to do fun things like turn clouds to butts, transcode MPEG to GIF, but most importantly, they can be combined with service workers to become the fastest way to serve content.
    • Solomon Hykes: Computers do run only one unikernel at a time. It’s just that sometimes they are virtual computers. Remember that virtualization is increasingly hardware-assisted, and the software parts are mature. So for many use cases it’s reasonable to separate concerns and just assume that VMs are just a special type of computer.

  • Relying on a tool backed by a big company is no protection. Facebook is closing down Parse. This is a stunner because Parse was a popular and well-made service, used by millions of now adrift mobile apps. What happened? This might be it: "Facebook also would have had to invest untold millions of dollars in capital and, more importantly, engineering talent, to get the Parse business fully off the ground to have a better chance at making a dent in competitors like Amazon, Microsoft and Google." How about Firebase? The Firebase founder responds: "We're not going anywhere. What makes us different? Firebase is very complementary to Google's other product offerings. Cloud for one, as well as Angular, Polymer, GCM, etc." The moral of the story is told by bsaul: "parse wasn't a core service for facebook, nor a relevant source of a revenue AND their API wasn't standard. Those points combined made it very risky for people to use it."

  • The Internet will soon be eating a lot of Brotli, Google's new lossless compression algorithm that is making the Internet 17-25% faster. Support will be in Chrome and other browsers, but server side support may take longer. Why does it only work with https? Richard Coles: one reason why this is limited to https is to stop it being mangled by proxies, which has been a practical problem in the past with encodings.

  • Young Skynet is continuing its dastardly plan of self-creation by seeding deep learning both far and wide. Microsoft Open Sources Deep Learning, AI Toolkit On GitHub. Twitter released Distributed learning in Torch. Teach Yourself Deep Learning with TensorFlow and Udacity.

  • While the Super Bowl will make a mess of local traffic, it's great for cell phone service. Verizon spent $70 million to triple Bay Area LTE capacity ahead of the Super Bowl. They have more than tripled 4G LTE network capacity; built 16 new area cell sites; installed 75 small cells; boosted capacity by adding 37 XLTE to existing sites; and completed preparations to deploy 14 mobile cell sites in high traffic locations.

  • Netflix is now live in more than 190 countries. If you wondered what Netflix was doing with all that software, this is what they were doing: making an infrastructure capable of handling a worldwide rollout overnight. That's the power of zero distribution costs.

  • This takes the whole you are the product to a different level. "No Cost" License Plate Readers Are Turning Texas Police into Mobile Debt Collectors and Data Miners. What amazed the Devil most was how cheaply people sold their souls.

  • Betting on apps that assume the high density of urban areas sounds like a good bet. Michael Bloomberg: "One hundred years ago, some two out of every ten people on the planet lived in urban areas. By 1990, some four in ten did. Today, more than half of the world’s population dwells in urban areas, and by the time a child now entering primary school turns 40, nearly 70 percent will. That means that in the next few decades, about 2.5 billion more people will become metropolitan residents." And The Rise of Global Startup Cities: This uneven or spiky nature of investment and its flow to great cities marks a broader transition away from sprawling suburban campuses, or “nerdistans.” In recent years, innovation and entrepreneurship have returned to the great global cities and dense, diverse urban areas that have long served as fonts of creativity and invention.

  • Pat Helland on why Immutability Changes Everything: Designs are driving toward immutability, which is needed to coordinate at ever increasing distances. Given space to store data for a long time, immutability is affordable. Versioning provides a changing view, while the underlying data is expressed with new contents bound to a unique identifier. Copy-on-Write...Clean Replication...Immutable Data Sets...Parallelism and Fault Tolerance.

  • You think you are dog fooding your app? The Neurologist Who Hacked His Brain—And Almost Lost His Mind: In 2014, Phil Kennedy hired a neurosurgeon in Belize to implant several electrodes in his brain and then insert a set of electronic components beneath his scalp. Back at home, Kennedy used this system to record his own brain signals in a months-long battery of experiments. His goal: Crack the neural code of human speech. 

  • SDN Internet Router – Part 1 and Part 2. Spotify starts off with an excellent overview of how the Internet works. Then they perform an experiment: "The hypothesis we wanted to prove was that by analysing our traffic patterns we could lower the amount of prefixes up to the point where we could fit them into a switch. If we could prove it we could use a switch instead of a router and save a lot of money." Result: SIR [SDN Internet Router] did not only enable us to peer in several locations without having to spend money on very expensive equipment, we got also other benefits. The API that SIR provides has proven to be useful for figuring out where to send users to in order to improve latency, where and who to peer with and improve our global routing. It also gave us some vendor independence.
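
The hypothesis in that quote can be sketched in a few lines: rank prefixes by observed traffic and keep only as many as fit in the switch's forwarding table. This is a minimal sketch, not SIR's actual code. The function name `select_prefixes`, the sample byte counts, and the table size are all hypothetical, and the fallback to a default route is an assumption.

```python
from typing import Dict, List, Tuple


def select_prefixes(traffic_bytes: Dict[str, int],
                    table_size: int) -> Tuple[List[str], float]:
    """Keep the busiest prefixes that fit in a table of `table_size` entries.

    Returns the chosen prefixes and the fraction of total traffic they cover.
    Traffic to any other prefix is assumed to fall back to a default route.
    """
    ranked = sorted(traffic_bytes.items(), key=lambda kv: kv[1], reverse=True)
    kept = ranked[:table_size]
    total = sum(traffic_bytes.values())
    covered = sum(b for _, b in kept) / total if total else 0.0
    return [p for p, _ in kept], covered


# Hypothetical per-prefix byte counts, e.g. aggregated from flow data.
sample = {
    "203.0.113.0/24": 700,
    "198.51.100.0/24": 200,
    "192.0.2.0/24": 90,
    "192.0.2.128/25": 10,
}
prefixes, coverage = select_prefixes(sample, table_size=2)
print(prefixes, f"{coverage:.0%}")
```

If a small number of prefixes covers nearly all of the traffic, the remaining long tail can share a default route, which is the property the experiment set out to test.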

  • A good Recap: Docker at SCALE 14x.

  • Your drone may not need a GPS to find its way and drop cargo. Army Testing Robo-Parachutes That Don’t Need GPS. It could navigate by image recognition.

  • Here's how Viget is planning to handle the spike load of The Puppy Bowl Fantasy Draft. Make the users' computers do the work: render on the client. Don't do the same work twice: cache responses. Use professional data infrastructure services: they are using Firebase.
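
The "don't do the same work twice" advice can be illustrated with a tiny time-based response cache. This is a sketch only, not Viget's implementation. `TTLCache`, `render_leaderboard`, the `/leaderboard` key, and the 30 second lifetime are hypothetical names and values chosen for the example.

```python
import time
from typing import Any, Callable, Dict, Tuple


class TTLCache:
    """Tiny time-based response cache (illustrative only)."""

    def __init__(self, ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock
        self._store: Dict[str, Tuple[float, Any]] = {}

    def get_or_compute(self, key: str, compute: Callable[[], Any]) -> Any:
        now = self.clock()
        hit = self._store.get(key)
        if hit is not None and now - hit[0] < self.ttl:
            return hit[1]            # fresh entry: reuse the earlier response
        value = compute()            # missing or stale: do the work once
        self._store[key] = (now, value)
        return value


calls = 0


def render_leaderboard() -> str:
    """Stand-in for an expensive response; counts how often it runs."""
    global calls
    calls += 1
    return "leaderboard-v%d" % calls


cache = TTLCache(ttl_seconds=30)
first = cache.get_or_compute("/leaderboard", render_leaderboard)
second = cache.get_or_compute("/leaderboard", render_leaderboard)
print(first, second, calls)
```

During a traffic spike, most requests inside the lifetime window reuse the stored response instead of recomputing it.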

  • Here's how Apple synchronized all your Apple Watches. It syncs using NTP, as you might imagine. Apple has its own network of 15 Stratum One Network Time Servers around the world. The watch has a temperature-controlled crystal oscillator to compensate for the drift caused by extreme temperatures. What you may not suspect is that the watch keeps time four times more accurately than the iPhone. And how they test the accuracy seems so very Appley: Apple actually tests that accuracy with high-speed cameras that watch, frame-by-frame, as the Apple Watch second hand moves around.

  • Packet Pushers on SDN evolution. Networking has been consumed into software on the edge. In the early days of server virtualization it seemed like it would be a zero sum game, but it turned out the more VMs the more servers had to be bought. There are a lot of similarities between network and compute virtualization. Network virtualization is about as general as compute virtualization. What we are seeing is a lot more technology being consumed and used in a lot more interesting ways. A lot of net value is being created. Server virtualization removed a lot of friction, not in just one place, but dozens of places. It made deployment easier, for example. In network virtualization, if you don't have to manipulate the physical routing table in devices to make a change to the network, then the whole financial math around operations changes completely. Friction in networking is very real; there's a cost to every change.

  • Is it really still a market if there are no humans? High-Speed Firms Now Oversee Almost All Stocks at NYSE Floor.

  • The future is hard to make. 3D XPoint Steps Into the Light: It could take 12-18 months to get XPoint into mass production...3D XPoint uses as many as 100 new materials, raising supply chain issues...The unique vertical designs of XPoint and 3D NAND require more machines running process steps...The extra gear could drive 3-5x increases in capital expenses and space needed. Is it worth waiting for? XPoint chips can deliver more than 95,000 I/O operations per second at a 9 microsecond latency, compared to 13,400 IOPS and 73 ms latency for flash...A version of XPoint in DIMMs will enable up to 6 TBytes main memory in a two-socket Xeon server at about half the cost of DRAM.

  • If you are looking for a relatively concise explanation of bitcoin's scaling problems then The Governance of Anarchists :: Blockchain Letter, January 2016 is an excellent read. As they say: The governance of anarchists is more difficult than it sounds.

  • sacundim: I've grown disappointed with the word "immutability." People have overfixated on the word at the expense of the concepts. Let me lay out a range of concepts here: Mutable reference cells: "Variables" in the imperative sense, which support a "write" operation that overwrites the old value. Log-based systems: Systems that support a "write" operation that appends to the existing data. See:Kafka. Reactive systems: Systems that are built around some concept of read-only changing values. The key concept is functional transformations of immutable value streams (with operations like map, reduce, join, etc.). See: Spark Streaming, Flink, Samza, Storm. Immutable data structures: Data types that represent pure, unchanging values. If you want to "change" something you create a whole new thing anew.
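
The four concepts in that list can be put side by side in a few lines of Python. This is an illustrative sketch only. The account example, the `Account` class, and the variable names are hypothetical and do not come from the quoted comment.

```python
from dataclasses import dataclass, replace
from functools import reduce

# 1. Mutable reference cell: a write overwrites the old value.
balance = 100
balance = 80                      # the earlier value 100 is gone

# 2. Log-based system: a write appends; earlier entries are never modified.
log = []
log.append(("deposit", 100))
log.append(("withdraw", 20))

# 3. Reactive style: read-only values derived from the event stream with
#    functional transformations (map, reduce).
deltas = map(lambda e: e[1] if e[0] == "deposit" else -e[1], log)
derived_balance = reduce(lambda acc, d: acc + d, deltas, 0)


# 4. Immutable data structure: "changing" it builds a whole new value.
@dataclass(frozen=True)
class Account:
    owner: str
    balance: int


a1 = Account("ada", 100)
a2 = replace(a1, balance=80)      # a1 is left untouched

print(balance, log, derived_balance, a1, a2)
```

Only the first case loses history. The log, the derived value, and the frozen dataclass all keep the old data reachable, which is the distinction the comment argues gets lost under the single word "immutability."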

  • If you are thinking about using PHP for your next project here's a good thread on reddit about 12 Reasons to Choose PHP for Developing Website in 2016. A good discussion of both the pros and the cons. 

  • Adventures in High Speed Networking on Azure: So far, we believe this more than validates that we are not compromising performance by running in Azure, using Windows or managed C# to get both the best performance and also the most developer productivity...6.8M requests per second (6,807,542). Latency of 92.ms (avg), 113.81 ms (std-dev), 1.27s (max), 50% < 43ms, 90% < 288ms. 

  • Akamai has a technology called Giga, a replacement for TCP, that can move data on average 30 percent faster than TCP. Upgrade to Core Internet Protocol Can Boost Speeds 30 Percent. Tests in India, China, and Bolivia showed improvements of more than 150 percent. Replacing TCP is a tall order, but if there's anyone that could benefit from a faster protocol it's Akamai, and that makes a difference.

  • How I ended up paying $150 for a single 60GB download from Amazon Glacier. Good take away by Joe: My point should be fairly obvious. The “low low prices” are for very specific use cases, designed specifically to pull you in, and make it expensive for you to leave.

  • Class has started. And it doesn't get better than this epic post by Tyler Akidau, a staff software engineer at Google: The world beyond batch: Streaming 102. Just incredible coverage of a complex topic. Now you'll need to know about Apache Beam: an open source, unified model and set of language-specific SDKs for defining and executing data processing workflows, and also data ingestion and integration flows, supporting Enterprise Integration Patterns (EIPs) and Domain Specific Languages (DSLs).

  • Don't let your babies grow up to be publishers. franze sings why: ok, just to get this straight if a publisher want to do everything right in todays internet, they need * a responsive website * with views for desktop, mobile and tablet * optimized for search, social and conversion * optional: augmented with schema.org * an iphone app (one or more) * an android app * optional: tablet/ipad app * facebook channel * twitter channel * youtube channel * pinterest presence * whatsapp presence * snapchat presence * one or more newsletter * constant A/B testing now add * facebook instant articles * google amp pages

  • Sort Faster With FPGAs: we can sort in linear time, i.e. a running time of O(N). There’s a catch, though: to achieve linear time, we’ll need to build some custom hardware to help us out. In this post, I’ll unfold the problem of sorting in parallel, and then I’ll take us through a linear-time solution that we can synthesize at home on an FPGA.
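
One common way to get linear-time sorting out of parallel hardware is a chain of N compare-and-shift cells that absorbs one input per clock tick. The sketch below is a software model of that general idea and is assumed here for illustration. It is not taken from the linked post, and `hardware_style_sort` is a hypothetical name. In Python the per-tick loop over cells runs sequentially, so this model is still O(N^2). The O(N) figure applies only when every cell acts simultaneously in hardware.

```python
from typing import List, Optional


def hardware_style_sort(values: List[int]) -> List[Optional[int]]:
    """Model a chain of N sorting cells, one input value per clock tick.

    On each tick the incoming value is broadcast to every cell. Each cell
    looks only at its own value and its left neighbour's value, then decides
    to keep its value, take the incoming value, or take the neighbour's value.
    """
    n = len(values)
    cells: List[Optional[int]] = [None] * n       # None marks an empty cell
    for incoming in values:                        # one clock tick per input
        prev = list(cells)                         # cell state before the tick
        # A cell is displaced if it is empty or holds a larger value.
        displaced = [c is None or incoming < c for c in prev]
        for i in range(n):                         # parallel in real hardware
            if not displaced[i]:
                continue                           # keep the current value
            if i == 0 or not displaced[i - 1]:
                cells[i] = incoming                # first displaced cell
            else:
                cells[i] = prev[i - 1]             # shift right by one cell
    return cells


print(hardware_style_sort([5, 2, 9, 2, 7]))
```

After N ticks the cells hold the inputs in ascending order, so the hardware cost grows with the number of cells rather than with the number of comparison rounds.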

  • How Far Can You Push ejabberd? 1 Node — 2+ Million Concurrent Users. The test used 50 connections/second and thus 550 logins per second. Tsung was used to drive the load. The node was a single instance type m4.10xlarge (40 vCPU, 160 GiB). The traffic was handled with a memory footprint of 28KB per online user. The 40 CPUs were almost evenly used. The hardest part? Tuning the Linux system. Keep in mind ejabberd is also used by WhatsApp.
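
A quick back-of-the-envelope check shows why those figures fit on one node. The calculation assumes exactly 2 million users (the post says 2+ million) and treats "KB" as 1,024 bytes. Both are assumptions made for this sketch.

```python
# Figures quoted above; the unit interpretation is an assumption.
USERS = 2_000_000
BYTES_PER_USER = 28 * 1024        # 28KB per online user
NODE_RAM_GIB = 160                # m4.10xlarge memory

session_gib = USERS * BYTES_PER_USER / 2**30
share_of_ram = session_gib / NODE_RAM_GIB
print(f"{session_gib:.1f} GiB for sessions, {share_of_ram:.0%} of node RAM")
```

Under these assumptions, per-user session state takes roughly a third of the node's memory.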

  • Brendan Gregg lists a lot of Broken Linux Performance Tools 2016.

  • Brad Hein on The Rising Sophistication of Network Scanning: In this article I would like to show you a hidden system that is hard at work scanning thousands, maybe millions, of unsuspecting devices. And I'll show how this system efficiently harvests each device's personal IP address and hands it off to a scanner, which proceeds to run a port/security scan against each unsuspecting victim for vulnerabilities.

  • The Epic Fail of Hollywood's Hottest Algorithm. Fascinating story of using algorithm washing to scam Hollywood. It apparently thought Paranoia and Out of the Furnace would be hits.

  • It's a world of signals. Bluetooth and Wi-Fi sensing from mobile devices may help improve bus service: "Let's say you have a Husky game or Seahawks game and you want to know how much demand changes so you can offer the right level of bus service for this special event," says study senior author Yinhai Wang. "If you can gather enough data from these real-time sensing systems, that's going to offer very valuable information."

  • How Zano Raised Millions on Kickstarter and Left Most Backers with Nothing. A very long meditation on why it's hard to build stuff that works, even if you have a lot of money. A very good lesson: avoid relying exclusively on the technical abilities of a single person.

  • Nice example of Building a serverless anagram solver with AWS (DynamoDB, Lambda, S3, CloudFront and API gateway)

  • Goad: an AWS Lambda powered, highly distributed, load testing tool built in Go.

  • Gizmo: A Microservice Toolkit from The New York Times.

  • greta.io: This is our implementation of Bimodal Multicast over webRTC, which we use to broadcast information in our peer-to-peer networks. The Gossip Protocol allows for a message to be broadcasted from one peer to the rest of the peers that it is connected to, after which those peers start to gossip. 
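
The way a gossiped message spreads can be simulated in a few lines. This sketch models only the push-gossip step described above: each informed peer forwards the message to a few randomly chosen connected peers per round. It is not greta.io's code and does not implement full Bimodal Multicast. The `gossip` function, the `fanout` parameter, and the 8-peer mesh are hypothetical.

```python
import random
from typing import Dict, List, Set


def gossip(peers: Dict[str, List[str]], origin: str, fanout: int,
           rounds: int, rng: random.Random) -> List[int]:
    """Return the number of informed peers after each round (index 0 = start)."""
    informed: Set[str] = {origin}
    history = [len(informed)]
    for _ in range(rounds):
        newly: Set[str] = set()
        for peer in sorted(informed):              # sorted for reproducibility
            neighbours = peers[peer]
            k = min(fanout, len(neighbours))
            newly.update(rng.sample(neighbours, k))  # forward to k random peers
        informed |= newly
        history.append(len(informed))
    return history


# Hypothetical fully connected mesh of 8 peers.
names = [f"p{i}" for i in range(8)]
mesh = {n: [m for m in names if m != n] for n in names}
print(gossip(mesh, "p0", fanout=2, rounds=5, rng=random.Random(42)))
```

With a small fanout, the informed set typically grows quickly in the first rounds and then levels off as most forwards land on peers that already have the message.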

  • OneOps: a cloud management and application lifecycle management platform that developers can use to both develop and launch new products faster, and more easily maintain them throughout their entire lifecycle. (from WalMart)

  • Existential consistency: measuring and understanding consistency at Facebook: our analysis shows that 0.0004% of reads to vertices would return different results in a linearizable system. This in turn gives insight into the benefits of stronger consistency; 0.0004% of reads are potential anomalies that a linearizable system would prevent. We directly study local consistency models---i.e., those we can analyze using requests to a sample of objects---and use the relationships between models to infer bounds on the others.

  • Yesquel: scalable SQL storage for web applications: Yesquel has a new architecture and a new distributed data structure, called YDBT, which Yesquel uses for storage, and which performs well under contention by many concurrent clients. We evaluate Yesquel and find that Yesquel performs almost as well as Redis---a popular NoSQL system---and much better than MySQL Cluster, while handling SQL queries at scale.

  • Software defined batteries: we present a new hardware-software system, called Software Defined Battery (SDB), which allows system designers to integrate batteries of different chemistries. SDB exposes APIs to the operating system which control the amount of charge flowing in and out of each battery, enabling it to dynamically trade one battery property for another depending on application and/or user needs.

  • Fast in-memory transaction processing using RDMA and HTM: We present DrTM, a fast in-memory transaction processing system that exploits advanced hardware features (i.e., RDMA and HTM) to improve latency and throughput by over one order of magnitude compared to state-of-the-art distributed transaction systems. Evaluation using typical OLTP workloads including TPC-C and SmallBank show that DrTM scales well on a 6-node cluster and achieves over 5.52 and 138 million transactions per second for TPC-C and SmallBank respectively. This number outperforms a state-of-the-art distributed transaction system (namely Calvin) by at least 17.9X for TPC-C.

  • Greg Linden is back with another set of Quick Links.