
Stuff The Internet Says On Scalability For November 13th, 2015

High Scalability

13 Nov 2015 — 11 min read

Hey, it's HighScalability time:

Gorgeous picture of where microbes live in species. Humans have the most. (M. WARDEH ET AL)

14.3 billion: Alibaba single day sales; 1.55 billion: Facebook monthly active users; 6 billion: Snapchat video views per day; unlimited: now defined as 300 GB by Comcast; 80km: circumference of China's proposed supercollider; 500: alien worlds visualized; 50: future sensors per acre on farms; 1 million: Instagram requests per second.

Quotable Quotes:
- Adam Savage~ Lesson learned: do not test fire rockets indoors.
- dave_sullivan: I'm going to say something unpopular, but horizontally-scaled deep learning is overkill for most applications. Can anyone here present a use case where they have personally needed horizontal scaling because a Titan X couldn't fit what they were trying to do?
- @bcantrill: Question I've been posing at #KubeCon: are we near Peak Confusion in the container space? Consensus: no -- confusion still accelerating!
- @PeterGleick: When I was born, CO2 levels were ~300 ppm. This week may be the last time anyone alive will see less than 400 ppm.
- @patio11: "So I'm clear on this: our business is to employ people who can't actually do worthwhile work, train them up, then hand to competition?"
- Settlement-Size: This finding reveals that incipient forms of hierarchical settlement structure may have preceded socioeconomic complexity in human societies
- wingolog: for a project to be technically cohesive, it needs to be socially cohesive as well; anything else is magical thinking.
- @mjpt777: Damn! @toddlmontgomery has got Aeron C++ IPC to go at over 30m msg/sec. Java is struggling to keep up.
- Tim O'Reilly: While technological unemployment is a real phenomenon, I think it's far more important to look at the financial incentives we've put in place for companies to cut workers and the cost of labor. If you're a public company whose management compensation is tied to your stock price, it's easy to make short term decisions that are good for your pocketbook but bad long term for both the company and for society as a whole.
- @RichardDawkins: Evolution is "Descent with modification". Languages, computers and fashions evolve. Solar systems, mountains and embryos don't. They develop
- @Grady_Booch: Dispatches from a programmer in the year 2065: "How do you expect me to fit 'Hello, World' into only a terabyte of memory?" via Joe Marasco
- @huntchr: I find #Zookeeper to be the Achilles Heel of a few otherwise interesting projects e.g. #kafka, #mesos.
- Robert Scoble~ Facebook Live was bringing 10x more viewers than Twitter/Periscope
- cryptoz: I've always wondered about this. Presumably the people leading big oil companies are not dumb idiots; so why wouldn't they take this knowledge and prepare in advance? Exxon could be the leading global provider of renewable energy right now, set to dominate the industry for a century or more. But instead, they are a crumbling company leading a death march of society and their own bank accounts. Why? Why would a big company do this?

Waze is using data from sources you may not expect. Robert Scoble: How about Waze? I witnessed an accident one day on the highway near my house. Two lane road. The map turned red within 30 seconds of the accident. How did that happen? Well, it turns out cell phone companies (Verizon, in particular, in the United States) gather real time data from cell phones. Your phone knows how fast it’s going. In fact, today, Waze shows you that it knows. Verizon sells that data (anonymized) to Google, which then uses that data to put the red line on your map.

If email had been done right in the early days, we wouldn't need half the social networks or messaging apps we have today. Almost everything we see is a reimplementation of email. Gmail, We Need To Talk.

Don Norman and Bruce Tognazzini, prophets from Apple's time in the wilderness, don't much like the new religion. They stand before the temple shaking fists at blasphemy. How Apple Is Giving Design A Bad Name: Apple is destroying design. Worse, it is revitalizing the old belief that design is only about making things look pretty. No, not so! Design is a way of thinking, of determining people’s true, underlying needs, and then delivering products and services that help them. Design combines an understanding of people, technology, society, and business.

There's a new vision of the Internet out there and it's built around the idea of Named Data Networking (NDN). It's an evolution from today’s host-centric network architecture (IP) to a data-centric network architecture. Luminaries like Van Jacobson like the idea. Packet Pushers has good coverage in Show 262 – Future of Networking – Dave Ward. Dave Ward is the CTO of Engineering and Chief Architect at Cisco. My preference: make the pipes dumb, fast, and secure. Everything else is emergent.

Are flash interest groups built around brands a good thing? Airbnb and Uber Mobilize Vast User Base to Sway Policy.

At the highest level, technology companies are really recruiting companies in disguise. Alphabet’s, Facebook’s Best Engineers: The Challenge of Scaling the Biz, Per Bernstein: Alphabet’s best engineers are more likely to leave the company when its stock does not perform well, leading us to believe that management care about the stock price...We believe that talent retention may have been part of the reason why Amazon decided to disclose more information about AWS starting last January...as companies scale up the number of engineers, it is challenging for them to maintain the average quality...companies that do not become one of these large scale players (which some might call “platform companies”), and focus on narrower markets, such as PayPal or Netflix, will be always subject to the competitive threat from the giants.

Here's how Instagram scaled their infrastructure to multiple datacenters. Motivations: Resilience to regional issues; Flexible capacity expansion. The key: distinguish global data and local data. Global data needs to be replicated across data centers, while local data can be different for each region (for example, the async jobs created by web server would only be viewed in that region). The result: we recently survived a staged “disaster.” Facebook regularly tests its data centers by shutting them down during peak hours.

Should you stand up your own BigData cluster or pay by the query? It depends. This discussion was sparked by @rbranson: $20/mo to store 1TB in BigQuery. $5 to query it. once. damn.
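To make the "it depends" a little more concrete, here is a back-of-the-envelope sketch in Python. The $20 and $5 figures come from the tweet; the cluster price is a made-up placeholder, and the model assumes every query scans the whole dataset, which real workloads can often avoid.

```python
import math

# Prices quoted in the tweet, in dollars.
STORAGE_PER_TB_MONTH = 20.0
QUERY_PER_TB_SCANNED = 5.0


def pay_per_query_monthly_cost(tb_stored, full_scans_per_month):
    """Monthly bill if every query scans the full dataset (a simplifying assumption)."""
    storage = tb_stored * STORAGE_PER_TB_MONTH
    queries = tb_stored * full_scans_per_month * QUERY_PER_TB_SCANNED
    return storage + queries


def break_even_scans(tb_stored, cluster_monthly_cost):
    """Smallest number of full scans per month at which pay-per-query
    costs at least as much as a fixed-price cluster."""
    storage = tb_stored * STORAGE_PER_TB_MONTH
    if cluster_monthly_cost <= storage:
        return 0
    per_scan = tb_stored * QUERY_PER_TB_SCANNED
    return math.ceil((cluster_monthly_cost - storage) / per_scan)


if __name__ == "__main__":
    # Hypothetical: your own cluster costs $1,000/month all-in.
    print(pay_per_query_monthly_cost(1, 1))
    print(break_even_scans(1, 1000.0))
```

With these assumed numbers, a 1TB dataset has to be fully scanned 196 times a month before the hypothetical $1,000 cluster wins. Below that, paying by the query is cheaper.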

Google may have released their TensorFlow framework, but there are still a few things you'll need before you can go AI Native. You'll need a gigantically huge corpus of user generated training data, the actual ML models, and a stable of PhDs to make sense of it all. Benchmark TensorFlow. TensorFlow-Examples. Jeff Dean explains TensorFlow. Some items that caught Delip Rao's eye from the TensorFlow paper: Apache 2.0 license; Google quality code; Reasonable docs; Out of the box multiple device and distributed execution; Fancy placement algorithms for multi-node scheduling; Autograds (like Theano and Torch); Fault tolerance & checkpointing; Finer grained control on concurrency; Support for multiple devices (mobile to GPU arrays) and multiple language interop; Fancypants optimizations of the computational graph; TensorBoard tool to visualize computation graphs and monitor network parameters during training.

Pete Warden talks about pushing intelligence out to the edge, to the level of the individual sensor in Semantic Sensors and bespoke frameworks, so neural networks can run on low-power embedded devices. Google already does some of this with their compact voice recognition models that run on smart phones.

Gil Tene with wisdom on combining HFTs and microservices: Regardless of your choice of transport, the main tension point between using micro-services and minimizing latency has to do with the number of handoffs involved in whatever end-to-end path you'll be taking. The most effective means for cutting down end-to-end latencies tends to be minimizing the overall number of "hops" involved. Most systems end up using the minimal number of hops actually required by their business needs (e.g. protocol gateways, persistence/journaling/replication, and load bearing/distribution needs), keeping as much of each logic step as possible to a single process and thread. Parallelizing actually parallelizable paths is usually secondary to this first "keep number of hops small" step.
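A toy model of the hop argument, sketched in Python. This is not from the quoted discussion: the stage timings and the per-handoff cost are invented for illustration, and the model simply assumes stages run in series with a fixed cost for each handoff between them.

```python
def end_to_end_latency_us(stage_work_us, handoff_us):
    """Total latency for stages run in series: the sum of the work done in
    each stage plus one handoff between every adjacent pair of stages."""
    hops = max(len(stage_work_us) - 1, 0)
    return sum(stage_work_us) + hops * handoff_us


if __name__ == "__main__":
    HANDOFF_US = 50  # hypothetical cost of one cross-process handoff

    # The same 60 microseconds of work, split two different ways.
    six_services = [10, 10, 10, 10, 10, 10]
    two_processes = [30, 30]

    print(end_to_end_latency_us(six_services, HANDOFF_US))
    print(end_to_end_latency_us(two_processes, HANDOFF_US))
```

Under these assumptions the work is identical in both layouts, so the whole difference in latency comes from the number of handoffs. That is why trimming hops tends to come before parallelizing.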

Google may not be able to compete with AWS on the number or variety of services offered, but there's no doubt it can compete on Speed, Scaling And Authentication.

Using finite state machines as a data structure. Index 1,600,000,000 Keys with Automata and Rust. Epic article on using finite state machines to compactly represent ordered sets and maps of strings that can be searched very quickly.
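To give a feel for the idea, here is a small Python sketch. It is not the article's Rust implementation or its construction algorithm. It builds a plain trie (shared prefixes), then merges equivalent states bottom-up so shared suffixes are stored once too. All names are made up for illustration.

```python
class State:
    """One automaton state: outgoing edges by character, plus a final flag."""
    __slots__ = ("edges", "final")

    def __init__(self):
        self.edges = {}
        self.final = False


def build_trie(keys):
    """Build a trie, which is an automaton that shares prefixes only."""
    root = State()
    for key in keys:
        node = root
        for ch in key:
            node = node.edges.setdefault(ch, State())
        node.final = True
    return root


def minimize(node, registry):
    """Merge equivalent states bottom-up so common suffixes are shared."""
    for ch, child in list(node.edges.items()):
        node.edges[ch] = minimize(child, registry)
    # Two states are equivalent if they agree on finality and on edges
    # leading to the same (already canonical) child states.
    signature = (
        node.final,
        tuple(sorted((ch, id(child)) for ch, child in node.edges.items())),
    )
    return registry.setdefault(signature, node)


def contains(root, key):
    """Membership test: follow one edge per character."""
    node = root
    for ch in key:
        node = node.edges.get(ch)
        if node is None:
            return False
    return node.final


def iter_keys(node, prefix=""):
    """Yield every key in sorted order."""
    if node.final:
        yield prefix
    for ch in sorted(node.edges):
        yield from iter_keys(node.edges[ch], prefix + ch)


def count_states(root):
    """Count distinct states reachable from the root."""
    seen, stack = set(), [root]
    while stack:
        node = stack.pop()
        if id(node) not in seen:
            seen.add(id(node))
            stack.extend(node.edges.values())
    return len(seen)


if __name__ == "__main__":
    trie = build_trie(["mop", "pop", "top"])
    print(count_states(trie))
    automaton = minimize(trie, {})
    print(count_states(automaton))
    print(contains(automaton, "top"), list(iter_keys(automaton)))
```

For the three keys above, the trie needs 10 states and the minimized automaton needs 4, because the "op" suffix is stored once. Lookups still cost one step per character.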

Under the Radar is a new developer podcast you might be interested in. The hosts are David Smith and Marco Arment. The first episode has a good discussion of the money-making opportunities on iOS. A lot of programmers are writing apps now, so economically speaking it's obviously going to be harder to make a living. If you are not the kind of person who can survive in an ultracompetitive, low-priced market then maybe you should do something else? David Smith, developer of Pedometer++, takes a different approach. He makes simpler apps that are quicker to produce. David says his most consistent revenue source is iAds, which is surprising and interesting. Don't overvalue your own work is the advice. A lot of apps are useful, but not valuable to people, so ads are a reasonable approach to making money. There's an opportunity with the Apple TV right now, but that opportunity may vanish next Christmas as more apps flood into the store. Like the Apple Watch, Apple TV may be a value add to an app rather than a money maker in the long run, for people who need a little push to buy an app. Let me interject that there's an interesting parallel with the Amazon book store, where the advice is to make more, smaller publications in the hopes that one will hit. Spending two years on a book or an app that may not pay off may not be the best strategy. Life in the commodity content pits is a struggle.

What is the next scale ecosystem? Mobile is already done. Mobile, ecosystems and the death of PCs: One of the ways that tech progresses is in generational changes in scale. We had mainframes, then minicomputers, then workstations and PCs, and now mobile, and each generation brings a step change in scale. That scale means that it becomes the new ecosystem and the new centre for innovation. iOS and Android smartphones alone are now outselling PCs 5:1, not even counting tablets, and that will rise to closer to 10:1 in the next few years. So, this is the new scale ecosystem.

Looking like the storage phalanx of Palpatine's Red Guard, Backblaze breaks down The Hardware Inside B2 Cloud Storage – Storage Pod 5.0: "The entire system, fully populated with 180TB worth of hard drives, costs Backblaze just over $0.044 per GB, and that includes assembly costs (labor)." In a notable twist Backblaze built their new pod using an Agile + Scrum methodology and were quite happy with the results.

Brent Ozar answers: Is My SQL Server Too Big for Virtualization or the Cloud?: start with the lowest risk, easiest-to-manage servers first. Learn your lessons on smaller servers, then gradually use the technology on larger and larger servers.

There's a very energetic background conversation going on about securing bitcoin's future by figuring out how to scale bitcoin. Here's a weekly Scaling Bitcoin thread that might be of interest.

Comparing Message Queue Architectures on AWS: If you are light on DevOps and not latency sensitive use SQS for job management and Kinesis for event stream processing. If latency is an issue, use ELB or 2 RabbitMQs (or 2 beanstalkds) for job management and Redis for event stream processing.

Videos from RICON 2015 - San Francisco are becoming available. Lots of good stuff. Basho brings the big brains to think about distributed systems.

This is cool. bitdrones: interactive flying microbots show future of virtual reality is physical.

The impact of Docker containers on the performance of genomic pipelines: Docker containers have only a minor impact on the performance of common genomic pipelines, which is negligible when the executed jobs are long in terms of computational time.

An Updated Performance Comparison of Virtual Machines and Linux Containers: Our results show that containers result in equal or better performance than VM in almost all cases. Both VMs and containers require tuning to support I/O-intensive applications. We also discuss the implications of our performance results for future cloud architecture.

Peter Garritano focuses on interior datacenter landscapes with this uncommon picture series.

OK, I have to admit my first thought about this was vampires sucking away the heat. IBM is trying to solve all of computing’s scaling issues with 5D electronic blood.

League of Legends requires millisecond latencies, but the Internet isn't designed for that. FIXING THE INTERNET FOR REAL TIME APPLICATIONS: PART I: The internet is not a single unified system, but rather a conglomeration of multiple entities...Backbone providers and ISPs route traffic to the lowest cost path, not the lowest latency path...A direct trip may have taken 14ms, but the less efficient route takes a full 70ms. That’s a brutal 500% increase...In a game of LoL, however, where the future can’t be buffered since it hasn’t yet been played, those five seconds are utterly unacceptable...If a router gets more packets than it can handle, it’s forced to simply drop them...when the actual route processors of internet routers begin to become overwhelmed by traffic, many simply start ignoring UDP packets...With all this in mind, consider how the circuitous routing resulting from BGP results in many more routers being involved in a single trip.

Videos are available from Twitter Flight 2015.

You might be interested in the Microservices Weekly newsletter.

Varnish has an antidote to cache poisoning. Junk Junk Junk

A look back at the State of the Cloud, and a few new predictions for 2015: And finally, for me, the most interesting new technology in the last year is AWS Lambda. It is a highly secure, event-driven computing model that creates a new container to process each event...I think that eventually this model will become a best practice to protect the most critical data, and as data centers keep getting hacked, more people will realize that state-of-the-art, highly-secure systems should be built using cloud.

This is why we can't have nice things. Telize closed down because it was being used by malware and ransomware.

Murat with a good trip report on the High Performance Transaction Systems (HPTS) workshop.

A pretty good list. 102 performance engineering questions every software development team should ask.

Microkernels Meet Recursive Virtual Machines: This paper describes a novel approach to providing modular and extensible operating system functionality and encapsulated environments based on a synthesis of microkernel and virtual machine concepts. Fluke.

Here are IBM's 11 tips for scaling servers in the cloud: Set up load balancing; Keep different environments looking the same; Use stateless servers; Stop your servers often; Zero in on bottlenecks; Run background tasks; Cache what you can; Set up autoscaling; Involve your whole team; Make time for testing; Consider containers.

UsenetDHT: A Low-Overhead Design for Usenet: Usenet is a popular distributed messaging and file sharing service: servers in Usenet flood articles over an overlay network to fully replicate articles across all servers. However, replication of Usenet's full content requires that each server pay the cost of receiving (and storing) over 1 Tbyte/day. This paper presents the design and implementation of UsenetDHT, a Usenet system that allows a set of cooperating sites to keep a shared, distributed copy of Usenet articles.

Leaf: an open source framework for machine intelligence, sharing concepts from TensorFlow and Caffe.

THERMAL-JOIN: A Scalable Spatial Join for Dynamic Workloads: a novel spatial self-join algorithm for dynamic memory-resident workloads. The algorithm groups objects in spatial proximity together into hot spots.

libmill: a library that introduces Go-style concurrency to C. It can execute up to 20 million coroutines and 50 million context switches per second.
