
Stuff The Internet Says On Scalability For May 9th, 2014

High Scalability

09 May 2014 — 6 min read

Hey, it's HighScalability time:

NASA captures Guatemala volcano erupting from space

40,000 exabytes: from now until 2020, the digital universe will about double every two years; $650,000: amount raised by the MaydayPAC in one week.
Quotable Quotes:
- @BenedictEvans: Masayoshi Son: $20m initial investment in Alibaba, current stake worth $58bn.
- @iamdevloper: I sneezed earlier and Siri compiled it to valid Perl.
- @cdixon: "There is not enough competition in the last mile market to allow a true market to function"
- @PatrickMcFadin: Get ready for some serious server density. AMD is working on K12, brand-new x86 and ARM cores. This plus 8T SSD?

With age comes changing priorities. Facebook is now 10 and has grown up. They are no longer moving fast and breaking things. They are now into the stability thing. Letting developers know they are a stable platform. The play is to get all that beautiful data from developers by being the platform for the Internet. On which an ad platform is built like a castle protecting a river valley. Interesting that Twitter said No! to becoming a platform, turning away developers. What has happened to Twitter's growth? The thought processes that lead to such different conclusions about the future would be interesting to understand.

Better than a Tauntaun roasted over an open light saber. An ode to 17 databases in 33 minutes - RailsConf2014 by tobyhede. My favorite is "MySQL - The same as PostgreSQL but controlled by an evil overlord."

Well, when you explain it that way...why GNU grep is fast: GNU grep is fast because it AVOIDS LOOKING AT EVERY INPUT BYTE; GNU grep is fast because it EXECUTES VERY FEW INSTRUCTIONS FOR EACH BYTE that it *does* look at.
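The linked write-up credits much of this to Boyer-Moore style searching, which inspects the last byte of the search window first and uses a lookup table to decide how far ahead it can jump. As a rough illustration, here is a minimal Python sketch of the simpler Horspool variant. It is not GNU grep's actual implementation; the function name `horspool_search` and the comparison counter are assumptions added purely to show how many bytes get skipped.

```python
def horspool_search(haystack: bytes, needle: bytes) -> tuple[int, int]:
    """Return (index of the first match or -1, number of byte comparisons made)."""
    n, m = len(haystack), len(needle)
    if m == 0:
        return 0, 0

    # Shift table: how far the window may jump, keyed by the haystack byte
    # aligned with the window's last position. Bytes that never occur in the
    # needle (excluding its final byte) allow a full-length jump of m.
    shift = {b: m - 1 - i for i, b in enumerate(needle[:-1])}

    comparisons = 0
    pos = 0
    while pos + m <= n:
        # Compare right to left so a mismatch on the last byte costs one look.
        j = m - 1
        while j >= 0:
            comparisons += 1
            if haystack[pos + j] != needle[j]:
                break
            j -= 1
        if j < 0:
            return pos, comparisons
        pos += shift.get(haystack[pos + m - 1], m)
    return -1, comparisons


if __name__ == "__main__":
    text = b"x" * 1000 + b"scalability"
    index, looked_at = horspool_search(text, b"scalability")
    print(f"match at {index}, {looked_at} comparisons for {len(text)} input bytes")
```

The longer the pattern, the bigger the jumps, which is why the search can finish after touching only a fraction of the input.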

How Gilt's Insane Traffic Spikes Pushed It Off Rails To Scala. It's unusual for your expected traffic pattern to be a 100x spike once a day for 15 minutes, but that's the life of flash sales. Gilt started as a Rails app. That didn't scale. They later switched from Java to Scala because the Java system that replaced it became too monolithic. They also bought into Akka and the whole Reactive platform idea. The architecture is built in terms of hundreds of microservices. Microservices keep a wall between unrelated services, reduce complexity, and keep development friction-free.

Docker ported into Hadoop as benchmarks show SCREAMING FAST performance: "From an OpenStack Cloudy operational time perspective (boot, reboot, delete, snapshot, etc.) docker LXC outperformed KVM ranging from 1.09x (delete) to 49x (reboot)," Russell wrote. "Based on the compute node resource usage metrics during the serial VM packing test: Docker LXC CPU growth is approximately 26x lower than KVM. On this surface this indicates a 26x density potential increase from a CPU point of view using docker LXC vs a traditional hypervisor. Docker LXC memory growth is approximately 3x lower than KVM. On the surface this indicates a 3x density potential increase from a memory point of view using docker LXC vs a traditional hypervisor."

Evernote has quite the automated build system. Uses Jenkins and 31 machines to build for a bewildering matrix of operating systems, SDKs, and build tools. Man, it's a complex world these days.

Truly interesting insight into how Comcast is Building a large scale CDN with Apache Traffic Server by Jan van Doorn. Instead of buying someone else's system they are building their own and it looks like they are on a good path. Numbers are huge: ~250 servers, ~1.5Pb of data/day, ~5Pb of storage capacity.

In the olden days it was said when you have data in memory and you avoid page faults and cache stalls, CPU is basically free. Times haven't changed. Searching 20 GB/sec: Systems Engineering Before Algorithms: This article describes how we met that challenge using an “old school”, brute-force approach, by eliminating layers and avoiding complex data structures. There are lessons here that you can apply to your own engineering challenges. There's a lively discussion on HN.

What is it about Bitcoin's Block Chain that inspires such reverence? Ilya Grigorik writes an epic article trying to explain it all: Minimum Viable Block Chain: The combination of all of the above rules and infrastructure provides a decentralized, peer-to-peer block chain for achieving distributed consensus of ordering of signed transactions. That's a mouthful, I know, but it's also an ingenious solution to a very hard problem. The individual pieces of the block-chain (accounting, cryptography, networking, proof-of-work), are not new, but the emergent properties of the system when all of them are combined are pretty remarkable.

The stack DNSimple uses to run in five different datacenters: Monitoring: Sensu, Pingdom, NewRelic, HipChat, PagerDuty; Logging: Papertrail, LogEntries; Metrics: NewRelic, Librato; Documentation: GitHub; Deployment: Chef.

Another epic article, this time on why we can't give great Internet. Observations of an Internet Middleman is written by Mark Taylor in a clean and unrelenting style. Step by step we learn directly from Level 3 that the reason your packets may be delayed or dropped as they transit the Internet is that some companies are just too cheap to add more bandwidth. Why? Because they don't have to.

socketcluster: SocketCluster is a WebSocket server cluster (with HTTP long-polling fallback) based on engine.io. 100k messages/second on an 8-core EC2 m3.2xlarge.

It's Haskell. In Space! Architecture of a Real World Haskell Application. Good discussion on HN.

Microsoft as a viable platform for startups. 4 Reasons Why we Love BizSpark. The BizSpark program saves some bucks. Lets you use the tech you want to use. Gives you a credit allowance on Azure. Good community. Good training. Good discussion on HN.

GeoMesa: Scaling up Geospatial Analysis: GeoMesa is an open-source, LocationTech project that manages big geo-time data within the Accumulo key-value data store so that those data can be indexed and queried at scale effectively.

A 1000 users upload a 700 kB picture every second, how many machines do I need to handle the load? 42. Some good discussion and thought process. And then there's the always entertaining snark.
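For a feel of the numbers behind that question, here is a back-of-envelope sketch in Python. It reads the question as 1,000 uploads per second of 700,000 bytes each. The per-machine figures (a 1 Gbit/s network interface, half of it usable for uploads) are purely hypothetical assumptions, and the result covers network ingest only, not CPU, disk, or redundancy.

```python
import math


def upload_estimate(uploads_per_sec: int, bytes_per_upload: int,
                    nic_bits_per_sec: float, usable_fraction: float) -> dict:
    """Estimate ingest bandwidth, daily volume, and a network-bound machine count."""
    bytes_per_sec = uploads_per_sec * bytes_per_upload
    bits_per_sec = bytes_per_sec * 8
    # Assumption: only part of each machine's link is available for uploads.
    usable_bits_per_machine = nic_bits_per_sec * usable_fraction
    return {
        "ingest_gbit_per_sec": bits_per_sec / 1e9,
        "stored_tb_per_day": bytes_per_sec * 86_400 / 1e12,
        "machines_for_network": math.ceil(bits_per_sec / usable_bits_per_machine),
    }


if __name__ == "__main__":
    # 1,000 uploads/s of 700 kB; hypothetical 1 Gbit/s NICs at 50% usable.
    print(upload_estimate(1_000, 700_000, 1e9, 0.5))
```

Under those assumptions the load works out to about 5.6 Gbit/s of ingest and roughly 60 TB of new data per day, so storage growth is likely to matter at least as much as the machine count.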

Sorry, it's about Go again. Hitting the million requests/second: But Go, by far, is the easiest multiprocessor enabled language to learn (and master within a few weeks!!!)...All requests are served within max. 8 ms, average 0.8 ms. This is unbeatable! The C++ framework, showing double performance, showed a 134 ms maximum latency.

Gene Kim with a cacophony of tweets from Monitorama Portland and Day 2. It's almost as if he was observing the whole thing from afar.

Nice write up. Scaling Static Sites to 2,000 Concurrent Connections (or How to Survive Hacker News): In this blog post, I will not claim to solve the C10k problem. Instead, I will show you how to support 2,000 concurrent connections on a static site by tuning Linux (Ubuntu 14.04) and Apache (2.4.7). Armed with a $20 a month DigitalOcean droplet, I was able to maintain 2,000 concurrent connections and 1,900 hits per second on this blog. Over an entire day, this performance translates to almost 165,000,000 total requests—more than enough to deal with seasonal load or a popular article on Hacker News.
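The daily total in that quote is easy to check, and Little's law (concurrency = arrival rate × average time in system) gives a rough feel for how long each connection stays open. A small Python sketch, assuming steady state and that one "hit" corresponds to one connection, neither of which the post states; the function names are illustrative:

```python
SECONDS_PER_DAY = 24 * 60 * 60


def requests_per_day(hits_per_sec: float) -> float:
    """Total requests served in a day at a constant hit rate."""
    return hits_per_sec * SECONDS_PER_DAY


def avg_seconds_per_connection(concurrent: float, arrivals_per_sec: float) -> float:
    """Little's law rearranged: W = L / lambda (steady state assumed)."""
    return concurrent / arrivals_per_sec


if __name__ == "__main__":
    print(requests_per_day(1_900))                              # 164160000
    print(round(avg_seconds_per_connection(2_000, 1_900), 2))   # 1.05
```

So 1,900 hits per second does come to about 164 million requests a day, and under these assumptions each connection would be held for roughly a second on average.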

Jepsen II: Linearizable Boogaloo. I'll reuse Bill Smith's excellent comment: This is a good talk on databases, the CAP theorem, and some techniques for validating (or invalidating) vendor claims about availability and consistency under network partition. The speaker has an intuition for where to look for problems. He has also developed some open source tools, Jepsen and Knossos, to help him.

Read and reread. Still an amazing classic by James Hamilton: On Designing and Deploying Internet-Scale Services: Reducing operations costs and improving service reliability for a high scale internet service starts with writing the service to be operations-friendly. In this document we define operations-friendly and summarize best practices in service design, development, deployment, and operation from engineers working on high-scale services.

Live by the Q, die by the Q. Problem with queuing servers: Our RabbitMQ had a big queue which was not emptying properly. This backfired on our application servers that were trying to reach the queueing servers for major or minor jobs with no luck. To mitigate the risk of affecting important events and notifications such as billing and registrations, we use multiple queues. The queue with the problems was the one sending minor notifications, such as when a user joins a team.

BareMetal-OS: BareMetal is a 64-bit OS for x86-64 based computers. The OS is written entirely in Assembly while applications can be written in Assembly or C/C++.

When does a physical system compute?: There has been, however, no consensus on how to tell if a given physical system is acting as a computer or not; leading to confusion over novel computational devices, and even claims that every physical event is a computation. In this paper we introduce a formal framework that can be used to determine whether or not a physical system is performing a computation. We demonstrate how the abstract computational level interacts with the physical device level, drawing the comparison with the use of mathematical models to represent physical objects in experimental science.

Greg Linden with More Quick Links, asking a question close to my stomach, why don't more companies provide free food? Larger question, should companies separate themselves from their community?
