Stuff The Internet Says On Scalability For April 11th, 2014

Hey, it's HighScalability time:

(Image: Daly and Newton/Getty Images)


DNA nanobots deliver drugs in living cockroaches which have as much compute power as a Commodore 64

  • 40,000: # of people it takes to colonize a star system; 600,000: servers vulnerable to Heartbleed
  • Quotable Quotes:
    • @laurencetratt: High frequency traders paid $300M to reduce New York <-> Chicago network transfers from 17 -> 13ms. 
    • @talios: People read http://highscalability.com  for sexual arousal - jim webber #cm14
    • @viktorklang: LOL RT @giltene 2nd QOTD: @mjpt777 “Some people say ‘Thread affinity is for sissies.’ Those people don’t make money.”
    • @pbailis: Reminder: eventual consistency is orthogonal to durability and data loss as long as you correctly resolve update conflicts.
    • @codinghorror: Converting a low-volume educational Discourse instance from Heroku at ~$152.50/month to Digital Ocean at $10/month.
    • @FrancescoC: Scary post on kids who can't tell the diff. between atomicity & eventual consistency architecting bitcoin exchanges 
    • @jboner: "Protocols are a lot harder to get right than APIs, and most people can't get APIs right" -  @daveathomas at #reactconf
    • @vitaliyk: “Redundancy is ambiguous because it seems like a waste if nothing unusual happens. Except that something unusual happens—usually.” @nntaleb
    • Blazes: Asynchrony * partial failure is hard.
    • David Rosenthal: I have long thought that the fundamental challenge facing system architects is to build systems that fail gradually, progressively, and slowly enough for remedial action to be effective, all the while emitting alarming noises to attract attention to impending collapse. 
    • Brian Wilson: Moral of the story: design for failure and buy the cheapest components you can. :-)

  • Just damn. DNA nanobots deliver drugs in living cockroaches: Levner and his colleagues at Bar Ilan University in Ramat-Gan, Israel, made the nanobots by exploiting the binding properties of DNA. When it meets a certain kind of protein, DNA unravels into two complementary strands. By creating particular sequences, the strands can be made to unravel on contact with specific molecules – say, those on a diseased cell. When the molecule unravels, out drops the package wrapped inside.

  • Remember those studies where a gorilla walks through the middle of a basketball game and most people don't notice? Inattentional blindness. A thousand eyeballs don't guarantee anything will be seen. That's human nature. Heartbleed -- another horrible, horrible, open-source FAIL.

  • Remember the Content Addressable Web? Your kids won't. The mobile web vs apps is another front on the battle between open and closed systems.

  • In Public Cloud Instance Pricing Wars - Detailed Context and Analysis Adrian Cockcroft takes a deep stab at making sense of the recent price cuts by Google, Amazon, and Microsoft. AWS users should migrate to the new m3, r3, c3 instances; AWS and Google instance prices are essentially the same for similar specs; Microsoft doesn't have the latest Intel CPUs and isn't pricing against comparably spec'ed machines; IBM Softlayer pricing is still higher; Moore's law dictates price curves going forward.

  • Seth Lloyd: Quantum Machine Learning - QM algorithms are a win because they give exponential speedups on BigData problems. Because a wave can be in two places at once, the mathematical structure of QM is that the states of quantum systems are in fact vectors in high-dimensional vector spaces. The kind of transformations that happen when particles of light bounce off CDs, for example, are linear transformations on these high-dimensional vector spaces. Quantum computing is the effort to exploit quantum systems so that these linear transformations perform the kind of calculations we want to perform. Or something like that.
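
    To make "states are vectors, gates are linear transformations" concrete, here's a tiny numpy sketch (mine, not Lloyd's):

    ```python
    import numpy as np

    # A qubit's state is a unit vector in a 2-dimensional complex vector space.
    ket0 = np.array([1, 0], dtype=complex)

    # The Hadamard gate is a linear transformation (a unitary matrix) that puts
    # |0> into an equal superposition of |0> and |1> -- "two places at once."
    H = np.array([[1, 1],
                  [1, -1]], dtype=complex) / np.sqrt(2)

    print(H @ ket0)  # [0.707+0j  0.707+0j]

    # n qubits live in a 2**n-dimensional space; that exponential room is
    # where the hoped-for speedups on big-data linear algebra come from.
    two_qubits = np.kron(H @ ket0, H @ ket0)
    print(two_qubits.shape)  # (4,)
    ```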

  • Good explanation of how to go beyond the cheap VPS level and build a real infrastructure using Docker: CoreOS and Nuxeo: How We Built nuxeo.io. We are starting to see more and more of these kinds of articles on Docker, which means it's starting to exit the Innovator phase and enter the Early Adopter phase.

  • Another emerging trend is using Apache Mesos as a complex backend job scheduling mechanism. It's being used by Twitter, HubSpot, and Airbnb. For a flavor of how it works they have a case studies page, which, while obviously not an objective source of information, is useful for seeing how you might take the next step of treating your compute resources as an exploitable aggregate.

  • At the core of every backend is a messaging infrastructure. Here's how Twitter does it. Netty at Twitter with Finagle: Finagle is our fault tolerant, protocol-agnostic RPC framework built atop Netty. Twitter’s core services are built on Finagle, from backends serving user profile information, Tweets, and timelines to front end API endpoints handling HTTP requests.

  • Wonderful story of how NoSQL brought down the crypto-currency star. NoSQL Meets Bitcoin and Brings Down Two Exchanges: The Story of Flexcoin and Poloniex: Bitcoin coincided with a particularly dark time in distributed systems when people, armed with an incorrect interpretation of the CAP Theorem, thought that they just had to give up on consistency in their databases, that no one could build distributed data stores that provided strong guarantees. 
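
    The bug class behind these failures is worth seeing in miniature. A sketch (mine, not from the post) of a non-atomic read-modify-write on a balance, next to the atomic check-and-decrement that fixes it:

    ```python
    import threading

    balances = {"attacker": 100}
    lock = threading.Lock()

    def withdraw_racy(account, amount):
        # Two concurrent withdrawals can both pass this check before either
        # writes -- the read-modify-write race that drained the exchanges.
        if balances[account] >= amount:
            balances[account] = balances[account] - amount
            return True
        return False

    def withdraw_safe(account, amount):
        # The fix: make check-and-decrement atomic (a transaction, a CAS, or,
        # in this toy, a lock) so the invariant "balance >= 0" always holds.
        with lock:
            if balances[account] >= amount:
                balances[account] -= amount
                return True
            return False

    threads = [threading.Thread(target=withdraw_racy, args=("attacker", 100))
               for _ in range(10)]
    for t in threads: t.start()
    for t in threads: t.join()
    print(balances["attacker"])  # can go negative with the racy version
    ```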

  • Looks interesting. Tackle Distribution, High Throughput and Low-Latency with Orleans – A “cloud native” Runtime Built for #Azure: Orleans is a new cloud programming model that was designed for use in the cloud, and that has been used extensively in Microsoft Azure. What Orleans brings to the table is a change in the way we think about Cloud Services as a whole. We can stop thinking about Role instances and all the goop that needs to be built around them in order for us to build our solution. We can start thinking in terms of Actors.
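
    Orleans itself is .NET, but the mental shift is the actor model: single-threaded entities that own their state and talk only by messages. A toy sketch of that idea in Python (not Orleans' actual API):

    ```python
    import queue
    import threading

    class Actor:
        """A toy actor: private state, a mailbox, and one thread that
        processes messages sequentially -- so the state needs no locks."""
        def __init__(self):
            self.mailbox = queue.Queue()
            threading.Thread(target=self._run, daemon=True).start()

        def send(self, msg):
            self.mailbox.put(msg)

        def _run(self):
            while True:
                self.receive(self.mailbox.get())
                self.mailbox.task_done()

    class Counter(Actor):
        def __init__(self):
            super().__init__()
            self.count = 0  # touched by exactly one thread, ever

        def receive(self, msg):
            if msg == "inc":
                self.count += 1

    c = Counter()
    for _ in range(1000):
        c.send("inc")
    c.mailbox.join()
    print(c.count)  # 1000, with no locks in user code
    ```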

  • Looks like storage costs are not going to keep falling as fast as they have in the past. What Could Possibly Go Wrong?: Expectations for future storage technologies and costs were built up during three decades of extremely rapid cost per byte decrease. We are now 4 years into a period of much slower cost decrease, but expectations remain unchanged. Some haven't noticed the change, some believe it is temporary and the industry will return to the good old days of 40%/yr Kryder rates. Industry insiders are projecting no more than 20%/yr rates for the rest of the decade. Technological and market forces make it likely that, as usual, they are being optimistic. Lower Kryder rates greatly increase both the cost of long-term storage and the uncertainty in estimating it. The idea that archived data can live on long-latency, low-bandwidth media no longer holds. Future archival storage architectures must deliver adequate performance to sustain data-mining as well as low cost. Bundling computation into the storage medium is the way to do this.
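
    A quick back-of-the-envelope shows why the Kryder rate matters so much (my arithmetic, not Rosenthal's):

    ```python
    # Relative $/byte after n years at a given annual Kryder rate.
    def cost_after(years, rate, start=1.0):
        return start * (1 - rate) ** years

    for rate in (0.40, 0.20):
        tail = cost_after(10, rate)
        print(f"{rate:.0%}/yr: {tail:.1%} of today's cost after a decade")
    # 40%/yr leaves ~0.6% of today's cost; 20%/yr leaves ~10.7%. The endowment
    # needed to keep data "forever" scales with that tail, which is why halving
    # the Kryder rate blows up long-term storage budgets.
    ```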

  • LinkedIn on Garbage Collection Optimization for High-Throughput and Low-Latency Java Applications: With these options, our application's 99.9th percentile latency reduced to 60 ms while providing a throughput of thousands of read requests.

  • Alan Kay's Reading List. A not unexpected variety of good books.

  • Awesomely detailed look at Apple's Grand Central Dispatch In-Depth: Part 1/2 by Derek Selander. Typical kind of threading stuff, but seeing how to use threading and queues in Apple's world is quite helpful.  

  • As is often the case, the downtime was caused by an upgrade and a resource limitation that caused requests to bounce. Hosted Enterprise Chef Search API Downtime

  • It can be done and mostly works. Pub-Sub messaging with Zookeeper: Since broadcasting messages means creating a zNode for every consumer which is subscribed to the topic, the performance of the publisher degrades linearly with the number of topic subscribers. In a not very accurate test conducted on my local development machine, a single publisher managed to push around 1000 messages per second to a topic with a single subscriber. When increasing the number of subscribers to 10 the message rate went down to ~250/sec. With 20 subscribers it was ~125/sec.
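
    The design is simple enough to sketch with the kazoo client; the paths and layout here are my guesses, not the author's code:

    ```python
    from kazoo.client import KazooClient

    zk = KazooClient(hosts="127.0.0.1:2181")
    zk.start()
    zk.ensure_path("/pubsub/subscribers")
    zk.ensure_path("/pubsub/queues/consumer-1")

    # Publisher: broadcasting means one sequential zNode per subscriber per
    # message -- which is why throughput degrades linearly with subscribers.
    def publish(message: bytes):
        for sub in zk.get_children("/pubsub/subscribers"):
            zk.create(f"/pubsub/queues/{sub}/msg-", message,
                      sequence=True, makepath=True)

    # Subscriber: watch your own queue, process in sequence order, delete.
    @zk.ChildrenWatch("/pubsub/queues/consumer-1")
    def on_messages(children):
        for child in sorted(children):
            path = f"/pubsub/queues/consumer-1/{child}"
            data, _ = zk.get(path)
            print("got", data)
            zk.delete(path)
    ```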

  • Long and sometimes interesting comment thread on Six programming paradigms that will change how you think about coding. I had no idea when I was programming in Forth I was programming in a concatenative language!
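
    For anyone else who didn't know the term: a concatenative program is a composition of words, each mapping stack to stack, so juxtaposing words composes functions. A toy imitation in Python:

    ```python
    # Each "word" maps stack -> stack; a program is the concatenation of words.
    dup = lambda s: s + [s[-1]]
    mul = lambda s: s[:-2] + [s[-2] * s[-1]]

    def concat(*words):
        def program(stack):
            for w in words:
                stack = w(stack)
            return stack
        return program

    square = concat(dup, mul)  # Forth's classic ": square dup * ;"
    print(square([7]))         # [49]
    ```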

  • Ivan Pepelnjak - SHOULD WE USE REDUNDANT SUPERVISORS?: Now imagine you replace two humongous core switches with a spine layer having 4 or 8 fixed or modular switches. All of a sudden losing a spine switch doesn’t hurt that much. Welcome to the wonderful world of proper network design ;)

  • Greg Ferro with a Response: Rate-limiting State and Internet Frailty – ACM. Smart at the edge and dumb in the middle, my favorite kind of cookie.

  • Speaking of combining CPU with storage. A3Cube’s cluster architecture: The data plane is a dedicated internode network based on a traffic coprocessor – the RONNIEE Express Fabric – designed to provide low latency and high bandwidth for up to 64,000 nodes, without expensive switches, in a sophisticated mesh torus topology. User traffic remains on a front-end network.

  • qq66: Yes. Facebook's "cost of revenue" (which they state is mostly infrastructure) was $1.875 billion in 2013, a year when they made $1.5 billion in net income. For comparison, research and development was $1.4 billion. Facebook's business model involves getting 1 billion people to post a ton of stuff inside Facebook, costing them about $2/user/year in infrastructure, $3.50/user/year in other costs, and making about $7/user/year in advertising revenue, yielding about $1.50 in profit. So cutting costs on that $2 makes them significantly more profitable.

  • ServiceWorker: ServiceWorkers are a new feature for the web platform that lets a script persistently cache resources and handle all resource requests for an application -- even when the network isn't available. Putting it all together, ServiceWorkers give you a way to build applications that work offline.

  • Social Physics: How Good Ideas Spread-The Lessons from a New Science: By analyzing the millions of detailed messages among traders on a social network, we discovered that the effects of social influence within the network were too strong, causing the phenomenon of herding, in which the traders overreacted to each other, and so all tended to adopt the same trading strategy.

  • Nathan Marz on Storm, Immutability in the Lambda Architecture, Clojure: The core abstraction of Storm is a stream which is just an infinite list of tuples and then tuples are just named lists of values so you have tuples which contain URLs, person identifiers, time stamps, and so on. And Storm is all about transforming streams of data into new streams of data, you do this by defining what we call a topology where there are basically two things that go into a topology: the first is called a spout and a spout is just a source of streams in a topology. So for example we might have a spout which reads from a Kafka queue and emits that as a stream, then we have bolts, which, like I was saying before, process input streams and produce new output streams, so you wire together all your spouts and bolts into this network and that will be how things process. It’s pretty typical in Storm to have your bolts talk to a database whenever you need to keep persistent state, that is actually one of the most common applications of Storm, just doing the realtime ETL of consuming a stream and then updating the databases and doing that in a fault tolerant, scalable way.
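
    The shape of a topology is easy to sketch. This is not Storm's actual API, just the spout/bolt idea in miniature:

    ```python
    # A spout is a source of a stream (an infinite list of named tuples);
    # pretend these came off a Kafka queue.
    def kafka_spout():
        for i in range(5):
            yield {"url": f"http://example.com/{i % 2}", "ts": i}

    # A bolt consumes input streams and emits new streams; a real one would
    # persist its running state to a database.
    def count_bolt(stream):
        counts = {}
        for t in stream:
            counts[t["url"]] = counts.get(t["url"], 0) + 1
            yield {"url": t["url"], "count": counts[t["url"]]}

    # Wiring spouts to bolts is the topology; Storm runs the wiring
    # distributed and fault tolerant, which this sketch does not.
    for out in count_bolt(kafka_spout()):
        print(out)
    ```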

  • Pithos: splits object storage across regions, each having one or more storage classes. Metadata is stored globally; object data is isolated to a region. Bucket objects may only be stored in a single region.

  • FaRM: Fast Remote Memory: We used FaRM to build a key-value store and a graph store similar to Facebook's. They both perform well, for example, a 20-machine cluster can perform 160 million key-value lookups per second with a latency of 31µs.

  • Greg Linden with another interesting set of Quick Links.