Stuff The Internet Says On Scalability For November 8th, 2013

High Scalability

08 Nov 2013 — 6 min read

Hey, it's HighScalability time:

Robot elephant from 1950, which consisted of 9000 parts and could walk 27 mph

Galaxy contains billions of potentially habitable planets. According to economic theory this should put a downward pressure on rents.
Quotable Quotes:
- Brandon Downey: F*ck these guys.
- @IEEEorg: Every second 21.6 people get their first mobile device. Mobile is growing 5 times faster than the human population.
- @littleidea: a distributed system to deploy a distributed systems to deploy a distributed system, bring your own turtles
- @BenedictEvans: Photos shared/day: Facebook - 350m Snapchat - 350m Whatsapp - 400m Instagram: 55m.
- @SciencePorn: Price of 1gb of storage over time: 1981 $300000, 1987 $50000, 1990 $10000, 1994 $1000, 1997 $100, 2000 $10, 2004 $1, 2012 $0.10
- @kellabyte: If I hear “network partitions are rare on even hundred(s) of node clusters” again I’m going to lose my shit. This fallacy needs to die.
- @danielbilling: PT had some great lines. "Logic merely enables one to be wrong with authority" is a particular favourite.
- @aphyr: "Making things implicit in distributed systems is a good way to f*ck yourself"
- @solarce: "Backpressure should be required" #riconwest
- @mrb_bk: "Unbounded queues are AWFUL! They will F*CK YOU UP!!" - @jmhodges
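
The last two quotes are easy to make concrete. Here is a minimal sketch, assuming Python's standard `queue` module; the `offer` helper and the capacity of 3 are made up for illustration. A bounded queue refuses work once it is full, so the producer hears about overload right away instead of watching memory grow.

```python
import queue

def offer(q, item):
    """Try to enqueue without blocking; return False when the queue is full."""
    try:
        q.put_nowait(item)
        return True
    except queue.Full:
        return False

# A bounded queue. The capacity is an illustrative value, not a recommendation.
work = queue.Queue(maxsize=3)

# Offer five items to a queue that only holds three.
accepted = [offer(work, n) for n in range(5)]

# The first three fit; the last two are rejected, which is the backpressure
# signal: the producer must slow down, retry later, or shed load.
print(accepted)
```

An unbounded `queue.Queue()` would accept all five, and everything after them, until the process ran out of memory.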

Growth Hacking sounds a bit like cancer, but it's subtly different. In this case it helps explain the bewildering idea that Snapchat has a valuation of $3.5 Billion. The insight behind an ephemeral service is cool. When Facebook is building entire dead cities for pictures nobody will ever see again, the pure honesty of saying those pictures aren't really worth saving is refreshing. Snapchat took root in high schools and grew by word of mouth. It allows free expression and creativity, and it's easy to use, mobile first, and group oriented, with the thrill of random rewards. But I think it goes deeper. Kids are by their nature ahistoric and Snapchat is a perfect match for that nature. Adults are all about memory and Facebook perfectly reflects that ethos.

If you are Rackspace you need a competitive point of differentiation with AWS. It has been service. Then there's the open play with OpenStack. Now there's all speed all the time with Performance Cloud Servers: 100 percent data center-grade solid-state disks (SSDs); they incorporate powerful Intel® Xeon® E5 processors, as much as 120 GB of memory, and 40 gigabits per second of highly available network throughput to every host. So it's fast SSD, fast network, and fast processors. I didn't see mention of controlling performance variance to bring down the long tail, but it's clear Rackspace sees the mobile BigData future as requiring speed and that's where they plan to be. More on pricing and benchmarks.

Your car is now an embedded real-time system of great complexity. That should scare the shite out of you. Toyota Acceleration Case: Memory corruption as little as one bit flip can cause a task to die. This can happen by hardware single-event upsets — i.e., bit flip — or via one of the many software bugs, such as buffer overflows and race conditions, we identified in the code. There are tens of millions of combinations of untested task death, any of which could happen in any possible vehicle/software state. Too many to test them all. But vehicle tests we have done in 2005 and 2008 Camrys show that even just the death of Task X by itself can cause loss of throttle control by the driver.

AWS wisdom from vosper:
- Running EMR jobs on cc2.8xlarge machines as spot instances is a great way to get a LOT of computer power very cheaply. Because our jobs are periodic we run both Core and Task as spots and simply retry the job if our machines get terminated. I did a lot of benchmarking and found that a small number of cc2.8xlarge machines out-performs and is cheaper than a large number of lesser instances (and I tried most of the lesser machines). In us-west-2 it's very uncommon to lose our instances, unlike us-east-1 which has major price fluctuations (this is true for all types of spot instance).
- The cr1.8xlarge has fantastic performance, relative to the rest of the AWS machines. It's also very expensive compared to the cost of hardware or a similar solution on another cloud provider. Since we're fully integrated with AWS and don't want to run our own hardware we're sucking up the cost for now, but it's definitely a sore-point in our budget. The cr1.8xlarge is also all-round a better machine than the hi.4xlarge, which has a lot of disk but is pitiful in terms of CPU.
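
The "simply retry the job" approach from the first point might look something like the sketch below. Everything here is hypothetical: `SpotTerminated`, `run_with_retries`, and `flaky_job` are stand-ins invented for illustration, not part of any AWS SDK. The sketch also assumes the job is safe to rerun from scratch.

```python
class SpotTerminated(Exception):
    """Hypothetical signal that the cluster lost its spot instances mid-job."""

def run_with_retries(run_job, max_attempts=3):
    """Run a rerunnable periodic job, retrying if spot capacity is lost."""
    for attempt in range(1, max_attempts + 1):
        try:
            return run_job()
        except SpotTerminated:
            # Give up only after the final attempt; otherwise start over.
            if attempt == max_attempts:
                raise

# Stand-in job that loses its instances twice before finishing.
attempts = []

def flaky_job():
    attempts.append(1)
    if len(attempts) < 3:
        raise SpotTerminated("instances reclaimed")
    return "done"

result = run_with_retries(flaky_job)
print(result, len(attempts))
```

This only pays off when a rerun is cheap relative to the spot discount, which is the trade vosper describes for periodic jobs.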

Scryer: Netflix’s Predictive Auto Scaling Engine. More excellent work by Netflix on becoming Cloud Native. It helps you handle rapid demand spikes, outages, and variable traffic patterns. Don't just over-provision or over-oscillate. Do it with class and intelligence.
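
To get a feel for the predictive idea, here is a toy sketch. It is not Scryer's algorithm. It assumes traffic at a given hour looks like the average of that hour on previous days, and every name and number in it (`predict_load`, `instances_needed`, the 20% headroom, 100 requests/sec per instance) is invented for illustration.

```python
import math

def predict_load(history, hour):
    """Average the requests/sec seen at this hour on previous days."""
    samples = [day[hour] for day in history]
    return sum(samples) / len(samples)

def instances_needed(predicted_rps, rps_per_instance, headroom=0.2):
    """Instances required to serve the predicted load plus a safety margin."""
    return math.ceil(predicted_rps * (1 + headroom) / rps_per_instance)

# Two previous days of made-up traffic, keyed by hour of day.
history = [
    {19: 900, 20: 1500},
    {19: 1100, 20: 1700},
]

# Provision for the 20:00 peak before it arrives, not after alarms fire.
plan = instances_needed(predict_load(history, 20), rps_per_instance=100)
print(plan)
```

A reactive scaler waits for load to show up before adding capacity. A predictive one can start instances ahead of a spike it has seen before.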

Facebook has published its Hive killer Presto as Open Source. Presto - a Distributed SQL Query Engine for Big Data. Facebook uses Presto for interactive queries against several internal data stores, including their 300PB data warehouse. Over 1,000 Facebook employees use Presto daily to run more than 30,000 queries that in total scan over a petabyte per day. A big win is that it reads directly from HDFS, minimizing the ETL phase. It appears SQL will never die.

Good article on Finding the right CDN for our startup. CloudFront: Fast and easy to set up. CloudFlare: Easy, flexible, but not that fast and somewhat unreliable. MaxCDN: Cheap, delivers both in terms of availability and performance. SSL pricing is a differentiator. Discussion on Hacker News brings up other options like EdgeCast and Fastly.

Software engineering anyone? I Failed a Twitter Interview. Anyone?

Java inter-thread messaging implementations performance comparison (Synchronized vs ArrayBlockingQueue vs LinkedTransferQueue vs Disruptor): As you can see from the results the TransferQueue is pretty much there with the Disruptor and even bettering the Disruptor’s performance as the number of consumers goes higher.

Hugh Williams with a well said take on What's Big Data Anyway? It's discovering patterns, finding anomalies and outliers, and summarizing and generalizing - all improving the lot of users.

Michael Bernstein put together an interesting list of papers for a Distributed Systems Archaeology talk he gave at Ricon.

Good discussion on using PHP vs. Node.js vs. Go for a new startup? Go with what you know. Go with what has the features you need. Go with what will make you happy.

Pete Warden on Why manual memory management can be worse for performance than garbage collection. It can be, especially on such large projects with so many programmers. But I can always fix a manual memory management system. You can't fix a system where you've given up control.

Jakob Jenkov with a very good post on Caching Techniques. Sums it all up in one place. He also has a lot of other good posts on Software Architecture. Worth a look.

Ry’s Objective-C Tutorial. Double dang good. Though I still think Objective-C is an ugly language.

The Saddest Moment: Every paper on Byzantine fault tolerance introduces a new kind of data consistency. This new type of consistency will have an ostensibly straightforward yet practically inscrutable name like “leap year triple-writer dirty-mirror asynchronous semiconsistency.”

Programming in Forth was a blast. Chuck Moore is amazingly creative. With him expect the unexpected.

Eliminating unexplained traffic jams: Counterintuitively, a car equipped with Horn’s system would also use sensor information about the distance and velocity of the car behind it. A car that stays roughly halfway between those in front of it and behind it won’t have to slow down as precipitously if the car in front of it brakes; but it will also be less likely to pass on any unavoidable disruptions to the car behind it. Since the system looks in both directions at once, Horn describes it as “bilateral control.”
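
The "stay roughly halfway" rule can be sketched in a few lines. This is an illustration of the idea as described above, not Horn's published controller: the function name, the gains `k_d` and `k_v`, and the linear form are all assumptions.

```python
def bilateral_accel(x_behind, x_self, x_ahead,
                    v_behind, v_self, v_ahead,
                    k_d=0.5, k_v=0.5):
    """Acceleration that nudges a car toward the midpoint of its neighbors.

    Positions are measured along the road in meters, velocities in m/s.
    The gains k_d and k_v are illustrative values only.
    """
    gap_ahead = x_ahead - x_self
    gap_behind = x_self - x_behind
    # Speed up if the gap ahead is larger than the gap behind, and vice versa.
    position_term = k_d * (gap_ahead - gap_behind)
    # Also match the average speed of the two neighbors.
    velocity_term = k_v * ((v_ahead - v_self) + (v_behind - v_self))
    return position_term + velocity_term

# Centered between neighbors, all at the same speed: no correction needed.
centered = bilateral_accel(0, 50, 100, 20, 20, 20)

# Too close to the car ahead: the result is negative, so the car eases off.
crowding = bilateral_accel(0, 70, 100, 20, 20, 20)
print(centered, crowding)
```

A forward-only controller would drop the `gap_behind` and `v_behind` terms, which is the one-directional behavior the article contrasts with bilateral control.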

WhiteDB: a lightweight NoSQL database library written in C, operating fully in main memory. There is no server process. Data is read and written directly from/to shared memory, no sockets are used between WhiteDB and the application program.

Your DNA: Library or Thunderdome?: Systems biologist Michael White, writing for Pacific Standard, dismisses the narrative that our genetic material is a “highly sophisticated, finely tuned data storage and processing device.” Instead, he says, it is an apocalyptic wasteland “littered with the rubble of ancient and ongoing battles with hordes of viruses, clone armies of genetic parasites, and zombie genes that should be dead but aren’t.” Arguing towards an ecology of the eukaryote genome, he likens it to an ecosystem full of communities that have grown, preserved, interacted and competed with each other in a complex system of relationships.

zRAM: a module of the Linux kernel, previously called "compcache". zRAM increases performance by avoiding paging to disk; instead it uses a compressed block device in RAM, where paging takes place until it becomes necessary to use the swap space on the hard disk drive.

Naiad: A Timely Dataflow System: Naiad is a distributed system for executing data parallel, cyclic dataflow programs. It offers the high throughput of batch processors, the low latency of stream processors, and the ability to perform iterative and incremental computations. Although existing systems offer some of these features, applications that require all three have relied on multiple platforms, at the expense of efficiency, maintainability, and simplicity. Naiad resolves the complexities of combining these features in one framework.

Greg Linden with another varied and tasty dish on Geeking with Greg Quick links. Choose from a fine tapas menu of Amazon, security, Apple, Pinterest, education, and much more.
