Stuff The Internet Says On Scalability For August 9, 2013

High Scalability

09 Aug 2013 — 5 min read

Hey, it's HighScalability time:

25%: Percentage of North American Internet Traffic served by Google
Quotable Quotes:
- Aristotle: We must not expect more precision than the subject-matter admits.
- Bret Victor: Technologies change quickly and minds change slowly. Ideas require people to unlearn what they’ve learned and adopt new ideas. They think what they’ve learned is programming and this new stuff isn’t programming.
- Steven Roberts: Art without engineering is dreaming; engineering without art is calculating.
- @jamesurquhart: “@b6n: @jamesurquhart @krishnan @rUv There is no such thing as traditional infrastructure at web scale.” < My point exactly.
- HackerNewsOnion: Batch is the new realtime.
- @rbranson: "How do we store this at scale?" "Redis on a cr1.8xlarge" "lmao"
- John Carmack: We are fundamentally creativity bound at this point. We need faster iterations.
- ReadWrite: The tech sector as a whole has created more than a trillion dollars in value over the past decade. Yet that value creation is incredibly concentrated. Nearly two-thirds of the increase reaped by investors and employees comes from Apple and Google alone, with the likes of Amazon, Facebook, LinkedIn, eBay, Yandex and Baidu rounding out the list.
Robert Scoble on how fast companies are growing these days compared to the past. Talking with the CEO of Glide, maker of a mobile video chat app, he learned that Glide grew to 3.5 million users in two months. Robert compared this growth rate to Twitter's, which had 13,000 users six months after it came out. In six years our expectations and our world have sped up. Reduction of friction has driven faster adoption. OAuth is one friction reduction mechanism because it makes onboarding so much easier and faster.

I was totally suckered in by Bret Victor's The Future of Programming. An artful performance two steps above the usual. When he starts listing the future of programming as the direct manipulation of data; goals and constraints; spatial representations; concurrency, I'm thinking wow, this must be the early 1970s and he's talking about these things and we still don't have them today. We are basically still using Fortran. And that was the point. Mu.

The ultimate data storage system: DNA. A New Approach to Information Storage: In Church's case, a team of researchers used sequencing technology to format his 54,000-word book (with words, images, and a JavaScript program, it came down to 5.27 megabits, or 658.75 kilobytes) at a density of 5.5 petabytes per cubic millimeter. While the physical volume of 70 billion physical copies of his book would fill nearly 3,500 New York City Public Libraries (including all branches), and a digital version would require somewhere in the neighborhood of 46 storage devices with 1TB drives, all those copies of Church's book fit on a piece of DNA no larger than a speck of dust. What's more, the copies will last hundreds of thousands of years, perhaps even a million years, and do not require any special handling or temperature conditions.
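
The figures in that quote are easy to sanity check. A minimal sketch, assuming decimal units (1 megabit = 10^6 bits, 1 TB = 10^12 bytes) and using only the numbers quoted above; the variable names are made up for illustration. By this arithmetic one copy is 658.75 kilobytes, and 70 billion copies come to roughly 46 petabytes, which would be on the order of 46,000 drives of 1 TB each rather than 46.

```python
# Back-of-the-envelope check of the quoted DNA storage figures.
# Assumption: decimal units throughout (mega = 10**6, tera = 10**12, peta = 10**15).
BITS_PER_BOOK = 5.27e6   # 5.27 megabits per copy, from the quote
COPIES = 70e9            # 70 billion copies, from the quote

bytes_per_book = BITS_PER_BOOK / 8
kilobytes_per_book = bytes_per_book / 1e3

total_bytes = bytes_per_book * COPIES
total_petabytes = total_bytes / 1e15
one_tb_drives = total_bytes / 1e12

print(f"one copy:    {kilobytes_per_book:.2f} kB")
print(f"all copies:  {total_petabytes:.1f} PB")
print(f"1 TB drives: {one_tb_drives:,.0f}")
```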

Fighting against his love of performance, René Pickhardt ranks his priorities for Backend Web Programming: Scalability > Maintainable code > performance. Some excellent experience-based observations: web scale is not so much about performance of single services or parts of the software but rather about the scalability of the entire architecture; scalability is more important than performance; a software architecture scales best if it has a lot of independent services; if services need to interact they should be asynchronous and non-blocking; to achieve scalable code one needs to include some middle layer for caching and one needs to abstract certain things. The result is an architecture where each service has its own data store where time-critical and context-aware information is stored and caching is separated by service.
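
To make that shape concrete, here is a minimal sketch of the idea, not René's actual code. It assumes two hypothetical services (`users` and `feeds`), each owning a private data store and a private cache, and uses Python's `asyncio` so that a caller can query both without one call blocking the other. The `Service` class, the sample data, and the `asyncio.sleep` stand-in for store I/O are all invented for illustration.

```python
import asyncio


class Service:
    """Hypothetical independent service: owns its data store and its cache."""

    def __init__(self, name, store):
        self.name = name
        self._store = dict(store)  # private data store, not shared with other services
        self._cache = {}           # cache scoped to this service only
        self.store_reads = 0       # counts how often the store was actually hit

    async def get(self, key):
        if key in self._cache:
            return self._cache[key]
        await asyncio.sleep(0.01)  # stand-in for non-blocking I/O to the data store
        self.store_reads += 1
        value = self._store.get(key)
        self._cache[key] = value
        return value


async def render_profile(users, feeds, user_id):
    # Query both services concurrently; neither call blocks the other.
    name, feed = await asyncio.gather(users.get(user_id), feeds.get(user_id))
    return {"name": name, "feed": feed}


users = Service("users", {"u1": "Ada"})
feeds = Service("feeds", {"u1": ["post-1", "post-2"]})


async def main():
    first = await render_profile(users, feeds, "u1")
    second = await render_profile(users, feeds, "u1")  # answered from each service's cache
    return first, second


first, second = asyncio.run(main())
print(first)
print("store reads:", users.store_reads, feeds.store_reads)
```

Because each service keeps its own cache, either one could be scaled, replaced, or moved to a different data store without touching the other.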

We don't know costs or random read/write or sequential read/write performance yet, but Crossbar’s RRAM technology looks like a RAM cloud in the making.

Counterintuitive: Mobile Web apps may be slow, but with today's Garbage Collectors, they DON'T need more memory.

Jonathan Ellis and Michael Hausenblas mix it up a little in Big Data Debate: Will HBase Dominate NoSQL? Jonathan doesn't pull any punches with a dissection of potential causes of HBase death. Michael takes the high road, which ends up seeming a bit lost.

A stack you may not be familiar with. Mike Broberg tackles making restaurant menus accessible using CouchDB, CouchApps, long polling, Lucene, and Node.js. He likes the power, performance, and simplicity.

Queueing in the Linux Network Stack. Awesome article by Dan Siemon. "It aims to explain the different layers where packets can be queued within the Linux network stack and how to control kernel features which impact network latency." Also a shout out to the Linux Journal, which has consistently great technical content and should be commended for publishing articles like Dan's.

It turns out Google can play the add features game too. Google Compute Engine now offers Layer 3 Load Balancing.

It takes a long deep experience of pain and betrayal to write what Ian Bogost has written in OAUTH OF FEALTY - Resignation beyond sorrow on the Facebook Platform and beyond. All software exacts as a price part of our lives. The ethical thing to do is respect the price paid by trying to make it as small as possible.

It's such a shame Google Reader didn't have any users. Poor Feedly probably didn't get any new users at all. Wait for it: As Google Reader winds down, Feedly just keeps growing: Feedly has doubled its server count to around 100, and the data under management has swelled up to 100 terabytes. The data growth in recent months pushed Feedly to shift from a MySQL and Memcached combination to some Memcached in conjunction with HBase on top of the Hadoop Distributed File System.

Good discussion of the economics of using S3 and Glacier for offsite backup of pictures for photographers. Show HN: Raw Image Storage for photographers using S3 and Glacier. I had no idea photographers used so much disk space. Also interesting is the problem of selling into markets where you must convince people to use your service. It's just too hard.

Is Mozilla making a mistake coding the mobile Firefox OS in JavaScript? You would think so, but maybe not: Staring at the Sun: Dalvik vs. ASM.js vs. Native. Performance isn't bad compared to Java or native C++.

Think Like a Commander Prototype: Instructor's Guide to Adaptive Thinking: The maxim "train as you fight" has risen to such a level of familiarity in the U.S. Army that the value of the notion goes almost unquestioned. Yet studies of the development of expertise clearly indicate that "as you fight," meaning performing in fully realistic simulated battles, is neither the most effective nor efficient method of developing expertise. Such "performances" can help a novice become acquainted with applying military knowledge, and can reinforce existing knowledge in an experienced person, but will not in and of themselves lead to the development of expertise.

Ever wondered how caching works in the browser? Petr Kunc has written his thesis on the subject: Framework for Developing Offline HTML5 Applications. This stuff is way too complicated. No wonder caching is so hard to get right. More in a great article by Jake Archibald: Application Cache is a Douchebag.

Service-Oriented Data Denormalization: We deploy the resulting service-oriented implementation of TPC-W across an 85-node cluster and show that restructuring its data can provide at least an order of magnitude improvement in the maximum sustainable throughput compared to master-slave database replication, while preserving strong consistency and transactional properties.

SIFT: Design and Analysis of a Fault-Tolerant Computer for Aircraft Control.
