Stuff The Internet Says On Scalability For July 18th, 2014

Hey, it's HighScalability time:

The one is many. Lichen composed of possibly 400 distinct species.

  • Quotable Quotes:
    • The Master Switch: Selling radio sets—the old revenue model—was a good if limited business, for ultimately few households would need more than one radio every few years. But advertising revenues could expand indefinitely—or so it seemed then.
    • Larry Page: It’s pretty difficult to solve big problems in four years. I think it’s probably pretty easy to do it in 20 years. I think our whole system is set up in a way that makes it difficult for leaders of really big companies.

  • The Master Switch: The inventors we remember are significant not so much as inventors, but as founders of “disruptive” industries, ones that shake up the technological status quo. Through circumstance or luck, they are exactly at the right distance both to imagine the future and to create an independent industry to exploit it.

  • You thought you were clever and safe? Reality doesn't like that. The fallacy of distributed transactions: In other words, as far as the queue is concerned, the transaction committed, and the message is gone. As far as the database is concerned, that transaction was rolled back, and never happened. Of course, the chance that something like that can happen in one of your systems? Probably one in a million.
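
One hypothetical way to end up in that state, as a toy sketch: a two-phase commit where the coordinator dies after telling the queue to commit but before telling the database. The `Resource` class, the `crash_before` parameter, and the unilateral rollback in `give_up_waiting` are all assumptions made for illustration; real transaction managers and their recovery rules are more involved.

```python
class CoordinatorCrash(Exception):
    """Raised to simulate the coordinator dying partway through phase two."""


class Resource:
    """A toy transactional participant (a queue or a database)."""

    def __init__(self, name):
        self.name = name
        self.state = "active"

    def prepare(self):
        # Phase one: promise to commit if asked.
        self.state = "prepared"
        return True

    def commit(self):
        # Phase two: make the work durable.
        self.state = "committed"

    def give_up_waiting(self):
        # ASSUMPTION: a participant that never hears the outcome
        # eventually rolls back on its own (for example after a timeout).
        if self.state == "prepared":
            self.state = "rolled_back"


def two_phase_commit(resources, crash_before=None):
    """Run 2PC; optionally crash before committing resources[crash_before]."""
    if not all(resource.prepare() for resource in resources):
        raise RuntimeError("a participant voted no")
    for index, resource in enumerate(resources):
        if index == crash_before:
            raise CoordinatorCrash(f"died before committing {resource.name}")
        resource.commit()


if __name__ == "__main__":
    queue, database = Resource("queue"), Resource("database")
    try:
        two_phase_commit([queue, database], crash_before=1)
    except CoordinatorCrash:
        database.give_up_waiting()
    print(queue.state, database.state)
```

Under these assumptions the queue ends up committed while the database ends up rolled back, which is the split outcome the quote describes.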

  • How do you make Cassandra 50% faster? You add batching of replies. You fix your thread pools. And you get rid of unnecessary encoding and decoding phases like with Thrift. Of course now the bottleneck has moved and the process starts again.

  • Everyday Algorithms: Elevator Allocation. Though I'm pretty sure my contextualized and personalized elevator scheduling goes something like: for him, slow as possible. And they and the crosswalk lights must be in cahoots, because I get the same fine service.

  • Simon Wardley: Anyhow, this is what I don't get. Micro services has become a big thing - good. So, why do we have to continuously create 'new' terms to describe what is already happening? < Why have new songs when they use all the same words? What changes is the context. A new binding requires a new word. It's like a linguistic method of versioning. 10 years ago all the words around that era's version of microservices would be different, so we need a new word now to reflect a new world.

  • Programmers really want the database to work as a queue. It just never works in the end. But if you are Antirez and are a programmer and have your own database then you can make that happen. Queues and databases

  • Another case of breaking one thing into two parts and then arguing which part is more important. In reference to the debate over Linus' quote on data structures: "Bad programmers worry about the code. Good programmers worry about data structures and their relationships." < Good and evil. Light and dark. Mind and Body. Starsky and Hutch. They are defined in terms of each other and make no sense without each other. They are a single system.

  • New Russian 8-core CPU. It may not be the fastest CPU, but it won't break in the field and hardly ever jams when covered in mud or sand.

  • Gnip talks about how using lower level tools like Redis requires a culture of use to be developed or anarchy will ensue. Enriching With Redis Part II: Schema Happens. Part of the making of a society of code is creating a schema. Irony? A little. But still interesting.

  • Reducing network traffic in your FPS game. Network Traffic Culling: In our game position and rotation updates of players make up over 90% of the network messages. Every player sends 10 – 20 updates per second per default. However, when an enemy player is far away we don’t care about quick updates, since inaccuracy is harder to notice. We only need near real-time updates for players that are very close by. Since we only ever care about the latest position update, culling a few messages won’t hurt the game.
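
The culling idea can be sketched roughly as follows. This is not the linked article's code: the distance thresholds, the 2 Hz floor, and the names `update_rate_hz` and `UpdateCuller` are assumptions chosen for illustration. Only the 10 to 20 updates per second default comes from the quote.

```python
import math
from dataclasses import dataclass, field

# Hypothetical tuning values, not taken from the article.
NEAR_DISTANCE = 20.0   # at or below this, forward at the full rate
FAR_DISTANCE = 100.0   # at or beyond this, forward at the minimum rate
FULL_RATE_HZ = 20.0
MIN_RATE_HZ = 2.0


def update_rate_hz(distance: float) -> float:
    """Interpolate linearly from FULL_RATE_HZ (near) to MIN_RATE_HZ (far)."""
    if distance <= NEAR_DISTANCE:
        return FULL_RATE_HZ
    if distance >= FAR_DISTANCE:
        return MIN_RATE_HZ
    t = (distance - NEAR_DISTANCE) / (FAR_DISTANCE - NEAR_DISTANCE)
    return FULL_RATE_HZ + t * (MIN_RATE_HZ - FULL_RATE_HZ)


@dataclass
class UpdateCuller:
    """Decides per (receiver, sender) pair whether to forward an update."""
    last_sent: dict = field(default_factory=dict)

    def should_send(self, receiver_id, sender_id, receiver_pos, sender_pos, now):
        distance = math.dist(receiver_pos, sender_pos)
        min_interval = 1.0 / update_rate_hz(distance)
        key = (receiver_id, sender_id)
        last = self.last_sent.get(key)
        if last is not None and now - last < min_interval:
            # Culled: only the latest position matters, and a newer
            # update will replace this one soon anyway.
            return False
        self.last_sent[key] = now
        return True
```

A server would call `should_send` for each incoming position message and each potential receiver, dropping the message when it returns `False`. Dropping is safe here only because position updates are absolute rather than deltas, which is the property the quote relies on.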

  • Going beyond Lambda. Dropping Hadoop. If you need to process real-time data streams then LinkedIn's Samza might be interesting.

  • How to Back Up Terabytes of Databases. The same old won't work. Brent Ozar suggests: back up as infrequently as you can; change the database as little as possible; tune read and write speeds; compress the data as much as possible; tune the network; tune the backup software settings. Or just use SAN snapshots.

  • A very detailed look at how Dropbox syncs files and how their file system works. Streaming sync overlaps the upload and download phases of sync. Up to a 2x improvement on multi-client and large file syncs.

  • Great article on a difficult topic: Practical VPC Design

  • You need to shard, you want to shard, but aren't sure about how to go about doing it? Great details on Managing shards of MySQL databases with MySQL Fabric.

  • Very good explanation of Load Balancing with HAProxy

  • Anatomy of a system call, part 1: System calls differ from regular function calls because the code being called is in the kernel. Special instructions are needed to make the processor perform a transition to ring 0 (privileged mode). In addition, the kernel code being invoked is identified by a syscall number, rather than by a function address.

  • Why Probabilistic Programming Matters. Great overview by mango_man: The idea, in a nutshell: create a programming language where random functions are elementary primitives. The point of a program in such a language isn't to execute the code (although we can!), but to define a probability distribution over execution traces of the program. So you use a probabilistic program to model some probabilistic generative process. The runtime or compiler of the language knows something about the statistics behind the random variables in the program (keeping track of likelihoods behind the scenes).
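
A minimal sketch of that idea in plain Python, assuming a hand-rolled `Trace` object in place of a real probabilistic language runtime. The names `Trace`, `coin_model`, and `posterior_mean` are hypothetical, and likelihood weighting is just one simple inference method chosen to show the "keeping track of likelihoods behind the scenes" part.

```python
import random


class Trace:
    """Records one execution of a model and accumulates its likelihood."""

    def __init__(self, rng):
        self.rng = rng
        self.weight = 1.0

    def uniform(self, low, high):
        # Random primitive: draw a latent value from its prior.
        return self.rng.uniform(low, high)

    def observe_flip(self, p_heads, outcome):
        # Conditioning: rather than sampling the flip, multiply in
        # the probability of the outcome that was actually observed.
        self.weight *= p_heads if outcome else (1.0 - p_heads)


def coin_model(trace, flips):
    """Generative story: pick an unknown bias, then flip the coin."""
    bias = trace.uniform(0.0, 1.0)
    for outcome in flips:
        trace.observe_flip(bias, outcome)
    return bias


def posterior_mean(model, data, samples=20000, seed=0):
    """Estimate the posterior mean of the model's return value."""
    rng = random.Random(seed)
    total = 0.0
    total_weight = 0.0
    for _ in range(samples):
        trace = Trace(rng)
        value = model(trace, data)
        total += trace.weight * value
        total_weight += trace.weight
    return total / total_weight


if __name__ == "__main__":
    flips = [True] * 8 + [False] * 2   # 8 heads, 2 tails
    print(posterior_mean(coin_model, flips))
```

With a uniform prior and 8 heads in 10 flips, the exact posterior mean of the bias is 9/12 = 0.75, so the estimate should land close to that. The model function never mentions inference at all; it only describes how the data could have been generated, which is the separation the overview is pointing at.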

  • The StorageMojo take: by reducing I/O overhead qNVRAM shows significant gains in performance – and presumably battery life – can be achieved at little cost. It also simplifies the problem of extending flash endurance which may have important knock-on effects – such as enabling wider use of three-level cells.

  • WhitePages Rebuilds Core Parts of Application Stack with Scala and Akka to Improve Scaling. OK, it's a white paper, but the logic looks sound and worth considering. A 15x reduction in server count; 2 orders of magnitude latency improvement; improved testing and productivity.

  • The Minitransaction: An Alternative to Multi-Paxos and Raft: With the minitransaction, we can spread reads and writes across many nodes, so the system can scale to handle large datasets and workloads. Needing only a simple majority of replicas, we can tolerate software crashes, server failures and network disconnects. Finally, since we use single-decree Paxos, there’s no need for a leader and thus no hiccups from leader fail-over. This is the approach we adopted in TreodeDB so that it has no single points of failure, no scaling bottlenecks and no masters to fail-over. The minitransaction makes TreodeDB reliable and scalable.
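
To make the shape of a minitransaction concrete, here is a single-process sketch that treats it as a conditional batch write spread over several shards: the writes apply only if every listed key is still at the expected version. This is not TreodeDB's API. The `Shard` and `Cluster` classes, the version counters, and the key locking are assumptions, and the replication and single-decree Paxos that the quote depends on for fault tolerance are left out entirely.

```python
import zlib
from collections import defaultdict


class Shard:
    """One node holding a slice of the keyspace."""

    def __init__(self):
        self.rows = {}        # key -> (version, value)
        self.locked = set()   # keys held by an in-flight minitransaction

    def read(self, key):
        # A key that was never written reads as version 0.
        return self.rows.get(key, (0, None))

    def prepare(self, conditions, writes):
        keys = set(conditions) | set(writes)
        if keys & self.locked:
            return False
        if any(self.read(k)[0] != v for k, v in conditions.items()):
            return False      # someone changed a key we depend on
        self.locked |= keys
        return True

    def finish(self, conditions, writes, commit):
        if commit:
            for key, value in writes.items():
                self.rows[key] = (self.read(key)[0] + 1, value)
        self.locked -= set(conditions) | set(writes)


class Cluster:
    def __init__(self, shard_count=3):
        self.shards = [Shard() for _ in range(shard_count)]

    def _shard(self, key):
        return self.shards[zlib.crc32(key.encode()) % len(self.shards)]

    def read(self, key):
        return self._shard(key).read(key)

    def minitransaction(self, conditions, writes):
        """Apply all writes, or none, if every condition still holds."""
        plan = defaultdict(lambda: ({}, {}))   # shard -> (conditions, writes)
        for key, version in conditions.items():
            plan[self._shard(key)][0][key] = version
        for key, value in writes.items():
            plan[self._shard(key)][1][key] = value
        prepared = []
        for shard, (conds, wr) in plan.items():
            if not shard.prepare(conds, wr):
                for done in prepared:
                    done.finish(*plan[done], commit=False)
                return False
            prepared.append(shard)
        for shard, (conds, wr) in plan.items():
            shard.finish(conds, wr, commit=True)
        return True
```

A caller reads the versions it depends on, then submits them as `conditions` along with its `writes`; a `False` result means some key moved in the meantime and the caller should re-read and retry. No node in the sketch plays a special leader role, but without the replication layer it tolerates no failures, so it illustrates only the interface and not the reliability claims.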

  • Bringing Arbitrary Compute to Authoritative Data: The goal of this article is to describe a general-purpose, distributed storage system that supports arbitrary computation on data at rest.