Stuff The Internet Says On Scalability For September 25th, 2015

Hey, it's HighScalability time:


How long would you have lasted? Loved The Martian. Can't wait for the game, movie, and little potato action figures. Me, I would have died on the first level.

  • 60 miles: new record distance for quantum teleportation; 160: size of minimum viable Mars colony; $3 trillion: assets managed by hedge funds; 5.6 million: fingerprints stolen in cyber attack; 400 million: Instagram monthly active users; 27%: increase in conversion rate from mobile pages that are 1 second faster; 12BN: daily Telegram messages; 1800 B.C.: oldest beer recipe; 800: meetings booked per day at Facebook; 65: # of neurons it takes to walk with 6 legs

  • Quotable Quotes:
    • @bigdata: assembling billions of pieces of evidence: Not even the people who write algorithms really know how they work
    • @zarawesome: "This is the most baller power move a billionaire will pull in this country until Richard Branson finally explodes the moon."
    • @mtnygard: An individual microservice fits in your head, but the interrelationships among them exceeds any human's ability. Automate your awareness.
    • Ben Thompson~ The mistake that lots of BuzzFeed imitators have made is to imitate the BuzzFeed article format when actually what should be imitated from BuzzFeed is the business model. The business model is creating portable content that will live and thrive on all kinds of different platforms. The BuzzFeed article is relatively unsophisticated, it's mostly images and text, and mostly images.
    • Jamshid Mahdavi: The number-one lesson is just be very focused on what you need to do. [Don't] spend time getting distracted by other activities, other technologies, even things in the office, like meetings.
    • @mikeash: Joe’s code has 20 bugs. If Joe fixes 2 bugs per hour for 8 hours, how many bugs does Joe’s code have now?
    • Dan Rayburn: iOS 9 Download Traffic Peaks At 13Tbps Across 70 Telco & University Networks
    • @Opacki: If DynamoDB can avoid further problems, they can be back on track for their five nines SLA in just 60 short years.
    • Do things, tell people: You would not believe how much opportunity is out there for those who do things and tell people. It's how you travel the entrepreneurial landscape. You do something interesting and you tell everyone about it. Then you get contacts, business cards, email addresses.
    • Life Before Earth: An extrapolation of the genetic complexity of organisms to earlier times suggests that life began before the Earth was formed. Life may have started from systems with single heritable elements that are functionally equivalent to a nucleotide. 
    • @coalex_gaia: @PatrickMcFadin kicking knowledge:  "Cassandra is not a unicorn that farts rainbows."
    • @adrianco: Skype used to be fully distributed, but nowadays the presence component is centralized services, more consistent, less available
    • @pbailis: An interesting trade-off: imprecise but "intuitive" acronyms like ACID and CAP popularize important concepts but generate endless confusion.
    • @Carnage4Life: Android under Google has changed the world in material ways. YouTube despite success is just another video site
    • @postwait: I feel like a @surgecon lightning talk should be enacting the latest AWS outage through interpretive dance. 
    • @BenBajarin: Gotta be careful with this stuff. A10 will be no less than 14nm. 6 Cores is also needless, more to GPU likely.  
    • @TheUpstartGuy: "A Disruptive approach says we'll do thing different, a Sustaining approach says we'll do things better" - @asymco 

  • Is what Volkswagen did really any different than what happens on benchmarks all the time? Cheating and benchmarks go together like a clear conscience and rationalization. Clever subterfuge is part of the software ethos. There are many, many examples. Cars are now software is a slick meme, but that transformation has deep implications. The software culture and the manufacturing culture are radically different.

  • Can we ever trust the fairness of algorithms? Of course not. Humans in relation to their algorithms are now in the position of priests trying to divine the will of god. Computer Scientists Find Bias in Algorithms: Many people believe that an algorithm is just a code, but that view is no longer valid, says Venkatasubramanian. “An algorithm has experiences, just as a person comes into life and has experiences.”

  • Stuff happens, even to the best. But maybe having a significant percentage of the world's services on the same platform is not wise or sustainable. Summary of the Amazon DynamoDB Service Disruption and Related Impacts in the US-East Region.

  • According to patent drawings what does the Internet look like? Noah Veltman has put together a fun list of examples: it's a cloud, or a bean, or a web, or an explosion, or a highway, or maybe a weird lump.

  • It's one of those things that's obvious in retrospect, but Basecamp making it really easy for people to sign up by putting sign up on the front page is worth millions of dollars a year in revenue. How we lost (and found) millions by not A/B testing. Lessons:  A/B test your changes; individual responsibility is great, but potential revenue impacting decisions need to be widely communicated; react aggressively to problems, don't wait.
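
For the "A/B test your changes" lesson, here is a minimal sketch of one common way to check whether a difference in sign-up rates is more than noise: a two-proportion z-test. The function name and the visitor counts are made up for illustration and are not from the Basecamp post, which does not say which test they used.

```python
import math


def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Return (z, two-sided p-value) for the difference in conversion rates.

    conv_a / n_a is the control variant, conv_b / n_b is the new variant.
    """
    p_a = conv_a / n_a
    p_b = conv_b / n_b
    # Pooled rate under the null hypothesis that both variants convert equally.
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if se == 0:
        raise ValueError("no variation in outcomes; the test is undefined")
    z = (p_b - p_a) / se
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Hypothetical numbers: 200 of 10,000 visitors sign up on the old page,
# 260 of 10,000 on the page with the sign-up form up front.
z, p = two_proportion_z_test(200, 10_000, 260, 10_000)
print(f"z = {z:.2f}, p = {p:.4f}")
```

A small p-value only says the difference is unlikely to be chance; deciding the sample size before looking at the results still matters.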

  • Echoes of Blackberry. GM exec disses the Apple car, calls it a 'gigantic money pit'. History says: watch your back.

  • This is refreshingly sensible. Avoid the monocrop. Find resiliency in diversity. Navy Diversifies Ships' Cyber Systems to Foil Hackers: RHIMES uses slightly different versions of core programming for each physical controller so that a cyber attack can’t disable or take over all shipboard systems in one fell swoop.

  • When I saw a post with the title Why we're leaving Heroku I expected the reason to be cost or some similar attribute. The actual reason was quite unexpected. It was a protest over Salesforce endorsing the Cybersecurity Information Sharing Act of 2015 (CISA). It's rare for someone to give up convenience based on principle, though the cause is just. It appears a lot of companies want to turn their privacy policies into tissue paper.

  • An unconventional, outside-the-box strategy: iOS 9 will delete apps to make room for upgrade, reinstall them later

  • What shouldn't I open source? GitHub says: "Don't open source anything that represents core business value." Open source everything else you can. 

  • Caching is not always the answer; often the answer is better design. GitHub on Counting Objects: What you want to do instead is caching intermediate steps of the computation, to be able to efficiently answer any kind of query. In this case, we were looking for a system that would not only allow us to serve clones efficiently, but also complex fetches. Caching responses cannot accomplish this.

  • Talk about fingerprinting! Wherever You Go, Your Personal Cloud Of Microbes Follows. Soon we'll see DNA sequencing ASICs on devices as a way of enriching contextual information.

  • Service-Oriented Architecture: Scaling Our [Uber] Codebase As We Grow. Uber has gone from the monolith to a microservice architecture. They like the ability to use different languages and reduced coupling. Apache Thrift is used for cross language messaging. They like the safety of the IDL binding services to use strict contracts. The goal for the remainder of 2015 is to get rid of this repo entirely, promoting clear ownership, offering better organizational scalability, and providing more resilience and fault tolerance through our commitment to microservices.

  • Azul works for High Frequency Trading says Aldo Garcia: We use Azul in equities, FX, FICC, across at least 20 applications...I work with apps that have 128GB heaps and sub-millisecond GC pauses. That's as close to magic as anything I've seen. It's an awesome JVM.

  • For software to take the next step it must learn to self-assemble. self-assembly, kinetic networks: Amino acids in the body assemble themselves into proteins, and capsids, the protective shells surrounding viruses, build themselves out of proteins. Biology is full of examples of this kind of thing.

  • Is CAP all that it is capped up to be? Or is delay-sensitivity a better way to think about consistency issues? A Critique of the CAP Theorem: CAP is often interpreted as proof that eventually consistent databases have better availability properties than strongly consistent databases; although there is some truth in this, we show that more careful reasoning is required. These problems cast doubt on the utility of CAP as a tool for reasoning about trade-offs in practical systems. As alternative to CAP, we propose a "delay-sensitivity" framework, which analyzes the sensitivity of operation latency to network delay, and which may help practitioners reason about the trade-offs between consistency guarantees and tolerance of network faults.

  • Silicon Valley is Migrating North: the top 50 startups worldwide likely to become the next “unicorns” (billion dollar valuation). Impressively, not only are half of them in the Bay Area, a third are in San Francisco by itself. (The rest: New York City 8, China 4, E.U. 3, Boston 2, Chicago 2, India 2, Southern California 2, Arlington 1, and Cape Town 1.)

  • Here's how AdRoll built Petabyte-Scale Data Pipelines with Docker, Luigi and Elastic Spot Instances: we use AWS Spot Instances and Auto-Scaling Groups to provide computing resources on a demand basis. Data is stored in AWS Simple Storage Service (S3)...We orchestrate a complex graph of interdependent batch jobs using Luigi...each individual task (batch job) is packaged as a Docker container...The Docker containers encapsulate jobs written in seven different programming languages. Luigi is used to orchestrate a tightly connected graph of about 50 of these jobs, and Quentin and Auto-Scaling Groups allow us to execute the jobs on an elastic fleet of hundreds of the largest EC2 spot instances in a very cost-effective manner...The main benefit of embracing this heterogeneous, bazaar-like approach is that we can safely use the most suitable language, instance type, and distribution pattern for each task.
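
The core idea of orchestrating a graph of interdependent batch jobs can be sketched with nothing but the Python standard library. This is not Luigi's API and not AdRoll's pipeline; the job names are hypothetical, and in their setup each job would be a Docker container rather than a string.

```python
from graphlib import TopologicalSorter  # Python 3.9+

# Hypothetical job graph: each job maps to the set of jobs it depends on.
JOBS = {
    "extract_logs": set(),
    "sessionize": {"extract_logs"},
    "attribute": {"extract_logs"},
    "report": {"sessionize", "attribute"},
}


def run_order(graph):
    """Return batches of jobs in dependency order.

    Jobs inside one batch have all dependencies satisfied, so a scheduler
    could hand them to separate machines at the same time.
    """
    sorter = TopologicalSorter(graph)
    sorter.prepare()  # raises graphlib.CycleError if the graph has a cycle
    batches = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())
        batches.append(ready)
        sorter.done(*ready)  # pretend the whole batch finished successfully
    return batches


for batch in run_order(JOBS):
    print(batch)
```

A real orchestrator adds what this sketch leaves out: retries, checking whether a job's output already exists, and launching work on remote instances.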

  • gcc.godbolt.org lets you interactively input C/C++ code and see how it turns into assembly. Very cool.

  • If you are looking for cheaper cloud storage, backup wizards Backblaze are now offering a cloud service. Backblaze B2: The World’s Lowest Cost Cloud Storage. Their API is not S3 compatible. They only operate out of one datacenter. It uses 17+3 Reed Solomon across 20 computers in 20 different locations in their datacenter. And there's no scalable front-end load balancing (from jbeda). So it's perhaps more suitable for backup than mission critical storage (from zdrummond). But it only costs ½ a penny a month per gigabyte. And uploads are free. That's 1/4th the price of S3. It's cheaper than AWS Glacier.
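
To make the 17+3 Reed-Solomon figure concrete, here is a small sketch of the arithmetic behind such a layout: a file is split into 17 data shards plus 3 parity shards, any 17 of the 20 are enough to rebuild it, and the raw storage used is 20/17 of the file size. The function names are illustrative only, and the price default is simply the half-penny figure quoted above.

```python
def erasure_profile(data_shards, parity_shards):
    """Summarize an erasure-coded layout with the given shard counts."""
    total = data_shards + parity_shards
    return {
        "total_shards": total,
        # Reed-Solomon can rebuild the data from any `data_shards` shards,
        # so up to `parity_shards` shards may be lost.
        "tolerated_losses": parity_shards,
        # Raw bytes stored per byte of user data.
        "storage_overhead": total / data_shards,
    }


def monthly_cost(gigabytes, price_per_gb=0.005):
    """Monthly storage cost in dollars at half a penny per gigabyte."""
    return gigabytes * price_per_gb


profile = erasure_profile(17, 3)
print(profile["total_shards"], profile["tolerated_losses"])
print(f"overhead: {profile['storage_overhead']:.3f}x")
print(f"1,000 GB for a month: ${monthly_cost(1000):.2f}")
```

For comparison, keeping three full copies of every file tolerates two losses at 3x overhead, which is why erasure coding is attractive for low-cost storage.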

  • Should managers code? Alon Halevy with advice learned from A Decade at Google: No matter how your role evolves, try to never stop coding (or if you cannot code, get involved in code reviews). You don’t necessarily need to be writing code that is on the critical path of your team (in fact, you probably shouldn’t) and you don’t have to be coding quarter after quarter, but you should try to do some coding on a regular basis.

  • How we Built an Infinitely Scalable Malware Detector: We use stateless API servers persisting data to an Aerospike data store. Individual stateless scanners retrieve items from the Aerospike queue and process them. The API servers, Aerospike database, and malware scanners can all be simply scaled by adding nodes as necessary...Everything is in AWS, we proxy the API through Mashape...NodeJS for the scanners and API implementation...ClamAV for the scanner...Amazon ELB for the load balancer.

  • Can you develop for 3 platforms in 4 Weeks with 3 developers? Apparently you can. Here's how: Rapid Cross-OS Mobile App Development: Lessons Learned. Yes, picking a relatively simple problem is helpful, but the agile process used might be inspirational for your project.

  • Here's the architecture for Open MPI. Most notable is the plug-in architecture. Also, lots of good lessons are discussed.

  • Three Mistakes in Scaling Non-Relational Databases: At first everyone tries to scale things up instead of out.  Sadly that almost always stops working at some point; Pick the right tool; Write and choose good code.

  • The morning paper is running a great series of papers on nature-inspired algorithms. Do you like bats? Birds? Ants? Fireflies? Feral animals? Herons? Swamp creatures? Cuckoos?

  • On Cassandra. Benedict Elliott Smith: Certainly local node performance is important, but that's very different to single node, since cluster behaviour is very different when replication is involved. However, either way, I simply don't encounter anyone even pushing 100k/s/node, except simple benchmarks we've run ourselves. Most workloads involve more complex data models, where these inefficiencies simply do not have a chance to exhibit. As such we tend to focus on those other areas. We will no doubt reach a point where hardware and improvements in those areas shift the common bottleneck to where this benchmark is, but that's not typical at the moment, from the data I am exposed to.

  • Good design requires counterfactual thinking. It doesn't come naturally. Here's a class: Edge Master Class 2015: A Short Course in Superforecasting, Class IV

  • Looks like an interesting book. Seven Concurrency Models In Seven Weeks: You’ll learn about seven concurrency models: threads and locks, functional programming, separating identity and state, actors, sequential processes, data parallelism, and the lambda architecture.

  • Why Twitter Dumped Storm for Homegrown Real-Time Engine: Twitter is beginning to outgrow the performance bounds of Storm and has developed a new approach to real-time stream processing via Heron...they are in production with Heron, they are processing hundreds of topologies, billions of messages, and all in the hundreds of terabytes range. This has led to a 3X reduction in overall resource utilization, he notes, while capturing far greater performance overall.

  • Stan: based on a probabilistic programming language for specifying models in terms of probability distributions. Stan’s modeling language is portable across all interfaces (PyStan, RStan, CmdStan).

  • Scylla: a new approach to NoSQL data store design, optimized for modern hardware. Scylla runs multiple engines, one per core, each with its own memory, CPU and multi-queue NIC. We can easily reach 1 million CQL operations on a single commodity server. In addition, Scylla targets consistent low latency, under 1 ms, for inserts, deletes, and reads.

  • ReactiveSocket: an application protocol providing Reactive Streams semantics over an asynchronous, binary boundary. It enables symmetric interaction models via async message passing over a single connection.

  • Dotted Version Vector Sets: are similar to Version Vectors (Vector Clocks for some), but prevent false conflicts that can occur with Version Vectors. It also has a more complete API and is better suited to distributed databases with a get/put interface.
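
As background, here is a minimal sketch of how plain version vectors are compared, which is the mechanism Dotted Version Vector Sets build on. It is not the DVVSet API; the function names and node ids are assumptions made for illustration.

```python
def descends(a, b):
    """True if version vector `a` has seen every event that `b` has."""
    return all(a.get(node, 0) >= count for node, count in b.items())


def compare(a, b):
    """Classify two version vectors (dicts of node id -> counter)."""
    a_ge_b = descends(a, b)
    b_ge_a = descends(b, a)
    if a_ge_b and b_ge_a:
        return "equal"
    if a_ge_b:
        return "a_newer"
    if b_ge_a:
        return "b_newer"
    # Neither side has seen everything the other has: a conflict
    # the store must keep as siblings or resolve.
    return "concurrent"


v1 = {"n1": 2, "n2": 1}
v2 = {"n1": 1, "n2": 1}
v3 = {"n1": 1, "n2": 2}
print(compare(v1, v2))
print(compare(v1, v3))
```

The "concurrent" branch is where plain version vectors can report conflicts that are not real ones, which is the case dotted version vectors are designed to avoid.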

  • @neil_conway: Another provocative statement in Google's Dataflow paper (http://www.vldb.org/pvldb/vol8/p1792-Akidau.pdf): We as a field must stop trying to groom unbounded datasets into finite pools of information that eventually become complete, and instead live and breathe under the assumption that we will never know if or when we have seen all of our data, only that new data will arrive, old data may be retracted, and the only way to make this problem tractable is via principled abstractions that allow the practitioner the choice of appropriate tradeoffs along the axes of interest: correctness, latency, and cost.

  • FiloDB: designed to ingest streaming data of various types, including machine, event, and time-series data, and run very fast analytical queries over them. In four-letter acronyms, it is an OLAP solution, not OLTP.

  • Complex behaviors can arise from simple brains: Despite the complexity of walking in a particular way with six legs, it took a brain with the equivalent of just 65 neurons to tackle that goal. The researchers estimate that the traditional approach to programming, which does not capitalize on the mechanics of the brain-body system, would have required a whopping 1014 neurons to effect the same walk.