Stuff The Internet Says On Scalability For January 30th, 2015

Hey, it's HighScalability time:


It's a strange world...exotic, gigantic molecules Fit Inside Each Other like Russian nesting dolls

  • 1.39 billion: Facebook Monthly Active Users; $18 billion profit: Apple in 3 months; 200 million: Kik users; 11.2 billion: age of the oldest known solar system; 3 billion: videos viewed per day on Facebook
  • Quotable Quotes:
    • @kevinroose: This dude wins SF bingo. RT @caro: An Uber driver is Airbnb'ing the trunk of his Tesla for $85/night.
    • @BenedictEvans: Only 16% of Facebook DAUs aren't using it on mobile
    • @rezendi: Yo's Law: "in the 21st century tech industry, satire and reality are not merely indistinguishable but actually interchangeable."
    • Brent Ozar: I recommend that people back up data, not servers.
    • @AnnaPawlicka: "Shared State is the Root of All Evil"
    • Peter Lawrey: micro-day - about 1/12 of a second. micro-century - 51.3 minutes. femto-parsec - about 30 metres.
    • TapirLiu: OH: docker is like a condom to protect your computer from Node.
    • @DigitCurator: "The Next Decade In Storage": Resistive RAM promises better scaling, efficiency, and 1000x endurance of flash memory 
    • @BenedictEvans: At the end of 2014 Apple had ~650-675m live iOS devices. With zero unit sales growth, 700-720m by end 2015. Consumer PCs in use - 7-800m
    • @MailChimp: We sent 14.1 billion emails in December, including 741 million on Cyber Monday.
    • @mjpt777:  That's in the past. We can now do 20 million per second :-) per stream.
    • @bradwilson: Conclusions: 1. Ethernet over power does not perform as well as WiFi (??) 2. Ethernet over power hates being shared among multiple PCs
    • @mjpt777: Specialized Evolution of the General-Purpose CPU  - note that performance per watt is approx doubling per generation. 
    • @nighitingale: "The Earth is 4.6 billion years old. Scaling to 46 years, humans have been here 4 hours, the industrial..."
    • Joseph Campbell: The hero’s journey always begins with the call. One way or another, a guide must come to say, “Look, you’re in Sleepy Land. Wake. Come on a trip."
    • Frank Herbert: the most persistent principles of the universe were accident and error.

  • Will Facebook ever figure out this mobile thing? Not long ago that was the big question. We have an answer. In the fourth quarter, the percentage of its advertising revenue from mobile devices increased to 69%, up from 66% in the third quarter and 53% a year earlier. Mobile daily active users were 745 million on average for December 2014, an increase of 34 percent year-over-year.

  • The power of smart: Facebook’s Powerful Ad Tools Grew Its Revenue 25X Faster Than User Count. Facebook might be running out of people, but they aren't running out of ways of monetizing those people. Math grows faster than users.

  • The Cathedral of Computation by Ian Bogost. Agree in part. There does seem to be an uncritical acceptance of algorithms, as if because they enliven machines they are somehow pure and objective, when the opposite is the case. Algorithms are made for human purposes by teams of humans and show the biases and hubris of their makers. And like all creatures, algorithms should be subject to skepticism, law, and review.

  • We have many long running debates in tech. Server side vs client side rendering is just one of them. A thoughtful analysis: Tradeoffs in server side and client side rendering by Malte Ubl. Brett Slatkin boldly claims: Experimentally verified: "Why client-side templating is wrong". He concludes: I hope never to render anything server-side ever again. I feel more comfortable in making that choice than ever thanks to all this data. I see rare occasions when server-side rendering could make sense for performance, but I don't expect to encounter many of those situations in the future.

  • Maybe one day we will all live in the eye of Project Maelstrom: The Maelstrom browser can access conventional websites. But it can also be used to publish and browse websites that don’t reside on any particular server, known as torrent Web pages. To access a site published in that way, the browser grabs the data from the browsers of people who are already viewing it or have visited the site recently.

  • Lots of very detailed posts from the RavenDB Performance team report are up on Ayende @ Rahien. The spirit of the thing: TLDR: Optimizing at this level is really hard. To achieve gains of 20%+ for Compare and from 200% to 6% in Copy (depending on the workload) we will need to dig very deep at the IL level.

  • All becomes clear now, Google really wrote Inbox for themselves! Volatile and Decentralized: Day in the Life of a Google Manager: 9:00am - Catch up on email. This is a continuous struggle and a constant drain on my attention, but lately I've been using Inbox which has helped me to stay afloat. Barely. 

  • Shmoocon (an annual east coast hacker convention) 2015 Videos are now available.  If security is your thing it looks like there's lots of good content waiting for you. Keep it secret. Keep it safe.

  • Databases at Scale Part Three: The Reality of Transactional Apps on The New Stack. Covers lots of different databases and discusses important features.

  • All connections are slow some of the time. Ilya Grigorik asks what are you going to do about it? Plan for variability. Define an acceptable SLA for each network request. Make failure the norm, instead of an exception. 
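
One way to read "define an acceptable SLA for each network request" is as a per-call deadline with a planned fallback. The sketch below is only an illustration of that idea, not code from Ilya Grigorik's post: the helper name `fetch_with_deadline`, the demo functions, and the "cached" fallback value are all hypothetical.

```python
import concurrent.futures
import time


def fetch_with_deadline(fetch, deadline_s, fallback, executor):
    """Run fetch() on the executor; return fallback if it is late or fails.

    Hypothetical helper: the deadline is the per-request SLA, and the
    fallback is whatever the caller can live with (cached data, a default).
    """
    future = executor.submit(fetch)
    try:
        return future.result(timeout=deadline_s)
    except concurrent.futures.TimeoutError:
        # Too slow for this request's budget. cancel() only helps if the
        # call has not started yet; a running call finishes in the background.
        future.cancel()
        return fallback
    except Exception:
        # Failure is treated as a normal outcome, not an exceptional one.
        return fallback


def slow_fetch():
    """Stand-in for a network call that is slow this time."""
    time.sleep(0.3)
    return "fresh"


def failing_fetch():
    """Stand-in for a network call that fails outright."""
    raise ConnectionError("network unreachable")
```

Used with a small thread pool, a call that misses its budget degrades to the fallback instead of stalling the caller: `fetch_with_deadline(slow_fetch, 0.05, "cached", pool)` returns `"cached"`, while a fast call returns its own result.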

  • Cool effect. I think I see some hydrogen. A real-time latency spectrogram of AWS storage accesses. It's not just about pretty pictures: The moral of this story is: pick more expensive storage options only if you’re sure you need them, and take into account latency outliers that can sometimes slow your own application down for a very long time.
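
To make the point about latency outliers concrete, here is a minimal sketch that compares the median with the tail of a set of latency samples. The numbers are invented for illustration and are not taken from the linked measurements; `percentile` is a plain nearest-rank implementation written for this example.

```python
import math


def percentile(samples, p):
    """Nearest-rank percentile: the smallest sample with at least p percent
    of all samples at or below it."""
    if not samples:
        raise ValueError("no samples")
    ordered = sorted(samples)
    rank = max(1, math.ceil(p * len(ordered) / 100))
    return ordered[rank - 1]


# Made-up latencies in milliseconds: mostly fast, with a few slow outliers.
latencies_ms = [12] * 97 + [15, 40, 900]

p50 = percentile(latencies_ms, 50)    # typical request
p99 = percentile(latencies_ms, 99)    # tail latency
worst = percentile(latencies_ms, 100) # the slowest request seen
```

With these invented samples the median is 12 ms, the 99th percentile is 40 ms, and the worst case is 900 ms, which is why a median or an average alone hides the requests that hurt.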

  • Dan Rayburn on unintended consequences. iPhone 6 Display Making Image Delivery Harder For CDNs, Forcing Shift To Responsive Web Design: Ultimately, CDNs will either need to figure out a way to deal with delivering much bigger images efficiently over the last mile or they will struggle with serious performance issues on the iPhone 6 and iPhone 6+ and other new devices coming to market. Their alternative will be deploying more capacity on the edge of the network, which is by and large not a cost-effective strategy.

  • So you didn't walk 100 miles at CES. Well, Chetan Sharma did and he wrote up a really nice experience report. Big themes: Connected intelligence, IoT, wearables, robots, autonomous cars, 3D printing, connected homes, health care, and more!

  • Netflix's Viewing Data: How We Know Where You Are in House of Cards. Netflix details how they went from a stateful, consistency-centric design that experienced outages to an availability-centric, microservice-rich design painted in many database colors.

  • Akka says what the heck, go ahead and cross the streams!

  • Swimming with the sharks can pay off. rubyrescue: A former client was on Shark Tank. The pitch in-studio was 2.5 hours and of course only 7 minutes was aired. The interest it generated (in terms of downloads) was enormous. It nearly filled the Heroku database during the show - which we quickly upgraded after it aired.

  • A worthy goal and interesting means. Nymote: The mission of Nymote is to enable the creation of resilient decentralised systems that incorporate privacy from the ground up, so that users retain control of their networks and data. To achieve this, we reconsider all the old assumptions about how software is created in light of the problems of the modern, networked environment.

  • A little hit of database. The Morning Paper takes a look at Architecture of a Database System – Hellerstein, Stonebraker & Hamilton, 2007.

  • Looks interesting. Orleans - Distributed Actor Model: Orleans is a framework that provides a straight-forward approach to building distributed high-scale computing applications...Orleans has been used extensively running in Microsoft Azure by several Microsoft product groups, most notably by 343 Industries as a platform for all of Halo 4 cloud services

  • In the never ending browser client wars Netflix is adopting Facebook's React, moving away from a hand rolled system. They liked its startup speed, runtime performance, modularity, JavaScript that runs on the server and the client, its use of a virtual DOM, and its support for component composition.

  • A great Data Science AMA on Reddit

  • Interesting look at something I've never considered before: Signing Software at Scale.  Run your own signing servers (plural). Collect passphrases at startup. Don't let just any machine request signed files. Use input redirection and other tricks to work around unfriendly command line tools.  Sign everything you can on Linux (including Windows binaries!). 

  • Nice concise introduction to Column-oriented database: Introduction (Part 1).

  • Lessons Learned – Benchmarking NoSQL on the AWS Cloud (AerospikeDB and Redis). The winner at scale? Not Redis. 

  • Amazon in the enterprise market doesn't do quite as well. ecaron: After trying for 3 weeks to make AWS WorkSpaces work for my company, I can still confidently say that Amazon doesn't get it. Their solutions are more "Go to Home Depot and get the parts to make a desk", whereas Google Apps is "go to Ikea". The two solutions are neighbors, but not everyone is ready to buy the individual boards and cut them down to size.

  • In case you need to do some learnin': Building and deploying large-scale machine learning pipelines.

  • Edward Capriolo has made it to part 9 of his amazing series of posts: Request Routing - Building a NoSQL Store.

  • Irmin - a distributed database that follows the same design principles as Git.

  • Here's one version of the future. Building Massively-Scalable Distributed Systems using Go and Mesos. Good explanation with code examples.

  • NSQ: a realtime distributed messaging platform designed to operate at scale, handling billions of messages per day.

  • Genesis: Tumblr's tool for data center automation.

  • Apache BookKeeper shares details on their replication scheme. An excellent info source if you are working on something similar.

  • When you need to access data from all your data sources. Building Out the SeatGeek Data Pipeline using Looker, Redshift, and Luigi.

  • Why do human societies tend to organize in structures where 80 people can own half the world's wealth? I was asked this and since it relates to the deep working of things, here's my stab at an answer: The driving force for why so few control so much is a want for MORE that organizes structure through access to opportunity. Like how raindrops at the top of a mountain organize into rivers as they flow down to the sea. The sea is the people who attract resources because they occupy the sink/well/basin of attraction position in a kind of resource/energy/opportunity/gravity gradient. Rain is all the little opportunities that must flow down a mountain/gradient by their very nature. Rivers are the flowing channels that form as opportunities accumulate on their way to greet the sea. Money is energy and flows of energy organize structure. Energy flows always cause structures to self-organize in such a way as to maximize flow to the sea. The small number of people who control the majority of wealth are both created by the flow and help keep the flow moving. The people change, but the flow is forever.