Stuff The Internet Says On Scalability For September 20, 2013

Hey, it's HighScalability time:

  • $21 Billion: Google spend on datacenters
  • Quotable Quotes:
    • @lintool: Jeff Dean at XLDB: Largest Google Bigtable cluster: ~100s PB data; sustained: 30M ops/sec; 100+ GB/s I/O (gmail?)
    • @neil_conway: On a quick skim, the most surprising thing about the F1 paper is how conventional of an MPP database it is.
    • @Kufat: PSA: 1024 bytes = 1 KB. If someone says it's "1 KiB," they are a Cylon, replicant, or shapeshifter, and must be destroyed forthwith.
    • @mikeleeorg: "If you only do things where you know the answer in advance, your company goes away." - Jeff Bezos
    • @andybritcliffe: Keynotes are always good for stats. Amazon are deploying software updates every 16 seconds #awssummit #impressive
    • @adrianco: last March numbers on US fixed access traffic was Netflix 32.25% and Youtube 17.11%, only takes two to get to 49.36%
    • @cpurdy: OH: "Objective-C is the lack of a type system of C smushed together with the s****y performance of Smalltalk."

  • A Focus on Efficiency. A 70-page paper on how Internet.org will accomplish its goal of providing Internet access to the 5 billion people who don't currently have it. The paper is in two parts. The first part describes how Facebook will essentially become the platform supporting these new users. The idea is to create a 10x reduction in the underlying costs of delivering data, and a 10x reduction of data usage by apps. The strategies are basically taken out of Facebook's playbook, so the paper is an excellent guide to all Facebook has developed over the years. The second part of the paper is Qualcomm presenting an overview of their plan to expand global wireless capacity by 1000 times. It's an audacious gambit to be sure.

  • Another step in the Google vs Oracle divorce proceedings. Google Waves Goodbye To MySQL In Favor Of MariaDB. Using Golang over Java was another important step. It takes time for a supertanker to change direction, but it can be done.

  • If you like ideas made into pictures, then you'll like Do you know Cassandra? For a comic, the humor is so subtle I didn't get it, but it did talk about quorums and stuff.

  • Scaling MongoDB at Mailbox: The decrease in the percentage of time the write lock was held was far better than the linear (50%) improvement we had expected based on our MongoDB profiling.  

  • Greg geeks with another great set of Quick Links.

  • If you can know a person by the tools on their tool belt, then what can you learn about a company by the tools they use? A crash course in LinkedIn's global site operations. LinkedIn uses IRC for chat; BlueJeans for video conferencing; inFormed to see feeds of what's going on in the site; inGraphs to turn data into pictures; Change Request Tracker to see the status of all checked-in code, monitor builds, and deploy new code through our continuous deployment system; manifest app to help manage thousands of servers across multiple data centers; JIRA for issue tracking. Plus employees are encouraged to write tools to make life easier.

  • Ruby on Rails, still scaling at Envato after all these years. 70 million page requests on their Rails stack in one week while averaging a 117 millisecond response time. Recommends keeping your stack simple and focusing on back-end performance. RoR is also agile: last year they deployed their app 753 times. Keys are no Dev/Ops divide, developers are on-call for support, huge automated test suite, users kick up a fuss when things go wrong, and lots of monitoring.

  • These kinds of posts can be tedious, but there are a lot of great lessons in Postmortem of a Venture-backed Startup. Removing friction from existing user behaviors (e.g. checkins) almost always has a higher ROI than building castles in the sky; Growth is the only thing that matters if you are building a social network. Period; Events are for research, business development, and hiring; NOT for getting to 10,000,000 downloads; and many more.

  • Distributed systems for fun and profit. Quite a nice exploration of the topic by Mikito Takada. Distributed programming is about dealing with the implications of two consequences of distribution: that information travels at the speed of light; that independent things fail independently. There are five chapters: Basics; Up and down the level of abstraction; Time and order; Replication: preventing divergence; Replication: accepting divergence.

  • Apache Samza: LinkedIn's Real-time Stream Processing Framework:  Samza is LinkedIn's stream processing framework. It is now an incubator project with the Apache Software Foundation. Samza helps you build applications that process feeds of messages—update databases, compute counts and other aggregations, transform messages, and a lot more.

  • Managing multicore memory: Daniel Sanchez, an assistant professor in MIT’s Department of Electrical Engineering and Computer Science, believes that it’s time to turn cache management over to software. This week, at the International Conference on Parallel Architectures and Compilation Techniques, Sanchez and his student Nathan Beckmann presented a new system, dubbed Jigsaw, that monitors the computations being performed by a multicore chip and manages cache memory accordingly.

  • It's like magic. A Beginner's Guide to Perceived Performance: 4 Ways to Make Your Mobile Site Feel Like a Native App: it's not about how fast your site is; it's about how fast your users think it is. The four strategies are: Add Touch States to Your Buttons; Use Momentum Scrolling; Create Performant Animations; Take Advantage of Natural Gestures. Kyle Peatt with some great tips here.

  • Benchmarking Redis on AWS ElastiCache. Medium found that m1-medium, m1-large, and m2-xlarge instances are the best value, though a virtualized environment is not considered the best way to get top performance out of Redis.

  • On the Apple M7 Motion Processor. Steve Cheney on the beauty of dedicated low power processors to process sensor data in M7's case or for many functions in Moto X's case. "Having an always on co-processor that consumes a tiny amount of power is a trend in mobile (the Moto X does this with other tasks too). It’s better to put the big chip (A7) in deep sleep and keep the little one going." It's a very powerful model for applications as the complex computed data is always available, for little cost.

  • If you don't feel this way at times about every language you use, you aren't pushing it hard enough. I will never use Python in production ever again: This quote is from an engineer I respect very much. The reasoning behind it was they hit a few SyntaxError and AttributeError exceptions in their production jobs and had to do a whole new release to fix it, which took a lot of time. The same engineer followed that story by saying that Python is also slow, so to them, overall, Python totally isn't worth using ever again.

  • Database research has an uncanny way of eventually making its way into production. So paying attention to the XLDB CONFERENCE AT STANFORD – QUOTABLE QUOTES is like paying attention to the future.

  • Benchmark: RDS with Provisioned IOPS. Celingest came up with some advice you may find useful when using RDS. Provisioned IOPS are a must in RDS if you need consistent performance, with both latency and throughput held steady; RDS instances without Provisioned IOPS perform worse than a 1000 IOPS instance, even if they are sometimes capable of higher throughput during insert operations; databases that require frequent scan operations gain more from moving to an RDS instance with more CPU power than from raising IOPS.

  • Clear explanation of why Video isn’t breaking the internet: The problem is that video delivery on the web is fragmented. A Netflix video starts out in Amazon’s cloud but might be delivered via Level 3, Akamai or Cogent to an ISP’s network, or it may be cached at an ISP’s data center or on a Netflix Open Connect box. Once the movie stream is at the last mile, it must traverse an ISP before hitting your home, and what may be a flaky Wi-Fi network or merely a congested pipe to the house could cause further problems.

  • No Linux required. It's about performance. OSv, probably the best OS for cloud workloads!: designed from the ground up to execute a single application on top of a hypervisor, resulting in superior performance and effortless management. And just so you know, gazillions of very complex embedded systems have run as one program. It just takes something we like to call skill to make it work. In fact, it seems very RTOS-like. Except for the Java part. And it will only be available in 2015.

  • Feedly: allows you to build newsfeed and notification systems using Cassandra and/or Redis. Looks like a great start to a problem that is simple until it actually needs to scale. They also include a bunch of links on how different companies like Twitter and Etsy build their feed systems.

  • Rob Pike on why C++ programmers don't come to Go. A few commenters clearly make the point that C++ is used when performance and predictable resource usage are important.  Sometimes it's not all that complicated.

  • Boiler DB: A plugin-based, Redis-inspired, in-memory Key-Value Database: The idea here was to actually sort of extend Redis. People often ask "can Redis do X" and get the answer that their use case is not broad enough, or it can be done with Lua, or simply no. Then they ask what about plugins, and antirez is not inclined to add them anytime soon, probably rightfully so.

  • Beyond TrueTime: Using AugmentedTime for Improving Spanner: We propose the use of AugmentedTime (AT), which combines the best of TT-based wallclock ordering with causality-based ordering in asynchronous distributed systems. We show that the size of AT can be kept small and AT can be added to Spanner in a backward-compatible fashion, and as such, AT can be used in lieu of (or in addition to) TT in Spanner for timestamping and querying data efficiently.

  • Naiad: A Timely Dataflow System: Naiad is a distributed system for executing data parallel, cyclic dataflow programs. It offers the high throughput of batch processors, the low latency of stream processors, and the ability to perform iterative and incremental computations. Although existing systems offer some of these features, applications that require all three have relied on multiple platforms, at the expense of efficiency, maintainability, and simplicity. Naiad resolves the complexities of combining these features in one framework.

  • Consistency Without Borders: we agitate for the technical community to shift its attention to approaches that lie between the extremes of I/O-level and application-level consistency. We ground our discussion in early work in the area, including our own experiences building programmer tools and languages that help developers guarantee distributed consistency at the application level.

  • Jigsaw: Scalable Software-Defined Caches: a technique that jointly addresses the scalability and interference problems of shared caches. Hardware lets software define shares, collections of cache bank partitions that act as virtual caches, and map data to shares. Shares give software full control over both data placement and capacity allocation. Jigsaw implements efficient hardware support for share management, monitoring, and adaptation.
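
A footnote on the Python-in-production item above: a SyntaxError, at least, can be caught before a release goes out, because Python can parse a file without running it. What follows is only a minimal sketch of that idea, not anything from the linked post. The names `find_syntax_errors` and `demo` are made up for illustration. An AttributeError is a runtime problem that this kind of check cannot see, so it still needs tests or a linter.

```python
import tempfile
from pathlib import Path


def find_syntax_errors(root):
    """Return (file name, line, message) for each .py file under root that fails to parse.

    Hypothetical helper for illustration; it only detects SyntaxError,
    not runtime failures such as AttributeError.
    """
    problems = []
    for path in sorted(Path(root).rglob("*.py")):
        source = path.read_text(encoding="utf-8")
        try:
            # compile() parses the source into a code object without executing it.
            compile(source, str(path), "exec")
        except SyntaxError as err:
            problems.append((path.name, err.lineno, err.msg))
    return problems


def demo():
    """Create a throwaway directory with one valid and one broken file, then check it."""
    with tempfile.TemporaryDirectory() as tmp:
        Path(tmp, "good.py").write_text("x = 1\n", encoding="utf-8")
        Path(tmp, "bad.py").write_text("def broken(:\n    pass\n", encoding="utf-8")
        return find_syntax_errors(tmp)


if __name__ == "__main__":
    for name, line, msg in demo():
        print(f"{name}:{line}: {msg}")
```

Run as a build step, a non-empty result would be a reason to fail the build. The standard library's `python -m compileall <dir>` does a similar parse-everything pass from the command line.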