Stuff The Internet Says On Scalability For September 13, 2013

Hey, it's HighScalability time (this week is a fall harvest basket overflowing with good nutritious wisdom):


(Voyager, 36 yrs & 11B miles out, has reached interstellar space. So it begins...)

  • 170 million: metrics Twitter collects every minute; 350 million: Snapchat daily photo shares
  • Quotable Quotes:
    • @blowmage: OH: “Guys, databases don't know how to sort things. That's why NoSQL uses JavaScript.”
    • Nokia insider: I look back and I think Nokia was just a very big company that started to maintain its position more than innovate for new opportunities.
    • Paulo Siqueira: Ignoring scalability is not as bad as it sounds—if you use the proper tools.
    • David Rosenthal: The relationship between diversity and risk is very complex.
    • Jaime Teevan: the exact same result list will seem more relevant to you if it is returned just a fraction of a second faster.
    • @aphyr: I use Redis as a queue #leaveDBalone

  • Hey, I've always thought this about security: it's just too damn complex. But I always figured much smarter people than me understood it all, so it must be OK. Now I know that's not true. Nobody really understands security. NIST's Ridiculous Non-Response Response To Revelation That NSA Controlled Crypto Standards Process: "the NSA made sure that the standards were so complicated that no one could actually vet the security."

  • Observability at Twitter. Great post on how Twitter captures, stores, queries, visualizes, and automates monitoring data to enable debugging of hundreds of distributed systems across multiple datacenters. Very detailed and helpful. 170 million individual metrics are collected every minute and are stored in and queried from a time series database developed at Twitter. Queries are written in a declarative, functional-inspired language. Visualization use cases include hundreds of charts per dashboard and thousands of data points per chart. The monitoring system lets users define alert conditions and notifications in the same query language they use for ad hoc queries and dashboards. Note how everything at this level of game play is a custom job.
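
    The post doesn't show Twitter's actual query language, but here's a toy Python sketch of the core idea (all names here are invented): one declarative, functional expression that can back an ad hoc query, a dashboard chart, and an alert condition alike.

    ```python
    # A toy illustration (not Twitter's actual language): the same functional
    # expression drives an ad hoc query, a dashboard chart, and an alert.

    def rate(points):
        """Per-interval deltas of a counter series [(t, value), ...]."""
        return [(t2, v2 - v1) for (t1, v1), (t2, v2) in zip(points, points[1:])]

    def moving_avg(points, n):
        """Average of the last n samples at each step."""
        return [(t, sum(v for _, v in points[max(0, i - n + 1):i + 1]) / min(i + 1, n))
                for i, (t, _) in enumerate(points)]

    def query(points):
        return moving_avg(rate(points), n=5)

    series = [(t, 100 * t) for t in range(10)]     # fake counter data
    print(query(series))                           # chart: plot this
    print(any(v > 150 for _, v in query(series)))  # alert: same expression plus a threshold
    ```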

  • This one will make you cry...Demand for discounted onions crashes Groupon's Indian website. "We wanted to sell it at a price that most of us have completely forgotten." Which is great, but make sure to add more layers to your website first. Onions are pretty easy to grow, even in pots, so that might be an option.

  • Fraud alert! IOPS Are A Scam says Jeremiah Peschka. Contextless metrics are useless. Measurements are usually made with block sizes that are too small. What about latency? What about throughput? You want deterministic latency and throughput. Is that what you are getting?
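
    A minimal sketch of the point: the same device reports wildly different "IOPS" depending on block size, which is why the number is meaningless without latency, throughput, and workload context. (A real benchmark should bypass the page cache, e.g. with O_DIRECT or a tool like fio; file path and sizes are arbitrary.)

    ```python
    # Measure random-read latency and throughput at several block sizes.
    import os, time, random

    PATH = "testfile.bin"            # hypothetical test file
    FILE_SIZE = 64 * 1024 * 1024     # 64 MB

    with open(PATH, "wb") as f:      # create the test file once
        f.write(os.urandom(FILE_SIZE))

    for bs in (4 * 1024, 64 * 1024, 1024 * 1024):
        fd = os.open(PATH, os.O_RDONLY)
        latencies = []
        for _ in range(1000):
            offset = random.randrange(0, FILE_SIZE - bs)
            start = time.perf_counter()
            os.pread(fd, bs, offset)
            latencies.append(time.perf_counter() - start)
        os.close(fd)
        total = sum(latencies)
        iops = len(latencies) / total
        mbps = iops * bs / (1024 * 1024)
        # Same disk, very different "IOPS" depending on block size.
        print(f"bs={bs:>7}  iops={iops:8.0f}  throughput={mbps:7.1f} MB/s  "
              f"avg latency={total / len(latencies) * 1e6:6.1f} us")
    ```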

  • Everything you do is implicitly a nudge. So why not nudge with some awareness? How Mobile Experiences Can Shape Our Perception Through Illusions Of Speed: The use or absence of spinners or status bars can have a big impact on how fast things seem to be proceeding. Removing or playing down the spinners improved users' perception of speed.

  • Should you try for moral perfection or just get on with your life and do the best you can? Philosophy is a very long discussion on this subject. In software, for at least a small class of applications, the consensus is "SHIPPING BEATS PERFECTION"

  • Software is the RNA of biology. Peter M. Hoffmann: RNA is an information carrier, like DNA, but because of RNA’s greater flexibility, it can assume complex three-dimensional shapes, which can catalyze reactions as proteins do. For this reason, many people believe that RNA came first— before DNA and proteins. This makes the ribosome, an RNA-based machine, especially interesting. Every living being on earth has ribosomes, and they all contain RNA. Every human cell contains about 100 million ribosomes, and tiny bacteria, like E. coli, contain about 10,000 of them.

  • Blast chiller level of a cool idea, BigData applied to protocols. TCP ex Machina: Computer-Generated Congestion Control: Congestion-control schemes each embody implicit assumptions and preconceptions about the network, but we don't quite know what they are. And the teleology of TCP — the end goal that these algorithms try to achieve — is unknown in general, when connections start and stop over time. Remy creates an algorithm to run at the endpoints — possibly a complex one, but with simple emergent behavior: achieving the goal as best as possible on networks described by the stated assumptions. < Incredibly well written. Great example of how to write something other people are supposed to understand.
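
    To give a flavor of what Remy produces (this is not the real RemyCC, just a toy in its spirit): instead of a hand-written law like AIMD, the endpoint consults a machine-generated rule table mapping observed congestion signals to window actions. The signals, thresholds, and actions below are all invented for illustration.

    ```python
    # Toy rule-table congestion controller, Remy-flavored.
    # (ack interarrival EWMA in ms, RTT ratio to min RTT) -> (multiple, increment)
    RULES = [
        ((0, 5),   (0.0, 2.0),  (1.0, +3)),   # acks arriving fast, low delay: grow
        ((0, 5),   (2.0, 99.0), (0.8, +0)),   # fast acks but delay building: back off
        ((5, 999), (0.0, 99.0), (0.5, +1)),   # acks slowing down: cut hard
    ]

    def next_window(cwnd, ack_ewma_ms, rtt_ratio):
        for (lo, hi), (rlo, rhi), (mult, inc) in RULES:
            if lo <= ack_ewma_ms < hi and rlo <= rtt_ratio < rhi:
                return max(1, cwnd * mult + inc)
        return cwnd  # no matching rule: hold steady

    print(next_window(cwnd=20, ack_ewma_ms=2.0, rtt_ratio=1.1))  # -> 23.0
    ```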

  • In the "every problem can be solved with another level of indirection" line of thought there's Racket: Metaprogramming Time! Though I don't think this is quite enough to escape the ontological tar pit.

  • Scaling buddycloud with Fanout.io: To scale up to really large numbers of connections it becomes necessary to distribute the connection pooling across multiple servers. Route the pushes through the Fanout CDN. Fanout uses a concept called “GRIP” proxying.
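
    A minimal sketch of the GRIP idea, assuming GRIP's Grip-Hold and Grip-Channel headers: the backend answers instantly and instructs the proxy to hold the client's connection open, subscribed to a channel, so the backend itself holds zero long-lived connections. The channel name and port are hypothetical.

    ```python
    from wsgiref.simple_server import make_server

    def app(environ, start_response):
        start_response("200 OK", [
            ("Content-Type", "text/plain"),
            ("Grip-Hold", "stream"),        # proxy: keep this connection open
            ("Grip-Channel", "room-42"),    # proxy: subscribe it to this channel
        ])
        return [b"[stream open]\n"]

    make_server("", 8080, app).serve_forever()
    ```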

  • DevOps Cafe Episode 41. How Salesforce went from traditional enterprise IT to an agile organization. The challenge is delivering transactional enterprise software at webscale. The key idea is to not have internal projects; instead you have services that last for the life of the organization and are continually updated. The most interesting problem: how do you take an entrenched, dysfunctional organization and change its direction? Should you split it in two? Make an old and a new? Struck me as similar to Microsoft's problem of whether or not to become two companies. This problem of how to change a horse into a race car while you are riding it is why nature settled on extinction as the best decider mechanism.

  • How We Did It: Millions of Daily Pageviews, 60% Less Server Resources. Nothing dramatic. Changes anyone can make. Move from Apache to Nginx. Nginx loads all config files in at launch. PHP-FPM instead of mod_php. Added an opcode cache to PHP. Use siege and ab to test traffic spikes. Under test "the Nginx instance performed 100% more transactions, with 50% less response time per transaction." Set up auto-scaling to spin up instances under load. They also had a clever tactic for a hitless switchover from Apache to Nginx. The result: "We have now been serving millions of pageviews and API calls off of two Nginx instances with 0% downtime for a solid three weeks now. Sorry Apache, there’s no looking back."

  • Scaling Play! to Thousands of Concurrent Requests. Using the right tools keeps you on track. Using Scala it's easy to cache immutable responses. "After caching, the throughput went up to 800 requests per second. That's an improvement of more than 4x for less than two lines of code." Play keeps session data in a browser cookie so server memory doesn't grow (which sucks in other ways). Akka is used for async processing, so logic is encapsulated in Actors and it's easy to run concurrently. One commenter suggests setting up a reverse proxy (Varnish/Squid) in front and forgetting about caching on the JVM.

  • Nitay Joffe, Data Infrastructure Engineer at Facebook, with a great slide deck on Scaling Apache Giraph. Giraph is based on Pregel, a BSP model with a think-like-a-vertex API; it's not slow, it's not Neo4j, and it's not a message passing system. Why not Hive? Hive uses too much disk, has limited caching, and each iteration is a MapReduce job.
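
    Here's a toy, single-process illustration of the think-like-a-vertex BSP model (PageRank is the classic example). It shows only the programming model, not Giraph's distributed execution.

    ```python
    # Each superstep, every vertex processes its messages, updates its
    # value, and sends messages along its edges.
    graph = {"a": ["b", "c"], "b": ["c"], "c": ["a"]}   # adjacency lists
    rank = {v: 1.0 / len(graph) for v in graph}

    def vertex_compute(v, messages):
        """PageRank vertex program: new value from incoming messages."""
        return 0.15 / len(graph) + 0.85 * sum(messages)

    for superstep in range(30):                          # BSP supersteps
        inbox = {v: [] for v in graph}
        for v, out_edges in graph.items():               # send phase
            share = rank[v] / len(out_edges)
            for dest in out_edges:
                inbox[dest].append(share)
        rank = {v: vertex_compute(v, inbox[v]) for v in graph}  # compute phase

    print(rank)  # converged PageRank values
    ```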

  • Scaling Buffer in 2013. Good overview with lots of details. Front-end is PHP, Backbone.js, Django. Looks like an AWS shop on the back-end; they really like Elastic Beanstalk. Commenters bring up the point that using m1.small servers is not efficient. API driven. SQS is used to process events. For the database they are happy with MongoHQ. They built out their own metrics tools.
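
    A sketch of the SQS consumption loop using the boto library of the day; the queue name, region, and handler are hypothetical.

    ```python
    import boto.sqs

    def handle_event(body):
        print("processing:", body)               # stand-in for real work

    conn = boto.sqs.connect_to_region("us-east-1")
    queue = conn.get_queue("buffer-events")       # hypothetical queue name

    while True:
        for msg in queue.get_messages(num_messages=10, wait_time_seconds=20):
            handle_event(msg.get_body())
            queue.delete_message(msg)             # ack only after success
    ```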

  • Making FiftyThree: Load Testing an Unexpected Journey. Great write-up on tuning a Heroku, Node.js, Neo4j, HAProxy, EC2 based service. Nagios and Datadog for alerting; Nodetime and New Relic for Node performance metrics; Loggly and Papertrail for logs; Siege and httperf for traffic generation; Blitz.io and Loader.io for distributed load testing. At the end of the process they were getting 10,000 reqs/sec with sub-100ms response times.
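
    For flavor, a small DIY load generator along the lines of siege/httperf: concurrent workers hammer a URL and report throughput and latency percentiles. URL and counts are placeholders; services like Blitz.io and Loader.io do this distributed, which matters once a single test box becomes the bottleneck.

    ```python
    import time, urllib.request
    from concurrent.futures import ThreadPoolExecutor

    URL, REQUESTS, CONCURRENCY = "http://localhost:8080/", 1000, 50

    def one_request(_):
        start = time.perf_counter()
        urllib.request.urlopen(URL).read()
        return time.perf_counter() - start

    start = time.perf_counter()
    with ThreadPoolExecutor(CONCURRENCY) as pool:
        latencies = sorted(pool.map(one_request, range(REQUESTS)))
    elapsed = time.perf_counter() - start

    print(f"{REQUESTS / elapsed:.0f} req/s, "
          f"p50={latencies[len(latencies) // 2] * 1000:.0f}ms, "
          f"p95={latencies[int(len(latencies) * 0.95)] * 1000:.0f}ms")
    ```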

  • Azure gets a new distributed cache service, schedule based auto-scaling, web server logging, and new operation log filtering options. Scott Guthrie with an excellent introduction.  

  • Spotify has open sourced sparkey, a key-value store: In uncompressed mode (which is the fastest mode) with 100 million entries we benchmarked it to having an insertion throughput of 1.2 million inserts per second and 500 thousand random lookups per second.
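
    Sparkey's own bindings aren't shown here; this generic harness (a plain dict stands in for the store) just illustrates how insert and random-lookup throughput numbers like those are measured.

    ```python
    import random, time

    N = 1_000_000
    store = {}

    start = time.perf_counter()
    for i in range(N):
        store[b"key%d" % i] = b"value%d" % i
    print(f"{N / (time.perf_counter() - start):,.0f} inserts/sec")

    keys = [b"key%d" % random.randrange(N) for _ in range(N)]
    start = time.perf_counter()
    for k in keys:
        store[k]
    print(f"{N / (time.perf_counter() - start):,.0f} random lookups/sec")
    ```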

  • Scaling with Queues. Wingify explains how they used a queue based pipeline to process a lot of web beacons. Ended up using Redis to store data quickly and then RabbitMQ as a data broker. Beacons are processed by OpenResty, Lua code writes to Redis, then an agent pulls data out of Redis and into RabbitMQ.
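
    A sketch of the agent step under stated assumptions: the OpenResty Lua code LPUSHes beacons onto a Redis list, and an agent like this one drains the list into RabbitMQ. Key, queue, and host names are hypothetical; it needs the redis and pika packages.

    ```python
    import redis, pika

    r = redis.Redis(host="localhost")
    channel = pika.BlockingConnection(pika.ConnectionParameters("localhost")).channel()
    channel.queue_declare(queue="beacons", durable=True)

    while True:
        _, beacon = r.blpop("beacon_queue")   # block until a beacon arrives
        channel.basic_publish(exchange="", routing_key="beacons", body=beacon)
    ```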

  • Luke Wroblewski with an excellent summary of a talk given by Tim Kadlec at Smashing Conf: Deliberate Performance. The general idea is that the web is suffering from a lack of focus on performance, which isn't respectful of people's data plans. There's lots more on the topic, as well as a section on performance tools.

  • ZooKeeper vs. Doozer vs. Etcd. Devo.ps found etcd was the best way to distribute cluster configuration information reliably. It was easy to deploy and secure, persisted data, and has good docs, but it's still a young project. Good thread on Hacker News.
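
    A sketch of distributing a config value through etcd's HTTP keys API (v2-era endpoint and default port assumed; adjust for your etcd version):

    ```python
    import urllib.request, urllib.parse, json

    BASE = "http://127.0.0.1:4001/v2/keys"

    # Publish a config value from one node...
    req = urllib.request.Request(
        BASE + "/db/master",
        data=urllib.parse.urlencode({"value": "10.0.0.5"}).encode(),
        method="PUT")
    urllib.request.urlopen(req)

    # ...and read it from any other node in the cluster.
    node = json.load(urllib.request.urlopen(BASE + "/db/master"))
    print(node["node"]["value"])   # -> 10.0.0.5
    ```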

  • Custom hardware is how you create something new and different. Steve Cheney: One of the biggest—if not the biggest—advantages Apple has in not being reliant on merchant silicon (they don’t buy standard application processors designed by others) is that they can customize the A7/A8 etc to exactly fit their own apps / services frameworks, without making generic design compromises.

  • Groupon has open sourced their messaging platform.

  • "Preservation at Scale" at iPRES2013: Who benefits from the economies of scale, the customer or the supplier? Our research shows that, at least in cloud services, Amazon does. We're not alone in this conclusion, Cade Metz in Wired reports on the many startups figuring this out, and Jack Clark at The Register reports that even cloud providers concede the point. Amazon's edge is that they price against their customer's marginal or peak cost, whereas they aggregate a large number of peaks into a steady base load that determines their costs. Peak pricing and base-load costs mean killer margins. 

  • Permanent Web Publishing: LOCKSS (Lots Of Copies Keep Stuff Safe) is a prototype of a system to preserve access to scientific journals published on the Web. It is a majority-voting fault-tolerant system that, unlike normal systems, has far more replicas than would be required just to survive the anticipated failures. We are exploring techniques that exploit the surplus of replicas to permit a much looser form of coordination between them than conventional fault-tolerant technology would require.
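
    A toy version of the core idea: many replicas vote on a document's hash, and a replica in the minority repairs itself from the majority. Real LOCKSS polling is far more careful than this; this is just the shape of it.

    ```python
    import hashlib
    from collections import Counter

    replicas = [b"the article", b"the article", b"the artXcle", b"the article"]

    def digest(content):
        return hashlib.sha256(content).hexdigest()

    votes = Counter(digest(c) for c in replicas)     # each replica votes
    winner, _ = votes.most_common(1)[0]              # majority hash wins
    good = next(c for c in replicas if digest(c) == winner)

    repaired = [c if digest(c) == winner else good for c in replicas]
    assert all(digest(c) == winner for c in repaired)  # damaged copy repaired
    ```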

  • Stronger Semantics for Low-Latency Geo-Replicated Storage: The primary contributions of this work are enabling scalable causal consistency for the complex column family data model, as well as novel, non-blocking algorithms for both read-only and write-only transactions.

  • F1: A Distributed SQL Database That Scales: F1 is a distributed relational database system built at Google to support the AdWords business. F1 is a hybrid database that combines high availability, the scalability of NoSQL systems like Bigtable, and the consistency and usability of traditional SQL databases. F1 is built on Spanner, which provides synchronous cross-datacenter replication and strong consistency.