Stuff The Internet Says On Scalability For July 11th, 2014

Hey, it's HighScalability time:


Yesterday in history:  Nikola Tesla's Birthday, born in 1856. The greatest geek who ever lived?

  • 10Gbps: New world record broadband speed of 10 Gbps over copper.
  • Quotable Quotes:
    • @BenedictEvans: There were 40m internet users when Netscape IPOed. The time's not far off when a startup with 40m users will be too small to get funded.
    • Scott Aaronson: In any case, the question I asked myself about CLEVER/PageRank was not the one that, maybe in retrospect, I should have asked: namely, “how can I leverage the fact that I know the importance of this idea before most people do, in order to make millions of dollars?”
    • chub79: µservices aren't technological as much as they are cultural.
    • @Elmood: I thought of a new term when talking about code: "It's made from unmaintainium."
    • @lxt: Amazing how quickly a bunch of nines go up in smoke.
    • @martinrueKnock knock. Race condition. Who's there?

  • The Master Switch: The Rise and Fall of Information Empires by Tim Wu: History shows a typical progression of information technologies: from somebody’s hobby to somebody’s industry; from jury-rigged contraption to slick production marvel; from a freely accessible channel to one strictly controlled by a single corporation or cartel—from open to closed system. History also shows that whatever has been closed too long is ripe for ingenuity’s assault: in time a closed industry can be opened anew, giving way to all sorts of technical possibilities and expressive uses for the medium before the effort to close the system likewise begins again.

  • Tim Freeman indulges a well developed Technothantos Complex and comes up with a great big list of outage postmortems. You'll find the usual, outages from configuration issues, failover failures, quorumnesia, protocol flapping, bugs in not your stuff that causes bugs in your stuff, power outages, capacity problems, JPOBs (just plain old bugs), DDOS attacks, and good old operator error. 

  • Pinterest describes PinLater, An asynchronous job execution system. PinLater executes hundreds of different job types at a processing rate of over 100,000 per second. So you may say yet another async job system, but it's clear keeping such a critical part of their infrastructure in house makes sense. The article is a good explanation of a fairly standard approach. It used Thrift for the API, it's written in Java, Twitter’s Finagle is used for the RPC framework. MySQL is "used for relatively low throughput use cases and those that schedule jobs over long periods and thus can benefit from storing jobs on disk rather than purely in memory." Redis is "used for high throughput job queues that are normally drained in real time." Horizontal scaling is via sharding

  • In science class we did this one day, but I just couldn't do it. Dissecting Message Queues. Tyler Treat looks at both brokerless and brokered queues by looking a throughput benchmarks, latency benchmarks, and through qualitative analysis. No winner was declared, but if you are making a choice in this area it's well worth reading. 

  • 40 Million hits a day on WordPress using a $10 VPS. Sure, it's a static site, but still a good example of what can be done these days. Stack: Nginx + PHP-FPM (aka LEMP Stack) + Microcaching

  • Usually when you get acquired moving is part of the deal. Here Instagram talks about their experience of migrating out of the AWS datacenters and into Facebook's datacenters. The process took three weeks and required some networking magic to make happen. As you might expect EC2 specific assumptions didn't port over to Facebook. All Instagram-specific software was ported to run inside of a Linux Container (LXC) on the servers in Facebook’s data centers. Not an easy process, but it can be done. 

  • Build your own CDNish. How LinkedIn used PoPs and RUM to make dynamic content download 25% faster. PoPs are small scale data centers (points of presence). RUM is Real User Monitoring. Combining the two you can route users to the closest location for the requested content. They give examples of how they used PoPs in South Asea, Middle East, China, and Pacific RIM. Great stuff.

  • What could lead this professor to such a shocking conclusion? Distributed is not necessarily more scalable than centralized. After some discussion Murat concludes: Distributed is not necessarily more scalable than centralized;  And centralized is not necessarily a scalability bottleneck. As a distributed systems professor, I wouldn't imagine myself defending centralized solutions. But there it is. < Also keep in mind: Google Finds: Centralized Control, Distributed Data Architectures Work Better Than Fully Decentralized Architectures

  • The StorageMojo take: It would help the AFA array market to get away from the pointless IOPS number. It made more sense with disk arrays since disks were the ultimate limiting factor. More disks, more IOPS. More cache, lower (average) latency.

  • Ruby Rogues talk RR Scaling Rails with Steve Corona. Steve at Twitpic went from sleeping beside his laptop to keep the site up to over time making everything automatic. At the time scaling information simply was not available. Twitpic was a nasty PHP app. Wanted to use Rails, but since it used more hardware resources it was hard to switch. PHP induced technical debt caused a Rails Twitpic clone to be created that had a much smaller and maintainable code base. In 2012 the switch was on Rails. Jruby could be faster than PHP. Hard to do because had 50 millions users and 4 billion photos. A ton of data to move. Data migration was the hard part. Good interview.

  • CppCon 2014 videos are now available.

  • HBaseCon 2014 videos are now available.

  • Even as a wide striding full stack developer are there things you shouldn't do yourself? Apparently there are Seven Things You Should Never Code Yourself: parsing HTML, CSV, JSON, XML; email address validation; URL processing; data/time; templating systems; logging. Good discussion on reddit. Have to say I was thinking of things like SSL, but this list is actually a great set of learning problems that everyone should take a shot at coding. Then use someone else's library :-)

  • Baskin-Robbins has more flavors, but Microservices - 72 resources is still a tasty number of varieties.

  • Humans are great at creating things imbued with soul, but humans are so hard to scale. Songza has a music curation-scaling problem — and now Google does, too.

  • Java users here's an great post on Improved FJ thread throttling by Doug Lea: One approach to dealing with this [specifying the number of threads] would be to introduce a zillion controls that would be even harder to use andprone to even more policy inconsistency and context-dependence problems than seen with ThreadPoolExecutor. This would be a throwback to the days when every efficient parallel program had to be custom built. Some people think that people should still write parallel programs this way (please feel free to do so.)  FJ instead implements portable algorithms and internal policies that are rarely optimal forany given platform and application but often close to optimal. 

  • multi-process architectures suck :(. Costs are: context switching, message passing, process creation overhead, 3rd party code is designed for a single core world, lack of tooling. < I look forward to the no cost architecture description.

  • Latency Kills and How You Can Improve It in Google App Engine Applications: cache precalculated entries rather than recalc on each access; use async version of calls to make calls in parallel; render in the browser, not the server; use Google PageSpeed to profile your code; 

  • Scaling a standard Azure website to 380k queries per minute of 163M records with loader.io. Troy Hunt says Azure is in the hunt: people don’t appreciate just how far modern web servers can scale. Of course this is also predicated on there being well-designed apps, but particularly a modern incarnation of IIS running on Azure can scale a hell of a long way and that’s what I’m going to show you here today.

  • The Myth of Schema-less. Gotebe nailed in a comment: schema-less doesn't exist, it's either in your database, or spread throughout the code. 

  • Cool explanation of how a cost-based optimizer decides whether to use an index. Finding All the Red M&Ms: A Story of Indexes and Full‑Table Scans.  

  • Is Node still cool? The March Towards Go: The more I’ve been working with distributed systems, the more I’m frustrated by Node’s direction, which favors performance over usability and robustness. In the past week I’ve rewritten a relatively large distributed system in Go, and it’s robust, performs better, it’s easier to maintain, and has better test coverage since synchronous code is generally nicer and simpler to work with.

  • You might like these Sysadmin Casts - simple bite sized sysadmin screencasts (released weekly). I watched a few and they seemed well done.

  • Cory Isaacson on Why is my database slowing down? Table scans, concurrency contention, slow writes.

  • Not that people believe these things, but here's a Performance comparison of EC2, Rackspace, Softlayer, GCE, Azure, DigitalOcean. It's quite detailed and you of course can buy the entire 120 page report, but that's probably a lot cheaper than running your own tests. EC2 barely won the web server test. Rackspace won the database test, except for the larget database test where Azure won. The large web server random read tests was won by Amazon EC2 and Rackspace. And so on. Amazon did well. Google didn't kick the mighty butt you might have thought given the performance is their value prop.

  • Sometimes it seems like it's caching all the way down. Here's a good Cache coherency primer which is not about memcache and CAP, but about how microprocessors keep data shared but consistent.

  • Elasticsearch at HipChat: 10x faster queries: So the big takeaway from this experience for us was that while Elasticsearch dynamic mapping is great for getting you started quickly, it can handcuff you as you as you scale.  All of our new projects with Elasticsearch use explicit mapping templates so we know our data structure and can write queries that take advantage of them. We expect to see far more consistent and predictable performance as we race towards 10 billion messages stored.

  • I/O Stack Optimization for Smartphones: We examined the performance trade-offs for each combination of the five database journaling modes, five filesystems and three optimization techniques. When we applied three optimization techniques in existing Android I/O stack, the SQLite performance (inserts/sec) improved by 130%. With the F2FS filesystem, WAL journaling mode (SQLite), and the combination of our optimization efforts, we improved the SQLite performance (inserts/sec) by 300%, from 39 ins/sec to 157 ins/sec, compared to the stock Android I/O stack.

  • Unioning of the Buffer Cache and Journaling Layers with Non-volatile Memory:  We implement our scheme on Linux 2.6.38 and measure the throughput and execution time of the scheme with various file I/O benchmarks. The results show that our scheme improves I/O performance by 76% on average and up to 240% compared to the existing Linux buffer cache with ext4 without any loss of reliability.

  • It's book time. If you like time travel genre science fiction then City Beyond Time: Tales of the Fall of Metachronopolis by John C. Wright, will mess with your mind in a good way. Highly recommended.