hot links

Stuff The Internet Says On Scalability For September 11th, 2015

High Scalability

11 Sep 2015 — 10 min read

Hey, it's HighScalability time:

Need a challenge? Solve the code on this 17.5 feet tall 11,000 year old wooden statue!

$100 million: amount Popcorn could have made from criminal business offers; 3.2-gigapixel: World’s Most Powerful Digital Camera; $17.3 trillion: US GDP in 2014; 700 million: Facebook time series database data points added per minute; 300PB: Facebook data stored in Hive; 5,000: Airbnb EC2 instances.

Quotable Quotes:
- @jimmydivvy: NASA: Decade long flight across the solar system. Arrives within 72 seconds of predicted. No errors. Me: undefined is not a function
- Packet Pushers~ Everyone has IOPS now. We are heading towards invisible consumption being the big deal going forward.
- Randy Medlin: Gonna drop $1000+ on a giant iPad, $100 on a stylus, then whine endlessly about $4.99 drawing apps.
- Anonymous: Circuit Breaker + Real-time Monitoring + Recovery = Resiliency
- Astrid Atkinson: I used to get paged awake at two in the morning. You go from zero to Google is down. That’s a lot to wake up to.
- Todd Waters~ In 1979, 200MB weighed 30 lbs and took up the space of a washing machine
- Todd Waters~ CERN spends more compute power throwing away data than storing and analyzing it
- Rob Story: We've clearly reached the point where SSD/RAM bandwidth have completely outpaced CPU compute.
- Shedding light on the era of 'dark silicon': We will soon live in an era where perhaps more than 80 per cent of computer processors' transistors must be powered off and 'remain dark' at any time to prevent the chip from overheating.
- @diogomonica: In a container world, when someone asks about A vs B, the answer is always, A on top of B. #softwarecircus
- Mike Curtis (Airbnb)~ 70 percent of the people who put space up for rent on Airbnb in New York City say they do so because if they didn’t, they would lose their apartments or homes
- Mike Curtis (Airbnb)~ it would probably be on the order of 20 percent to 30 percent more expensive to operate its own datacenters than rent capacity on AWS
- @cloud_opinion: If John McAfee gets elected as President once, it will be impossible to uninstall him.
- @bradurani: The greatest trick the ORM ever pulled was convincing the world the DB doesn't exist... and it's a disaster for a generation of devs
- @coderoshi: The idea that management is the higher rung of a programmer's career ladder is like thinking that every actor wants to become a director.
- @HiddenBrain: MT @CBinsights: A million guys walk into a Silicon Valley bar. No one buys anything. Bar is declared a massive success.
- @Carnage4Life: Every time a developer says "temporary workaround" I remember this list.

Some impressive gains by migrating from Python to Go. From Python to Go: migrating our entire API: reducing the mean response time of an API call from 100ms to 10ms...We reduced the number of EC2 instances required by 85%...we can now ship a self-hosted version of Repustate that is identical to the one we host for customers.

Rob Story has an awesome summary of the goings on at the Very Large DataBases Conference. His main gloss is at VLDB 2015: Concurrency, DataFlow, E-Store, but he also has a day by day summaries up on github. An amazing job and lots of concentrated insight.

Really wonderful article. A Life in Games: The Playful Genius of John Conway. Packed with slices of life that make me feel like I would like a little more John Conway in my life.

Need high availability? Here's how eBay uses Netflix Hystrix to implement the Circuit Breaker pattern. An example is given for their Secure Token service. Hystrix: a latency and fault tolerance library designed to isolate points of access to remote systems, services and 3rd party libraries, stop cascading failure and enable resilience in complex distributed systems where failure is inevitable.

Rob Pike and Naitik Shah are speaking at the Fall Gopherfest - Silicon Valley on November 18th. It's free and you might find it useful.

Sounds a lot like a distributed database? First new cache-coherence mechanism in 30 years could enable chips with thousands of cores: Whereas with existing techniques, the directory’s memory allotment increases in direct proportion to the number of cores, with the new approach, it increases according to the logarithm of the number of cores...What we do is we just assign time stamps to each operation, and we make sure that all the operations follow that time stamp order.

LinkedIn moved their domain to an anycast IP address. Anycast IP addresses are an exotic beast and this is a great introduction to how they work and their advantages. What: With anycast, the same IP address can be assigned to n servers, potentially distributed across the world. The internet's core routing protocol, BGP, would then automatically route packets from the source to the closest (in hops) server. Result: We ramped U.S. on regional anycast earlier this year, and the overall suboptimal PoP assignment dropped from 31% to 10%. While this is a significant gain, we are still investigating why the remaining 10% are not optimally assigned.

Expansive story on How SoundCloud ended up with microservices. There's a lot of nuance in the article, but here are some key points. They paired back-end and front-end developers and make this pair be fully dedicated to a feature until its completion; they moved from a monolith to services using Clojure and JRuby, eventually moving to Scala and Finagle; teams were given ownership of modules; microservices helped with deployment flexibility and made it easier to organize teams.

Many thought AOL was the Internet. Facebook is tyring to make it so. First by hosting content locally on Facebook. Now there's Facebook's master plan to earn more money from businesses just took its next step: It's adding new call-to-action buttons that will let businesses encourage potential customers to do things like book appointments at a spa, peruse a resturant's menu, or browse the products a store offers.

Even the most advanced managed services can't fix everything. Medium found that out with DynamoDB's inability to tune effectively for hotspots. Here's how Medium detects scaling and hotspot problems on DynamoDB using ElasticSearch, Logstash and Kibana. It's difficult when using a managed service because: "it takes the Read Capacity Unit (RCU) and Write Capacity Unit (WCU) values and divides them by the number of partitions your table has — without ever exposing how many partitions your table has." On hot keys increaseing the RCU will probably not help. To find hot spots Medium created their own access log to detect when a key is accessed to often.

There's more than one way to do it. Lambda+: Cassandra and Spark for Scalable Architecture: With Cassandra & Spark we can build something that achieves the same goals as the Lambda Architecture but more simply and with fewer moving pieces by combining your Speed Layer and your Batch Layer into a single data store running on Cassandra and utilizing Spark and Spark Streaming to have a single code base responsible for Analytics And Streaming.

Lyft took the plunge and rewrote its iOS app in Swift. Result: it took a third of the code to implement the same functionality; Swift still is relatively unstable; Apple doesn't have a way to rollout to a percentage of users for testing; people were able to do things much faster; it worked.

Google Refines Cloud Pricing Around Capacity Lessons: The Preemptible VM pricing model is a game changer for a seed-funded startup like ours, because of the significant cost reduction. We’re excited to continue using them in the future as we increase the amount of data we process to identify and determine the health of global crops.

The New Stack has A New Podcast From TNS All About Scaling Dynamic Services and Systems that might be of interest.

Airbnb Shares The Keys To Its Infrastructure: Airbnb has around 5,000 EC2 instances running on AWS, most of them being reserved instances just to keep things simple. About 1,500 of those EC2 instances are deployed for the web-facing parts of its applications and the remaining 3,500 being used for various kinds of analytics and machine learning algorithms that drive the business...Everything that we do in engineering is about creating great matches between people...The core indexing technology that we use is Lucene...On top of the HDFS files, Airbnb creates a data warehouse using Hive and also uses the Presto SQL query engine...Airbnb uses Chef for configuration management and a bunch of homegrown tools...Airbnb is not using Mesos very much today...every six months the IT team does an analysis to peg its compute and storage capacity and costs against what it would cost to bring it all in house...it would probably be on the order of 20 percent to 30 percent more expensive to operate its own datacenters than rent capacity on AWS...the percentage of time I spent on that versus the percentage of time I spent on making Yahoo Mail a better product, and it might have been 50-50. Maybe it should have been 5-95.

A most excellent explanation with very helpful visualizations. Understanding LSTM Networks. Also, a free ebook on Neural Networks and Deep Learning.

Dan Rayburn, the brains behind StreamingMedia.com, really know his stuff and now he has a free book you might be interested in: The Business of Streaming and Digital Media: The kind of technology used to deliver audio and video, be it streaming, download, live, on-demand etc. really no longer matters. It’s about using the right mix of multiple distribution technologies to reach the right audience with the right type of content and that’s what this book is all about – applying the right business models.

Lots of good points. You're probably wrong about caching. But they are solveable and the performance problems caching solves generally are not.

While I like the vision, the problem is it's not a static content world. There's a constant war between the static and the dynamic, the centralized and the decentralized. HTTP is obsolete. It's time for the distributed, permanent web. A decentralized distributed database + a lambda style behavioral layer might have a better chance.

How do you program this thing? An Exascale Address Space: his presentation will present a framework for a 128-bit address space. The address space is NOT FLAT, and specifies various domain based addressing and protection mechanisms. There will also be comparisons with capability based addressing approaches.

A good book on A Beginner’s Guide to Website Speed Optimization.

Powering Flickr’s Magic view by fusing bulk and real-time compute: In this post we’re going to talk about how we came up with a novel revision of the Lambda Architecture for fusing large-scale bulk compute with streaming compute to power Flickr’s Magic View. We were able to create a responsive, real time database operating at a scale of tens of billions of records, with tens to hundreds of millions of records updated per day.

In a streaming world you need to adapt your algorithms to work on data streams. T-digest and streaming k-means are Some Important Streaming Algorithms You Should Know About from Ted Dunning: The key thing about streaming algorithms is they have to be approximate algorithms. There are a few things that you can compute exactly in a streaming fashion, but there are lots of important things that you can't compute that way, so we have to approximate.

Great in-depth example of State Synchronization in gaming.

Using data to optimize TCP for mobile. Lessons Learned: TCP slow start doesn’t have to be so slow: "our challenge now is to use as many pieces of information as we can glean from the user's context (of which there are many on a mobile device), to make as accurate a first estimate for the starting value [of the maximum segment size] as we can. This is but one of the ways a more mobile focused protocol can vastly improve performance over the general purpose TCP or even the newly developed QUIC." Also, Mobile TCP optimization - lessons learned in production.

Dick Sites - Data Center Computers: Modern Challenges in CPU Design. metaconcept with the tl;dr: he's trying to find out the causes behind 99th percentile performance problems, i.e. when most things take 50ms, but 1% take 800ms.Datacentre CPUs don't have enough memory bandwidth, all the way from L1 to DRAM. Profiling tools change the way the program performs. Locks and stuff when running a distributed application - I skipped over this as it got boring. Disk performance and stuff, discussion of a severe performance problem caused by (warning! spoiler!) CPU throttling.

1.86x - 3.92x performance improvement using a custom memory allocator. Job System 2.0: Lock-Free Work Stealing – Part 2: A specialized allocator.

An interesting history of modern init systems (1992-2015). Most complex applications have their own init type system. It might be interesting to include those as well.

Very cool work. Artillery’s Native Game Client: After some clever tricks and a lot of hard work we’ve made it over these technical hurdles. We now have a production-quality engine and toolchain for developing a real-time, multiplayer RTS. We rewrote critical code paths in C++ and compiled them with Emscripten (often yielding 10x speed-ups), and we made our game multi-threaded with WebWorkers. As of this writing, Atlas currently runs on my gaming PC at 144 fps in Google Chrome without plugins and with audio/visual effects as well as environment and character art. Unfortunately, despite this benchmark, browsers still fall short in delivering an optimal player experience which will satisfy the demands of consumers, especially gamers.

This gives you a really good feel for solutions in different languages. Programming Shootout: Objects vs Boxes vs Actors vs Agents: "Without doubt, agent oriented modelling adds overhead to the programming effort. Programs are very often implementable with just plain Erlang actors, Java objects or Haskell etc." Given that the example problem was a calculator the conclusion may not be all that generalizable.

Humus: Humus is a pure actor-based programming language that provides a foundation for software developers to build reliable concurrent computer systems. It features a referentially transparent pure-functional core for defining values. These values become the messages between (and specify the behavior of) dynamically configured actors.

Coordination Avoidance in Database Systems: Minimizing coordination, or blocking communication between concurrently executing operations, is key to maximizing scalability, availability, and high performance in database systems. However, uninhibited coordination-free execution can compromise application correctness, or consistency. When is coordination necessary for correctness?

Gorilla: A Fast, Scalable, In-Memory Time Series Database: In this paper we introduce Gorilla, Facebook’s inmemory TSDB. Our insight is that users of monitoring systems do not place much emphasis on individual data points but rather on aggregate analysis, and recent data points are of much higher value than older points to quickly detect and diagnose the root cause of an ongoing problem.

Cost-based Memory Partitioning and Management in Memcached: In this work, we proposed a cost-based memory management mechanism for Memcached – a widely adopted inmemory storage system used as a fundamental building block in many modern web architectures – that is able to dynamically
provide a cost-based hit ratio close to optimal. Our scheme works on-line and is able to adapt to the characteristics of the objects that are requested, while other solutions statically allocate the memory to the different classes, thus obtaining sub-optimal performance.

tidb: a distributed SQL database. Inspired by the design of Google F1, TiDB supports the best features of both traditional RDBMS and NoSQL.

OOSMOS: Object-Oriented State Machine Operating System. It is a new operating system where the fundamental contextual unit is the object, not the thread as it is in traditional operating systems.

Because there are no threads, there are no thread stacks, so OOSMOS is ideal for use in memory constrained environments where a traditional thread-based operating system is not a viable option.

Stuff The Internet Says On Scalability For September 11th, 2015

High Scalability

Read more

Kafka 101

Capturing A Billion Emo(j)i-ons

Brief History of Scaling Uber

Behind AWS S3’s Massive Scale