Stuff The Internet Says On Scalability For April 19, 2013
Hey, it's HighScalability time:
(Ukrainian daredevil scaling buildings)
- Two Trillion Objects, 1.1 Million Requests / Second: S3; 1.4TB/s: Titan supercomputer has world’s fastest storage; four billion hours: Netflix streaming in last 3 months; $1.2B: Google's Q1 infrastructure spend
- Quotable Quotes:
- Google: We'll track EVERY task on EVERY data center server
- Stacey Higginbotham: All in all in the last five years the world has gained 54 Tbps of new capacity.
- @seveas: Scalability 103: Hardware sucks. Software sucks. Everything *will* break, prepare for failure of any component of your system.
- bloodredsun: The long and short of it is that Cassandra is a fantastic system for write heavy situations. What it is not good at are read heavy situations where deterministic low latency is required, which is pretty much what the pinterest guys were dealing with.
- @viktorklang: "The e-mail message could not be delivered because the user's mailfolder is full." <-- EMAIL HAS BACKPRESSURE OMG
- Interesting Behind the Scenes: Airbnb Neighborhoods. Includes a description of their workflow and a detailed breakdown of their stack: Rails, PostgreSQL/PostGIS, Memcached, CoffeeScript, Sass, jQuery, Handlebars, Backbone, Underscore, Sinatra, Clojure, Java, Hadoop, Cascalog. Highlight: "You don't need a database, you need a [expletive deleted] cache." So that's what we did: we traded our database for a cache.
- Curious about Curiosity's FSW Architecture? Then this is the video for you. Extreme for most applications, but still a lot of lessons to learn about structure and reliability.
- OK, this is Terminator kind of scary: Engineers Use Brain Cells to Power Smart Grid.
- Succeeding at success is the hardest skill of all. 5 Critical Errors That Triggered Ron Johnson’s Removal at JC Penney: “He Misread What Shoppers Want; He Didn’t Test Ideas in Advance; He Alienated Core Customers; He Totally Misread the JC Penney Brand; Overall, He Didn’t Seem to Like or Respect JC Penney.”
- Jeremy Cole with a lovingly obsessive series of articles on InnoDB internals, structures, and behavior. Want details? We got your details right here.
- Randy Bias gives his State of the Stack April 2013 address. The state is strong and growing, with a lot of work still to do.
- Groupon with some good advice on keeping the caches of a MySQL standby server hot using slow logs and lessons learned when logging a high volume of queries to the slow log. Rotating MySQL Slow Logs Safely. Also, Is Your MySQL Buffer Pool Warm? Make It Sweat!
- Luke with Everything developers need to know about SQL performance. Excellent coverage. Both wide and deep.
- 7 handy SQL features for data scientists: Generating queries from a query; Basic date operations; Text Mining; Median aggregate function; \COPY to load data into your database; Generating sequences; Assorted things you should know.
- Anyone who has visited an airport has probably thought they could do baggage handling so much better. Well, there are people for whom that's their job: Scalability and baggage-handling. And it can be done better. We just don't do it the better way. Surprised?
- Planning for Toy Story and Synthetic Biology: It's All About Competition: it is clear that DNA sequencing platforms are improving very rapidly, now much faster than Moore's Law. Biological Technologies are Hard to Predict in Part Because They Are Cheaper than Chips.
- Ah, some good memories...The Entire Run Of Omni Magazine Is Available Online For Free.
- It's always the butler or the firewall...Army scalability test finds login bottleneck: Real-time inspection of the logins indicated messages were queuing, getting backed up, by the firewall.
- Ultra-Fast Computing: Researchers Evaluate Bose-Einstein Condensates for Communicating Among Quantum Computers. What, it won't use Ethernet?
- Nginx Support Enables Massive Web Application Scaling. There are three common use-cases where Nginx really stands out: As a reverse proxy / cache in front of Apache; As a reverse proxy / cache in front of an Application Server or Framework; As a replacement for Apache and mod_php.
- Scalability at the Cost of Availability: One subtle concept that is sometimes misunderstood is that, if you are not careful, an increase in scalability can actually decrease your availability. In order to understand how this can happen we need to talk about the multiplicative effect of failure with items in series.
- Hans-J. Boehm with a good article on Threads Basics. For another good foundation article take a look at Essentials of Garbage Collection. And Hacking Secret Ciphers with Python is free to download and looks quite good. Also, A non-mathematical explanation of one way functions. And Back-to-Basics Weekend Reading - Join Processing in Relational Databases.
- Mind the gap...New software could alleviate wireless traffic: The software lets these devices that can't normally talk to one another exchange simple stop and warning messages so their communications collide less often. GapSense creates a common language of energy pulses and gaps. The length of the gaps conveys the stop or warning message.
- Ricky Ho with a good overview on Scalable System Design.
- Paul Gross on Uptime == Money: High Availability at Braintree. Focuses on RoR, pausing traffic, rolling deploys, load balancing, request retries, and other smart subjects.
- TimeStream: Reliable Stream Computation in the Cloud: TimeStream handles an on-line advertising aggregation pipeline at a rate of 700,000 URLs per second with a 2-second delay, while performing sentiment analysis of Twitter data at a peak rate close to 10,000 tweets per second, with approximately 2-second delay.
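
A quick aside on the "Scalability at the Cost of Availability" item above: the multiplicative effect of failure with items in series is easy to see with a few lines of arithmetic. What follows is only a minimal sketch, not anything taken from the linked article. It assumes every component fails independently, and the function names and the 99.9% figure are made up for illustration.

```python
from math import prod


def series_availability(availabilities):
    """Availability of a chain in which EVERY component must be up.

    Assumes independent failures, so the individual availabilities multiply.
    """
    return prod(availabilities)


def parallel_availability(availabilities):
    """Availability of redundant components where ANY one being up is enough.

    Assumes independent failures: the group is down only if all members are down.
    """
    return 1 - prod(1 - a for a in availabilities)


if __name__ == "__main__":
    # Hypothetical components, each 99.9% available on its own.
    for n in (1, 5, 10, 20):
        chain = series_availability([0.999] * n)
        print(f"{n:2d} components in series: {chain:.4%}")

    # Two hypothetical 99% replicas side by side.
    print(f"2 replicas in parallel: {parallel_availability([0.99, 0.99]):.4%}")
```

Every extra hop that a request must pass through drags the end-to-end number down, which is how adding tiers for scale can quietly cost availability unless each tier is also made redundant.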