Stuff The Internet Says On Scalability For June 28, 2013

Hey, it's HighScalability time:


(Leandro Erlich's super cool scaling illusion)

  • Who am I? I have 50 petabytes of data stored in Hadoop and Teradata, 400 million items for sale, 250 million queries a day, 100,000 pages served per second, 112 million active users, $75 billion sold in 2012... If you guessed eBay then you've won the auction.
  • Quotable Quotes:
    • Controlled Experiments at Large Scale: Bing found that every 100ms faster they deliver search result pages yields 0.6% more in revenue
    • Luis Bettencourt: A city is first and foremost a social reactor. It works like a star, attracting people and accelerating social interaction and social outputs in a way that is analogous to how stars compress matter and burn brighter and faster the bigger they are.
    • @nntaleb: unless you understand that fat tails come from concentration of errors, you should not discuss probability & risk 
  • Need to make Hadoop faster? Hadoop + GPU: Boost performance of your big data project by 50x-200x? Or there's Spark, which uses in-memory techniques to run up to 100x faster than Hadoop. Also, Spark: Open Source Superstar Rewrites Future of Big Data

  • Human, you aren't so special after all. Even plants can do math: During the night, mechanisms inside the leaf measure the size of the starch store and estimate the length of time until dawn. Information about time comes from an internal clock, similar to our own body clock. The size of the starch store is then divided by the length of time until dawn to set the correct rate of starch consumption, so that, by dawn, around 95% of starch is used up.
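
The division the leaf performs can be written out as a minimal sketch. The function name, the units, and the sample numbers below are illustrative assumptions, not taken from the study, and the sketch does not try to model why roughly 95% rather than 100% ends up consumed.

```python
def starch_consumption_rate(starch_store: float, hours_until_dawn: float) -> float:
    """Rate that spreads the measured store over the estimated time left.

    Both arguments are hypothetical inputs: the store in arbitrary units,
    the time until dawn in hours as reported by the internal clock.
    """
    if hours_until_dawn <= 0:
        raise ValueError("hours_until_dawn must be positive")
    # The arithmetic described above: store size divided by time until dawn.
    return starch_store / hours_until_dawn


# Made-up example: 120 units of starch, clock estimates 8 hours to dawn.
rate = starch_consumption_rate(120.0, 8.0)
print(rate)
```

If the clock instead estimated 12 hours until dawn, the same store would be consumed at a proportionally lower rate, which is the adjustment the quote describes.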

  • Filmgrain talks about their architecture. All Redis all the time. Redis does quad duty as a cache, queue, pub/sub, and database. Each component is isolated through Redis.

  • When key-value and relational are not enough, Facebook went graph with TAO, a custom distributed service designed around objects and associations, used for many features including likes, pages, and events. It serves thousands of data types and handles over a billion read requests and millions of write requests every second. Most of the complexity is kept in the client; the server doesn't perform typical graph algorithm operations. This simplicity "helps product engineers find an optimal division of labor between application servers, data store servers, and the network connecting them." It's geographically distributed, eventually consistent, and organized as a tree, with a single master per shard.
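
To make the "objects and associations, complexity in the client" idea concrete, here is a toy in-memory sketch. It is not TAO's actual API: the class `GraphStore`, its method names, the association types, and the sample data are all hypothetical. The store only answers simple lookups, and the graph-style question ("which of my friends like this page?") is composed on the client side.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class GraphStore:
    """Hypothetical store of typed objects and typed, timestamped associations."""
    objects: Dict[int, Tuple[str, dict]] = field(default_factory=dict)
    # (source id, association type) -> list of (timestamp, destination id)
    assocs: Dict[Tuple[int, str], List[Tuple[int, int]]] = field(default_factory=dict)

    def add_object(self, obj_id: int, otype: str, data: dict) -> None:
        self.objects[obj_id] = (otype, data)

    def add_assoc(self, src: int, atype: str, dst: int, time: int) -> None:
        self.assocs.setdefault((src, atype), []).append((time, dst))

    def assoc_range(self, src: int, atype: str, limit: int = 100) -> List[int]:
        """Destination ids for one association list, newest first."""
        edges = sorted(self.assocs.get((src, atype), []), reverse=True)
        return [dst for _, dst in edges[:limit]]

    def assoc_count(self, src: int, atype: str) -> int:
        return len(self.assocs.get((src, atype), []))


def friends_who_like(store: GraphStore, user_id: int, page_id: int) -> List[int]:
    """Client-side composition: intersect two simple association lookups."""
    friends = set(store.assoc_range(user_id, "friend"))
    likers = set(store.assoc_range(page_id, "liked_by"))
    return sorted(friends & likers)


store = GraphStore()
store.add_object(1, "user", {"name": "alice"})
store.add_object(2, "user", {"name": "bob"})
store.add_object(3, "user", {"name": "carol"})
store.add_object(100, "page", {"title": "HighScalability"})
store.add_assoc(1, "friend", 2, time=10)
store.add_assoc(1, "friend", 3, time=11)
store.add_assoc(100, "liked_by", 2, time=12)
store.add_assoc(100, "liked_by", 1, time=13)

print(friends_who_like(store, 1, 100))
```

The server-side surface stays small (add, range, count), which is one way to read the "optimal division of labor" quote; sharding, caching, and consistency are deliberately left out of the sketch.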

  • Will the amazing capabilities of the seemingly simple slime mould never end? Now it could make memristors for biocomputers: Slime mould can be used to perform all the logic functions that conventional computer hardware components can do.

  • In the concurrent connection game, MigratoryData says they can scale up to 12 million concurrent users from a single Dell PowerEdge R610 server

  • To make an omelette you have to break a few eggs. Same for software? The Antifragile Organization is an ACM article talking about how Netflix is "embracing failure to improve resilience and maximize availability." Yes, you'll find your favorite Planet of the Apes characters, but there's also the idea of an antifragile organization: every engineer is an operator; each failure is an opportunity to learn; don't point the fickle finger of blame.

  • Azure now supports autoscaling. The number of instances is settable via the console as are threshold CPU load levels. Now all you need to do is make your app horizontally scalable.
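
Threshold-based autoscaling of the kind described can be sketched in a few lines. This is not Azure's API or its actual algorithm: the function name, the default thresholds, and the one-instance step size are assumptions chosen for illustration.

```python
def desired_instances(
    current: int,
    cpu_percent: float,
    scale_up_at: float = 80.0,    # assumed upper CPU threshold
    scale_down_at: float = 30.0,  # assumed lower CPU threshold
    min_instances: int = 1,
    max_instances: int = 10,
) -> int:
    """Return the instance count a simple threshold rule would pick."""
    if cpu_percent > scale_up_at:
        target = current + 1
    elif cpu_percent < scale_down_at:
        target = current - 1
    else:
        target = current
    # Keep the result inside the configured instance range.
    return max(min_instances, min(max_instances, target))


print(desired_instances(2, 90.0))  # load above the upper threshold
print(desired_instances(2, 10.0))  # load below the lower threshold
```

A rule like this only helps if adding an instance actually adds capacity, which is the point of the closing remark about making the app horizontally scalable.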

  • Disks ain't dead yet. StorageMojo: While someone, somewhere, will undoubtedly invest in an all-flash data center, very few businesses will go in that direction in the next 10 years. New storage systems that stress commodity hardware and scale out architectures can be looked at as horizontal layers rather than vertical stovepipes.

  • Slides and Videos for RICON East are now available. A lot of good stuff. I predict you might like Automatically Scalable Computation by Dr. Seltzer. Using prediction in programs to make the best use of a massive core future. Combines parallelization with machine learning pixie dust. Also, Optimizing LevelDB for Performance and Scale

  • It doesn't matter what your automation tool of choice is, Salt: Like Puppet, Except It Doesn’t Suck is a great discussion of the different angles developers take on the problem. No real winner. Just lots of good options. Given we had to build all this stuff from scratch not that long ago, that's a good thing.

  • Chaos is always just one bug away. A bug on the iPhone caused open TCP connections not to close. Very easy to do. Something probably every network programmer has done at one time. The effect when the bug is on zillions of phones? A DDoS attack on YouTube

  • Tomek Wójcik tackles tag design with Fun with PostgreSQL: Tagging blog posts. Arrays make so many things easier.

  • If you've ever wondered how Google machine learns on pictures then you might like Fast, Accurate Detection of 100,000 Object Classes on a Single Machine: We demonstrate the advantages of our approach by scaling object detection from the current state of the art involving several hundred or at most a few thousand of object categories to 100,000 categories requiring what would amount to more than a million convolutions. Moreover, our demonstration was carried out on a single commodity computer requiring only a few seconds for each image. The basic technology is used in several pieces of Google infrastructure and can be applied to problems outside of computer vision such as auditory signal processing. 

  • Beyond Silicon: Transistors without Semiconductors: Imagine that the nanotubes are a river, with an electrode on each bank. Now imagine some very tiny stepping stones across the river. The electrons hopped between the gold stepping stones. The stones are so small, you can only get one electron on the stone at a time. Every electron is passing the same way, so the device is always stable.

  • If Mobile is the Future then we should probably learn how to design for mobile. eBay found 90% of customer support calls were because they didn't have a "forgot password" line in their login. In 2011 3/4ths of shopping carts were abandoned. Expedia removed the Company field from a cart and the result was 12 million more in sales overnight. People who use Amazon Prime go from $400 to $900 a year in sales because it's so easy. Mobile is a magnifying lens for UI problems.

  • Bare-Metal Multicore Performance in a General-Purpose Operating System. James Aguilar with a good TLDR: we know whether we have any work to do in the kernel, and when the next work is. If there is no work to do now, and no known work to do in the future, there will never be any work to do in the future unless it is scheduled via some interrupt (explicit timer, keyboard input, the NIC, disk, etc.). In the meantime, you have tickless behavior.

  • Greg Linden cooked up a new batch of tasty Quick Links