Stuff The Internet Says On Scalability For March 28th, 2014

Hey, it's HighScalability time:


Looks like a multiverse, if you can keep it.

  • Quotable Quotes:
    • @abt_programming: "I am a Unix Creationist. I believe the world was created on January 1, 1970 and as prophesized, will end on January 19, 2038" - @teropa
    • @demisbellot: Cloud prices are hitting attractive price points, one more 40-50% drop and there'd be little reason to go it alone.
    • @scott4arrows: Dentist "Do you floss regularly?" Me "Do you back up your computer data regularly?"
    • @avestal: "I Kickstarted the Oculus Rift, what do I get?" You get a lesson in how capitalism works.
    • @mtabini: “$20 charger that cost $2 to make.” Not pictured here: the $14 you pay for the 10,000 charger iterations that never made it to production.
    • @strlen: "I built the original assembler in JS, because it's what I prefer to use when I need to get down to bare metal." - Adm. Grace Hopper
    • tedchs: I'd like to propose a new rule for Hacker News: only if you have built the thing you're saying someone should save money by building themselves, may you say the person should build that thing.
    • lamby: Bezos predicted they would be good over the long term but said that he didn’t want to repeat “Steve Jobs’s mistake” of pricing the iPhone in a way that was so fantastically profitable that the smartphone market became a magnet for competition.
    • @PariseauTT: I feel a Netflix case study coming...everybody get your drinks ready...#AWSSummit
    • seanmccann: That's no different than startups of the past having to pay thousands to millions of dollars to setup servers. These days those same servers can be setup in minutes for a fraction of the cost. Timing is everything. Sometimes the current market pricing for a commodity is too expensive to make your business viable today but in the future that will not be the case. Just depends how long that will take. 
    • Petyr 'Littlefinger' Baelish: Chaos isn't a pit. Chaos is a ladder. Many who try to climb it fail and never get to try again. The fall breaks them. And some, are given a chance to climb. They refuse, they cling to the realm or the gods or love. Illusions. Only the ladder is real. The climb is all there is.
  • We turn those who first climb the peaks of tall mountains into heroes. But what of the mountain? Mariana Mazzucato in The Entrepreneurial State: Debunking Private vs. Public Sector Myths: Every feature of the iPhone was created, originally, by multi-decade government-funded research. From DARPA came the microchip, the Internet, the micro hard drive, the DRAM cache, and Siri. From the Department of Defense came GPS, cellular technology, signal compression, and parts of the liquid crystal display and multi-touch screen (joining funding from the CIA, the National Science Foundation, and the Department of Energy, which, by the way, developed the lithium-ion battery.) CERN in Europe created the Web. Steve Jobs’ contribution was to integrate all of them beautifully.

  • This is not an April Fools' joke, though the name might lead a certain sort of mind to suspect one. WebScaleSQL: MySQL goes web scale with contributions from MySQL engineering teams at Facebook, Google, LinkedIn, and Twitter. It includes lots of good work on the test suite, performance enhancements, and features to make scaling easier. So many forks at the table (MariaDB, Percona, WebScaleSQL).

  • This is why we can't have nice code. Martin Sústrik argues In the Defense of Spaghetti Code, quite successfully I think, that it's often clearer to have one large 1,500-line function than it is to have a refactored great big ball of mud. Commenters argue from a perfect world stance that if you do X from the start and then continue the same practices over the years and across hundreds of programmers, then code can be perfected. Not so. The nature of many domains is that they are just messy. At a certain level of abstraction messiness can be hidden, but when you are in the guts of a thing, messiness is preserved.

  • Warning: Double Wizard Zone. After looking at how stack frames are built during function prologues, Gustavo Duarte tackles the inverse process as stack frames are destroyed in function epilogues.

  • Wade into the design of the generic and very fast H2O Architecture. H2O is "in-memory analytics on clusters with distributed parallelized state-of-the-art Machine Learning algorithms." Many interesting bits. It's peer-to-peer, distributes the JVM memory model, uses compression to great advantage, and guarantees that on a distributed cluster, if memory is accessed linearly, access times match those of C.

  • How I Narrowed Down The Location Of Malaysia Air Using "Monte Carlo" Data Models. Even if it doesn't find the plane, the process used to narrow down the search space is quite instructive.

  • How Facebook uses data to detect and disrupt attacks. They can understand "where threats are coming from, arranged by type of attack, time, and frequency." Understanding Online Threats with ThreatData: a framework for importing information about badness on the Internet in arbitrary formats, storing it efficiently, and making it accessible for both real-time defensive systems and long-term analysis.

  • Ilya Grigorik addresses Why is my CDN 'slow' for mobile clients?: The problem is, while the statement is often based on real data (i.e. the relative performance improvements offered by a CDN are smaller for mobile clients), the conclusion is wrong: the absolute improvements are likely the same for all clients and hence worth every penny. Also, we don't need a "mobile CDN", we need carriers to fix their networks. < Good luck with that.

  • Basic Concepts of High Availability Linux. The article is, well, basic. But the Hacker News thread has more useful details.

  • Java memory layout and False Sharing. Doesn't it seem like a lot of the work in Java is about how to get around Java?

  • Some sound ideas. Five Ways to Scale your API Without Touching Your Code: 1) design your API to fit what clients want and not require multiple trips; 2) aggregate calls; 3) rate limit; 4) avoid calls altogether with client-side caching; 5) get clients to write good code.
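
To make item 3 concrete, here is a minimal sketch of one common way to rate limit, a token bucket. The linked article does not say which algorithm it has in mind, so the technique, the class name `TokenBucket`, and its parameters are all assumptions made for illustration.

```python
import time


class TokenBucket:
    """Hypothetical limiter: `rate` requests per second, bursts up to `capacity`."""

    def __init__(self, rate, capacity, clock=time.monotonic):
        self.rate = float(rate)          # tokens added per second
        self.capacity = float(capacity)  # maximum tokens the bucket can hold
        self.clock = clock               # injectable clock, handy for testing
        self.tokens = float(capacity)    # start with a full bucket
        self.last = clock()

    def allow(self, cost=1.0):
        """Return True and spend `cost` tokens if the request may proceed."""
        now = self.clock()
        # Refill in proportion to elapsed time, never beyond capacity.
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False
```

An API gateway would typically keep one bucket per client key and answer with an HTTP 429 whenever `allow()` returns False.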

  • Debunking the 100X GPU vs. CPU Myth: An Evaluation of Throughput Computing on CPU and GPU: We believe many factors contributed to the reported large gap in performance, such as which CPU and GPU are used and what optimizations are applied to the code. Optimizations for CPU that contributed to performance improvements are: multithreading, cache blocking, and reorganization of memory accesses for SIMD-ification. Optimizations for GPU that contributed to performance improvements are: minimizing global synchronization and using local shared buffers.
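
For readers who haven't met cache blocking (loop tiling), the sketch below shows the loop structure on a matrix multiply. It is not code from the paper, the function name `matmul_blocked` and the `block` parameter are made up for illustration, and pure Python will not show the speedup. The point is only the shape of the loops, which in C or similar keeps each tile resident in cache while it is reused.

```python
def matmul_blocked(a, b, block=2):
    """Multiply matrices (lists of rows) one block x block tile at a time."""
    n, m, p = len(a), len(b), len(b[0])
    c = [[0.0] * p for _ in range(n)]
    # The outer three loops walk over tiles.
    for i0 in range(0, n, block):
        for k0 in range(0, m, block):
            for j0 in range(0, p, block):
                # The inner three loops do the usual multiply within one tile.
                for i in range(i0, min(i0 + block, n)):
                    row_a, row_c = a[i], c[i]
                    for k in range(k0, min(k0 + block, m)):
                        aik = row_a[k]
                        row_b = b[k]
                        for j in range(j0, min(j0 + block, p)):
                            row_c[j] += aik * row_b[j]
    return c
```

The result is the same as an untiled triple loop; only the order in which elements are visited changes.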

  • Orleans: Distributed Virtual Actors for Programmability and Scalability: The Orleans programming model introduces the novel abstraction of virtual actors that solves a number of the complex distributed systems problems, such as reliability and distributed resource management, liberating the developers from dealing with those concerns. At the same time, the Orleans runtime enables applications to attain high performance, reliability and scalability.
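
The virtual actor idea can be sketched in a few lines. Callers address an actor by key as if it always exists, and a runtime activates it on demand and may deactivate it later. This is a toy, single-process illustration and not the Orleans API (Orleans is a .NET framework). `CounterActor`, `Runtime`, `send`, and `deactivate` are hypothetical names.

```python
class CounterActor:
    """Hypothetical actor whose only state is a counter."""

    def __init__(self, key):
        self.key = key
        self.count = 0

    def handle(self, message):
        if message == "increment":
            self.count += 1
        return self.count


class Runtime:
    """Toy runtime: actors are addressed by key and activated on first use."""

    def __init__(self, actor_type):
        self.actor_type = actor_type
        self.active = {}  # key -> in-memory activation
        self.store = {}   # key -> saved state of deactivated actors

    def send(self, key, message):
        actor = self.active.get(key)
        if actor is None:
            # The caller never creates the actor; the runtime activates it
            # and restores any previously saved state.
            actor = self.actor_type(key)
            actor.count = self.store.get(key, 0)
            self.active[key] = actor
        return actor.handle(message)

    def deactivate(self, key):
        """Free the in-memory activation, keeping its state for later."""
        actor = self.active.pop(key, None)
        if actor is not None:
            self.store[key] = actor.count
```

Calling `send("a", "increment")` works whether or not actor "a" is currently in memory, which is the part of lifecycle management the abstraction takes off the developer's hands.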