« 1 Aerospike server X 1 Amazon EC2 instance = 1 Million TPS for just $1.68/hour | Main | Hamsterdb: An Analytical Embedded Key-value Store »

Stuff The Internet Says On Scalability For August 15th, 2014

Hey, it's HighScalability time:

Somehow this seems quite appropriate. (via John Bredehoft)
  • 75 acres: Pizza eaten in US daily; 270TB: Backblaze storage pod; 14nm: Intel extends Moore's Law
  • Quotable Quotes
    • discreteevent: The dream of reuse has made a mess of many systems.
    • David Crawley: Don't think of Moore's Law in terms of technology; think of it in terms of economics and you get much greater understanding. The limits of Moore's Law is not driven by current technology. The limits of Moore's Law are really a matter of cost.
    • Simon Brown: If you can't build a monolith, what makes you think microservices are the answer?
    • smileysteve: The net result is that you should be able to transmit QPSK at 32GBd in 2 polarizations in maybe 80 waves in each direction. 2bits x 2 polarizations x 32G ~128Gb/s per wave or nearly 11Tb/s for 1 fiber. If this cable has 6 strands, then it could easily meet the target transmission capacity [60TB].
    • Eric Brumer: Highly efficient code is actually memory efficient code.

  • How to be a cloud optimist. Tell yourself: an instance is half full, it's not half empty; Downtime is temporary; Failures aren't your fault.

  • Mother Earth, Motherboard by Neal Stephenson. Goes without saying it's gorgeously written. The topic: The hacker tourist ventures forth across the wide and wondrous meatspace of three continents, chronicling the laying of the longest wire on Earth. < Related to Google Invests In $300M Submarine Cable To Improve Connection Between Japan And The US.

  • IBM compares virtual machines and against Linux containers: Our results show that containers result in equal or better performance than VM in almost all cases. Both VMs and containers require tuning to support I/O-intensive applications.

  • Does Psychohistory begin with BigData? Of a crude kind, perhaps. Google uses BigQuery to uncover patterns of world history: What’s even more amazing is that this analysis is not the result of a massive custom-built parallel application built by a team of specialized HPC programmers and requiring a dedicated cluster to run on: in stark contrast, it is the result of a single line of SQL code (plus a second line to create the initial “view”). All of the complex parallelism, data management, and IO optimization is handled transparently by Google BigQuery. Imagine that – a single line of SQL performing 2.5 million correlations in just 2.5 minutes to uncover the underlying patterns of global society.

  • Fabian Giesen with an deep perspective on how communication has evolved to use a similar pattern. Networks all the way down (part2): anything we would call a computer these days is in fact, for all practical purposes, a heterogeneous cluster made up of various specialized smaller computers, all connected using various networks that go by different names and are specified in different standards, yet are all suspiciously similar at the architecture level; a fractal of switched, packet-based networks of heterogeneous nodes that make up what we call a single “computer”. It means that all the network security problems that plague inter-computer networking also exist within computers themselves. Implementations may change substantially over time, the interfaces – protocols, to stay within our networking terminology – stay mostly constant over large time scales, warts and all.

  • In a non-descript conference room in a time far far away, a group of engineers argued amongst themselves, and eventually spec'ed that the TCAM in their router will never have to handle more than 512,000 BGP routes. How could we ever need more than that?, they say to themselves. Not for the first time has the past not kept up with the future. BGP Routing Table Size Limit Blamed for Tuesday’s Website Outages.

  • Given how many Java programmers fight and lose to GC, eventually moving to off heap memory, I'm surprised this doesn't happen more. Why I am reverting from Java to C++: So I have started going backwards. Writing programs in c++. Falling in love with template functions and finding them more powerful then java's generics. Using lambdas in c++11 and saying, "what is the big deal with scala?". Using smart pointers in boost when I need to, freeing memory by hand when I do not.

  • Nextdoor Taskworker: simple, efficient & scalable. Thorough explanation of "distributed task queueing system which processes millions of asynchronous tasks daily."  RabbitMQ was replaced by Amazon SQS.  Celery for task workers had a lot of problems (unstable, low utilization, high latency) was replaced with their own Taskworker creation. A set of Taskworker processes on each worker node. Taskworker pulls batches of tasks from SQS queues. Responsibility of ensuring idempotence is given to developers of tasks. Simulated production workloads accurately specified how to provision worker capacity at different periods throughout the day. No stability issues with hung workers or crashes. Average task-in-queue latency  improved by 40x during busy hours thanks to the true priority implementation of it that makes resource utilization efficient.

  • Pizza-As-A-Service helps explain today’s cloud. Genius. 

  • Martin Fowler on Microservices and the First Law of Distributed Objects: So in essence, there is no contradiction between my views on distributed objects and advocates of microservices. < I will say in practice CORBA systems quickly evolved into service based systems, not distributed objects. 

  • Goldfire Studios on Horizontally Scaling Node.js and WebSockets with Redis: we've broken our app into 3 different layers: load balancer on the front, Node app in the middle and a system to relay messages to connect it all together. Ultimately, we selected node-http-proxy because we felt most comfortable customizing it for our needs (our stack is essentially 100% Javascript) while maintaining a sufficient level of performance. [For messaging] nothing beat the simplicity and raw speed of Redis (not to mention it can double as a session store, etc).

  • Russell Sullivan with insightful commentary on Google's Mesa: Geo-Replicated, Near Real-Time, Scalable Data Warehousing: The versioning, especially the multi datacenter versioning is smart. They serialise updates in any given data center and create a version for that update batch and then use paxos to communicate with other data centers if this update batch version has been processed by a majority and if so, this becomes the new committed version. It is a smart decoupled way to do geo-replication, but only some systems can afford to do it, those where latency is not critical to correctness. Other than the versioning and some aggressive corruption detection, Mesa seems like a very well thought out incremental improvement to Google's analytics offerings.

  • What is the Atomic Unit of Computing?: More specific to containers specifically, however, is the steady erosion in the importance of the operating system. To be sure, packaged applications and many infrastructure components are still heavily dependent on operating system-specific certifications and support packages. But it’s difficult to make the case that the operating system is as all powerful as it was.

  • How to speed up your ORM queries of n x m associations: Rails shows us hints that the query per se is not the problem anymore but the deserialization. What if we try to limit our created object graph and use a model class backed by a database view? The lesson here is that sometimes the performance can be improved outside of our code and that mapping database results to objects is a costly operation.

  • YapDatabase: a "key/value store and MUCH MORE" built atop sqlite for iOS & Mac. It provides concurrency, caching, collections, metadata, views, secondary indexing, full text search, relationships, extensions, performance.

  • Nice review of PaaS shoot-out: Cloud Foundry vs. OpenShift: Cloud Foundry shines with broad application support and stellar ease of use, but OpenShift has the edge in management and automation.

  • A new offline-first open web API. ServiceWorker: the API that gives you full control over HTTP caching, request, and forms the basis for push messaging, alarms, geofencing and background sync.

  • Maybe if we had this for program documentation would it be better? Contests that challenge young scientists to explain their research without jargon are turning science communication into a competitive sport.

  • CoreOs: designed for security, consistency, and reliability. Instead of installing packages via yum or apt, CoreOS uses Linux containers to manage your services at a higher level of abstraction. A single service's code and all dependencies are packaged within a container that can be run on one or many CoreOS machines.

  • A million spiking-neuron integrated circuit with a scalable communication network and interface: Inspired by the brain’s structure, we have developed an efficient, scalable, and flexible non–von Neumann architecture that leverages contemporary silicon technology. To demonstrate, we built a 5.4-billion-transistor chip with 4096 neurosynaptic cores interconnected via an intrachip network that integrates 1 million programmable spiking neurons and 256 million configurable synapses. 

  • ATraPos: Adaptive Transaction Processing on Hardware Islands: In this paper, we propose ATraPos, a storage manager design that is aware of the non-uniform access latencies of multisocket systems. ATraPos achieves good data locality by carefully partitioning the data as well as internal data structures (e.g., state information) to the available processors and by assigning threads to specific partitions. Furthermore, ATraPos dynamically adapts to the workload characteristics, i.e., when the workload changes, ATraPos detects the change and automatically revises the data partitioning and thread placement to fit the current access patterns and hardware topology.

Reader Comments (3)

The "Mother Earth, Motherboard" link is broken, and should be http://archive.wired.com/wired/archive/4.12/ffglass_pr.html

August 15, 2014 | Unregistered CommenterJohn Long

Thanks John, fixed.

August 15, 2014 | Registered CommenterTodd Hoff

Wikimedia's infrastructure has something it calls "Pool Counter" (https://wikitech.wikimedia.org/wiki/PoolCounter). It's used to avoid a thundering herd condition that often occurs during large political events or celebrity deaths. It was introduced after Michael Jackson's death, which caused a full infrastructure outage for Wikimedia.

MediaWiki (the software used for Wikipedia, and other Wikimedia sites) uses wikitext as its markup language. When an edit occurs, the wikitext is parsed into HTML and is cached in the backend parser cache (Memcache, backed by MySQL for cross-datacenter replication). Additionally, edits cause immediately invalidations to the frontend caches (Varnish). It's necessary to immediately invalidate caches to properly support anonymous editing, otherwise an editor would see a typo, for instance, then would click to edit and wouldn't find the typo, which would be a frustrating experience. When caches are flushed, the next request to the article causes the text to be re-parsed and re-cached. Since this process is not immediate, a large number of requests can trigger re-parses after an edit event.

To avoid MediaWiki server process starvation from parse requests, MediaWiki will increase the article's count in the pool counter when an article is being re-parsed. If the count reaches a certain threshold, requests that don't result in a parser cache hit will return an error for that article rather than causing a cascading outage for all of Wikimedia's services. You may have noticed that during the same period of time Wikipedia was as responsive as it normally is.

Of course, it would be better to simply show stale cache to users than to throw errors, but this occurs infrequently enough that there hasn't been a lot of effort put into handling that.

August 18, 2014 | Unregistered CommenterRyan Lane

PostPost a New Comment

Enter your information below to add a new comment.
Author Email (optional):
Author URL (optional):
Some HTML allowed: <a href="" title=""> <abbr title=""> <acronym title=""> <b> <blockquote cite=""> <code> <em> <i> <strike> <strong>