Entries from March 18, 2012 - March 24, 2012


Stuff The Internet Says On Scalability For March 23, 2012

Plop, Plop, Fizz, Fizz, Oh, What a HighScalability it is:

  • $1.5 billion: The cost of cutting London-Tokyo latency by 60ms; 9 days: It took AOL 9 years to hit 1 million users. Facebook 9 months. Draw Something 9 days;  ~362 sq ft solar array: powers 1 sq ft of data center.
  • Is Amazon trying to marginalize OpenStack by partnering with Eucalyptus?
  • As the DevOps turns. You won't see this on TMZ. Adrian Cockcroft:  There is no central control, the teams do it for themselves in the cloud; John Allspaw: s/NoOps/OpsDoneMaturelyButStillOps/g; Edward Capriolo: Trust developers not. Good thing is we all agree DevOps is necessary, the differences are in the how and whom.
  • In a word (or two), Wordnik has gone cloud. Gone is their big iron, in are envious EC2 instances. Driving the move was HA in multi-datacenters, elasticity for traffic bursts, and incremental cluster upgrades. There has been almost no reduction in performance.
Don't miss all that the Internet has to say on Scalability. Click below and become eventually consistent with all scalability knowledge...

Click to read more ...


Paper: Revisiting Network I/O APIs: The netmap Framework

Here's a really good article in the Communications of the ACM on reducing network packet processing overhead by redesigning the network stack: Revisiting Network I/O APIs: The netmap Framework by Luigi Rizzo. As commodity networking performance increases, operating systems need to keep up or all those CPUs will go to waste. How do they make this happen?



Today 10-gigabit interfaces are used more and more in datacenters and servers. On these links, packets flow as fast as one every 67.2 nanoseconds, yet modern operating systems can take 10-20 times longer just to move one packet between the wire and the application. We can do much better, not with more powerful hardware but by revising architectural decisions made long ago regarding the design of device drivers and network stacks.

The netmap framework is a promising step in this direction. Thanks to a careful design and the engineering of a new packet I/O API, netmap eliminates much unnecessary overhead and moves traffic up to 40 times faster than existing operating systems. Most importantly, netmap is largely compatible with existing applications, so it can be incrementally deployed.
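
The 67.2-nanosecond figure quoted above can be reproduced with a little arithmetic. The sketch below is not from the article; it assumes the figure refers to minimum-size Ethernet frames (64 bytes) plus the standard 8-byte preamble and 12-byte inter-frame gap on a 10 Gbit/s link. The function and constant names are made up for illustration.

```python
# Back-of-the-envelope check of the per-packet time on a 10-gigabit link.
# Assumptions (not stated in the excerpt): minimum-size 64-byte Ethernet
# frames, plus 8 bytes of preamble/start delimiter and a 12-byte
# inter-frame gap that also occupy the wire.

LINK_RATE_BPS = 10_000_000_000   # 10 Gbit/s
MIN_FRAME_BYTES = 64             # smallest legal Ethernet frame
PREAMBLE_BYTES = 8               # preamble + start-of-frame delimiter
INTERFRAME_GAP_BYTES = 12        # mandatory idle time between frames


def wire_bits(frame_bytes=MIN_FRAME_BYTES):
    """Bits of link time consumed by one frame, including overhead."""
    return (frame_bytes + PREAMBLE_BYTES + INTERFRAME_GAP_BYTES) * 8


def packet_interval_ns(frame_bytes=MIN_FRAME_BYTES, link_rate_bps=LINK_RATE_BPS):
    """Time between back-to-back frames, in nanoseconds."""
    return wire_bits(frame_bytes) * 1e9 / link_rate_bps


def packets_per_second(frame_bytes=MIN_FRAME_BYTES, link_rate_bps=LINK_RATE_BPS):
    """Maximum frame rate the link can carry at this frame size."""
    return link_rate_bps / wire_bits(frame_bytes)


if __name__ == "__main__":
    interval = packet_interval_ns()
    print(f"one packet every {interval:.1f} ns")
    print(f"about {packets_per_second() / 1e6:.2f} million packets per second")
    # The excerpt says operating systems can take 10-20 times longer than this.
    print(f"10x to 20x slower: {10 * interval:.0f} to {20 * interval:.0f} ns per packet")
```

Under these assumptions each frame occupies 672 bits of link time, which gives 67.2 ns per packet and a budget of roughly 15 million packets per second that the software path has to keep up with.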



The Conspecific Hybrid Cloud

When you’re looking to add new tank mates to an existing aquarium ecosystem, one of the concerns you must have is whether a particular breed of fish is amenable to conspecific cohabitants. Many species are not, which means if you put them together in a confined space, they’re going to fight. Viciously. To the death. Responsible aquarists try to avoid such situations, so careful attention to the conspecificity of animals is a must.

Now, while in many respects the data center ecosystem correlates well to an aquarium ecosystem, in this case it does not. It’s what you usually get, today, but it’s not actually the best model. That’s because what you want in the data center ecosystem – particularly when it extends to include public cloud computing resources – is conspecificity in infrastructure.

This desire and practice can be seen both in enterprise data center decision making and in startups that are suddenly dealing with massive growth and increasingly encountering performance bottlenecks that IT has no control to resolve.



LinkedIn: Creating a Low Latency Change Data Capture System with Databus

This is a guest post by Siddharth Anand, a senior member of LinkedIn's Distributed Data Systems team.

Over the past 3 years, I've had the good fortune to work with many emerging NoSQL products in the context of supporting the needs of a high-traffic, customer-facing web site.

In 2010, I helped Netflix to successfully transition its web scale use-cases from Oracle to SimpleDB, AWS' hosted database service. On completion of that migration, we started a second migration, this time from SimpleDB to Cassandra. The first transition was key to our move from our own data center to AWS' cloud. The second was key to our expansion from one AWS Region to multiple geographically-distributed Regions -- today Netflix serves traffic out of two AWS Regions, one in Virginia, the other in Ireland (F1). Both of these transitions have been successful, but have involved integration pain points such as the creation of database replication technology.

In December 2011, I moved to LinkedIn's Distributed Data Systems (DDS) team. DDS develops data infrastructure, including, but not limited to, NoSQL databases and data replication systems. LinkedIn, no stranger to building and open-sourcing innovative projects, is doubling down on NoSQL to accelerate its business -- DDS is developing a new NoSQL database called Espresso (R1), a topic for a future post.

Having observed two high-traffic web companies solve similar problems, I cannot help but notice a set of wheel-reinventions. Some of these problems are difficult, and it is truly unfortunate that each company has to solve them separately. At the same time, each company has had to solve these problems because no reliable open-source alternative existed. This clearly has implications for an industry dominated by fast-moving start-ups that cannot build 50-person infrastructure development teams or dedicate months away from building features.

Change Data Capture Systems
