How to get started with sizing and capacity planning, assuming you don't know the software behavior?

Here's a common situation and question from the mechanical-sympathy Google group by Avinash Agrawal on the black art of capacity planning:

How to get started with sizing and capacity planning, assuming we don't know the software behavior and it's a completely new product to deal with?

Gil Tene, Vice President of Technology, CTO, and Co-Founder of Azul Systems, wrote a clear and useful answer that is worth highlighting:

Click to read more ...


22 Recommendations for Building Effective High Traffic Web Software

This is a guest post by Ashwanth Fernando, Software Engineer from the trenches at large scale internet companies.

Inspired by the book "Effective Java" by Joshua Bloch, I wanted to share my holistic recommendations on building high traffic web software (i.e. web applications/services that serve high traffic loads). Some of these items may not be just about software design but also around surrounding areas such as the engineering organization, culture etc.

Two disclaimers up front:

1) This is my opinion.
2) As with all things "software", there will be real-world situations where the principles below are wrong. Please use common sense at all times.

Consider using more than one datacenter

There have been numerous horror stories about businesses, ahem, going out of business because they had only a single datacenter. It's really important to have more than one datacenter if you want to protect yourself from natural disasters or electrical supply failures. Run all your datacenters in an active-active configuration. It may cost extra money, but it's well worth it compared with running an active-passive configuration and then finding out too late that, for some pieces of data, your passive hardware was not consistent with the active one.

Consider a sparse datacenter deployment

Click to read more ...


Stuff The Internet Says On Scalability For December 13th, 2013

Hey, it's HighScalability time:

Test your sense of scale. Is this image of something microscopic or macroscopic? Find out.

  • 80 billion: Netflix logging events per day; 10 petabytes: data; six million: Foursquare checkins per day
  • Quotable Quotes:
    • George Lakoff: Why can't all your thoughts be conscious? Because consciousness is linear and your brain is parallel. The linear structure of consciousness could never keep up.
    • @peakscale: "Engineers like to solve problems. If there are no problems handily available, they will create their own problems" - Scott Adams
    • @kiwipom:  “Immutability is magic pixie dust that makes distributed systems work” - Adrian Cockcroft 
    • @LachM: Netflix: SPEED at SCALE = breaks EVERYTHING. #yow13
    • Joe Landman: … you get really annoyed at the performance of grep on file IO (seriously folks? 32k or page size sized IO? What is this … 1992?) so you rewrite it in 20 minutes in Perl, and increase the performance by 5-8x or so.
    • @rjrogers87: "Goldman Sachs has 36,000 employees, 6,000 are developers.  They support these folks w/half-a million cores. #GartnerDC” 
    • @KentLangley: Dear Amazon AWS, Please STOP with the aggressive reserved instances sales push. I use on-demand ON PURPOSE.

  • Good story of a move from Google App Engine to EC2, nodejs, and mongodb. The migration decision was based on ease of development, performance, and cost. GAE suffers from slow database operations, costly over-provisioning of instances, slow startup times that lead to timeouts, a low 1MB memcache size limit, bulk loads and exports of data that are a nightmare, and slow search. Liked about nodejs: one programming language on client and server, portability, speed. Liked about GAE: seeing everything in one place with the management console, easy deployment, and the ease of creating new code and testing it out.

  • Wikipedia's Order of Magnitude page: The pressure of a human bite is about 1/9th of the atmospheric pressure on Venus. The fastest bacterium on earth is just outstripping the fastest glacier. A square meter of sunshine in the spring imparts about 1 horsepower.

Don't miss all that the Internet has to say on Scalability, click below and become eventually consistent with all scalability knowledge...

Click to read more ...


Using Node.js PayPal Doubles RPS, Lowers Latency, with Fewer Developers, but Where Do the Improvements Really Come From?

PayPal gives yet another glowing report of an app rewritten in node.js experiencing substantial performance improvements. PayPal rewrote their account overview page, one of the most trafficked apps on the website, which was previously written in King Java.

The benefits:

  1. Full-stack engineers. Using JavaScript on both the front-end and the back-end removed an artificial boundary between the browser and server, allowing engineers to code both.
  2. ...

    Click to read more ...


Sponsored Post: Booking, Spokeo, Apple, NuoDB, ScaleOut, MongoDB, BlueStripe, AiScaler, Aerospike, New Relic, LogicMonitor, AppDynamics, ManageEngine, Site24x7

Who's Hiring?

  • Apple is hiring for multiple positions. Imagine what you could do here. At Apple, great ideas have a way of becoming great products, services, and customer experiences very quickly.
    • Quality Assurance Engineer. The iOS Systems team is looking for a Quality Assurance engineer. In this role you will be expected to work hand-in-hand with the software engineering team to find and diagnose software defects. Please apply here.
    • Sr Software Engineer iPhone. Do you love building highly scalable, distributed web applications? Does the idea of a fast-paced environment make your heart leap? Do you want your technical abilities to be challenged every day, and for your work to make a difference in the lives of millions of people? Please apply here.
    • Sr Software Engineer. The iOS Systems Team is looking for a Software Engineer to work on operations, tools development and support of worldwide iOS Device sales and activations. Please apply here

  • We need awesome people @ Booking.com - We want YOU! Come design next generation interfaces, solve critical scalability problems, and hack on one of the largest Perl codebases. Apply:

  • Spokeo is hiring a Senior Backend Developer. We've spent years agonizing over the best way to construct an elegant, simplistic, yet highly powerful people search engine. Spokeo deals with problems of ginormous scale, so a strong understanding and appreciation for algorithms and efficiency is desired. Please apply here.

  • Spokeo is hiring a Senior Software Developer - Web Applications. Build features that involve any of our products, from the universal people search interface and functionality to the construction of family trees to a portal that connects customers to employees and employer data. Please apply here.

  • UI Engineer. AppDynamics, founded in 2008 and led by proven innovators, is looking for a passionate UI Engineer to design, architect, and develop their user interface using the latest web and mobile technologies. Make the impossible possible and the hard easy. Apply here.

  • Software Engineer - Infrastructure & Big Data. AppDynamics, leader in next generation solutions for managing modern, distributed, and extremely complex applications residing in both the cloud and the data center, is looking for Software Engineers (All-Levels) to design and develop scalable software written in Java and MySQL for the backend component of software that manages application architectures. Apply here.

  • New Relic is looking for a Java Instrumentation Engineer, Java Scalability Engineer,  Distributed Systems Engineer and Android app engineer in Portland, OR. Ready to scale a web service with more incoming bits/second than Twitter? 

Fun and Informative Events

  • Your amazing event here.

Cool Products and Services

  • LogicMonitor is the cloud-based IT performance monitoring solution that enables companies to easily and cost-effectively monitor their entire IT infrastructure stack – storage, servers, networks, applications, virtualization, and websites – from the cloud. No firewall changes needed - start monitoring in only 15 minutes utilizing customized dashboards, trending graphs & alerting.

  • NuoDB Blackbirds Release 2.0 Birthday. They grow up so fast these days! What people love about NuoDB is that it’s stable, always there for you, and it’s flexible. Which is why it’s winning all kinds of popularity competitions, from “Most Likely to Succeed” through “Least Likely To Fall Over Sharding” to “Most Likely to Be ACID Compliant”.

  • Rapidly Develop Hadoop MapReduce Code. With ScaleOut hServer™ you can use a subset of your Hadoop data and run your MapReduce code in seconds for fast code development. You don’t need to load and manage the Hadoop software stack; it's a self-contained Hadoop MapReduce execution environment. To learn more check out

  • MongoDB Backup Free Usage Tier Announced. We're pleased to introduce the free usage tier to MongoDB Management Service (MMS). MMS Backup provides point-in-time recovery for replica sets and consistent snapshots for sharded systems with minimal performance impact. Start backing up today at

  • BlueStripe FactFinder Express is the ultimate tool for server monitoring and solving performance problems. Monitor URL response times and see if the problem is the application, a back-end call, a disk, or OS resources.

  • Aerospike Capacity Planning Kit. Download the Capacity Planning Kit to determine your database storage capacity and node requirements. The kit includes a step-by-step Capacity Planning Guide and a Planning worksheet. Free download.

  • aiScaler, aiProtect, aiMobile Application Delivery Controller with integrated Dynamic Site Acceleration, Denial of Service Protection and Mobile Content Management. Cloud deployable. Free instant trial, no sign-up required.

  • ManageEngine Applications Manager : Monitor physical, virtual and Cloud Applications.

  • Site24x7 : Monitor End User Experience from a global monitoring network.

If any of these items interest you there's a full description of each sponsor below. Please click to read more...



In Memory: Grace Hopper to Programmers: Mind Your Nanoseconds!

This is an article published last year, but as today is Grace Hopper's birthday I thought it would be a good time to share again an amazing talk from this amazing woman.

Computing pioneer Grace Hopper, inventor of the compiler, searched for a concrete way to create an intuitive understanding of just how short a nanosecond, a billionth of a second, really is, since that was the speed of their new computer circuits. As an illustration she settled on a length of wire as long as the distance light can travel in one nanosecond. The length is a very portable 11.8 inches. A microsecond's worth of wire is still portable, but a much bulkier 984 feet. In one millisecond light travels 186 miles, which only Hercules could carry. In today's terms, at a 3.06 GHz clock speed, there are about .33 nanoseconds between ticks, or about 3.9 inches of light travel.
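The wire lengths follow directly from the speed of light, so they are easy to check. Here is a minimal sketch of that arithmetic in Python; the function and constant names are made up for illustration, and it uses the speed of light in a vacuum (signals in real copper wire travel somewhat slower). By this calculation the per-tick distance at 3.06 GHz comes out a little under 3.9 inches.

```python
# Speed of light in a vacuum, in metres per second (exact by SI definition).
SPEED_OF_LIGHT_M_PER_S = 299_792_458
METRES_PER_INCH = 0.0254
INCHES_PER_FOOT = 12
FEET_PER_MILE = 5280


def light_travel_inches(seconds: float) -> float:
    """Distance light covers in the given time, in inches (hypothetical helper)."""
    return SPEED_OF_LIGHT_M_PER_S * seconds / METRES_PER_INCH


def clock_period_ns(clock_ghz: float) -> float:
    """Time between clock ticks in nanoseconds for a clock rate given in GHz."""
    return 1.0 / clock_ghz


nanosecond_inches = light_travel_inches(1e-9)
microsecond_feet = light_travel_inches(1e-6) / INCHES_PER_FOOT
millisecond_miles = light_travel_inches(1e-3) / INCHES_PER_FOOT / FEET_PER_MILE

tick_ns = clock_period_ns(3.06)
tick_inches = light_travel_inches(tick_ns * 1e-9)

print(f"1 nanosecond:   {nanosecond_inches:.1f} inches")
print(f"1 microsecond:  {microsecond_feet:.0f} feet")
print(f"1 millisecond:  {millisecond_miles:.0f} miles")
print(f"3.06 GHz tick:  {tick_ns:.2f} ns, {tick_inches:.2f} inches")
```

Changing the argument to clock_period_ns shows how quickly the per-tick distance shrinks as clock rates rise.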

Understanding the profligate ways of programmers, she suggests that every programmer wear a necklace of a microsecond's worth of wire so they know what they are wasting when they throw away microseconds. And if a General is busting your chops about satellite messages taking too long to send, you can bust out your piece of wire and explain there's a lot of nanoseconds between here and there.

Here's a short, witty, and wise video of her famous nanosecond demonstration. An amazing lady, great innovator, an engaging speaker, and an inspiring teacher.



Site Moves from PHP to Facebook's HipHop, Now Pages Load in .6 Seconds Instead of Five

If you code in PHP have you ever wondered about moving to Facebook's HipHop JIT Virtual Machine for PHP? With HipHop Facebook achieved over a 9x increase in web request throughput and over a 5x reduction in memory consumption compared to Zend PHP 5.2 engine + APC.

But will HipHop really work for you? Is it really drop-in compatible? Is it really as fast as they say?

To answer questions like this nothing beats a good experience report and here's a great one: Adventures in Configuring and Running Facebook's HipHopVM (hhvm) JIT Compiler for PHP by Yermo Lamers.

Yermo selected PHP to implement a number of content web sites. He took an interesting approach: he created a forms, views, validation, and business logic description language to remove the drudgery of creating the same code over and over again for each page. Having done this in Perl I think it's a great approach. The problem is it can be slow. PHP's slow string handling makes dynamically evaluating a description template for each page very slow, up to 5 seconds.

Rather than rewrite in C++ Yermo tried HipHop. The results were impressive:

Click to read more ...


Stuff The Internet Says On Scalability For December 6th, 2013

Hey, it's HighScalability time:

Test your sense of scale. Is this image of something microscopic or macroscopic? Find out.

  • 72: Intel's 72 core x86 Processor; One Trillion: number of fonts served by Google.
  • Quotable Quotes:
    • West-Eberhard: The gene does not lead, it follows.
    • @waldojaquith: To an ant, gravity is nothing, but surface tension is a powerful force. When you change scale, you play by different rules.
    • Nicholas Christakis: The spread of germs is the price we pay for the spread of ideas. We assemble ourselves into networks to facilitate the flow of information but we pay a price, the spread of disease.
    • James Mickens: When you debug a distributed system or an OS kernel, you do it Texas-style. You gather some mean, stoic people, people who have seen things die, and you get some primitive tools, like a compass and a rucksack and a stick that’s pointed on one end, and you walk into the wilderness and you look for trouble.
    • Joe McMahon: The average small startup in Silicon Valley today – 20 or so people – is carrying about the equivalent power of all the PDP-11s sold during the 1970s in their pockets and purses.
    • Ilya Grigorik:  Wow... is completely disregarding the initial TCP congestion window
    • Twitter: Every problem is a scaling problem. 
  • And so it begins. Google has opened Google Compute Engine to the masses. You can look at this by comparing features, cost, performance, etc. You can compare by ecosystem. You can compare by who is most likely to eat their young. But what is clear: developers will now be comparing.

Don't miss all that the Internet has to say on Scalability, click below and become eventually consistent with all scalability knowledge...

Click to read more ...


How Can Batching Requests Actually Reduce Latency?

Jeremy Edberg gave a talk on Scaling Reddit from 1 Million to 1 Billion–Pitfalls and Lessons and one of the issues they had was that they:

Did not account for increased latency after moving to EC2. In the datacenter they had submillisecond access between machines so it was possible to make 1,000 calls to memcache for one page load. Not so on EC2. Memcache access times increased 10x to a millisecond, which made their old approach unusable. The fix was to batch calls to memcache so a large number of gets are in one request.

Dave Pacheco had an interesting question about batching requests and its impact on latency:

I was confused about the memcached problem after moving to the cloud. I understand why network latency may have gone from submillisecond to milliseconds, but how could you improve latency by batching requests? Shouldn't that improve efficiency, not latency, at the possible expense of latency (since some requests will wait on the client as they get batched)? 

Jeremy cleared it up by saying:

The latency didn't get better, but what happened is that instead of having to make a lot of calls to memcache it was just one (well, just a few), so while that one took longer, the total time was much less.
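Jeremy's point is about the sum of round trips rather than the latency of any single call. Here is a minimal sketch of that arithmetic, assuming the calls for a page are issued one after another. The 0.1 ms and 1 ms round trips are illustrative values consistent with the "submillisecond" and "10x" figures above. The batch size of 100 and the per-key cost of 0.01 ms are made-up assumptions, and total_fetch_ms is a hypothetical helper, not part of any memcache client.

```python
import math


def total_fetch_ms(num_keys: int, round_trip_ms: float,
                   batch_size: int = 1, per_key_ms: float = 0.0) -> float:
    """Total time to fetch num_keys sequentially, batch_size keys per request.

    Each request pays one network round trip; per_key_ms is an assumed
    extra cost per key, which makes a batched request slower than a single get.
    """
    requests = math.ceil(num_keys / batch_size)
    return requests * round_trip_ms + num_keys * per_key_ms


# 1,000 single gets in the datacenter, assuming a 0.1 ms round trip.
datacenter_ms = total_fetch_ms(1000, round_trip_ms=0.1)

# The same 1,000 single gets on EC2, with a 1 ms round trip.
ec2_unbatched_ms = total_fetch_ms(1000, round_trip_ms=1.0)

# Batched on EC2: 10 requests of 100 keys each (assumed batch size and per-key cost).
ec2_batched_ms = total_fetch_ms(1000, round_trip_ms=1.0,
                                batch_size=100, per_key_ms=0.01)

print(f"datacenter, unbatched: {datacenter_ms:.0f} ms")
print(f"EC2, unbatched:        {ec2_unbatched_ms:.0f} ms")
print(f"EC2, batched:          {ec2_batched_ms:.0f} ms")
```

Under these assumptions each batched request takes longer than a single get, yet the page needs far fewer round trips, so the total time falls well below even the original datacenter figure.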

But Dave Rosenthal created a great graphic showing how batching can in fact decrease total system latency:



Evolution of Bazaarvoice’s Architecture to 500M Unique Users Per Month

This is a guest post written by Victor Trac, Cloud Architect at Bazaarvoice.

Bazaarvoice is a company that people interact with on a regular basis but have probably never heard of. If you read customer reviews on popular shopping sites, you are likely using Bazaarvoice services. Those sites, along with thousands of others, rely on Bazaarvoice to supply the software and technology to collect and display user conversations about products and services. All of this means that Bazaarvoice processes a lot of sentiment data on most of the products we all use daily.

Bazaarvoice helps our clients make better products by using a combination of machine learning and natural language processing to extract useful information and user sentiments from the millions of free-text reviews that go through our platform. This data gets boiled down into reports that clients can use to improve their products and services. We are also starting to look at how to show personalized sortings of reviews that speak to what we think customers care about the most. A mother browsing for cars, for example, may prefer to read reviews about safety features as compared to a 20-something male, who might want to know about the car’s performance. As more companies use Bazaarvoice technology, consumers become more informed and make better buying decisions.

Here's how it works...

Click to read more ...