
Strategy: Guaranteed Availability Requires Reserving Instances in Specific Zones

When EC2 first started, the mental model was of a magic Pez dispenser supplying an infinite stream of instances in any desired flavor. If you needed an instance, because of either a failure or a traffic spike, it would be there. As amazing as EC2 is, this model turned out to be optimistic.

From a thread on the Amazon discussion forum we learn any dispenser has limits:

As Availability Zones grow over time, our ability to continue to expand them can become constrained. In these scenarios, we will prevent customers from launching in the constrained zone if they do not yet have existing resources in that zone. We also might remove the constrained zone entirely from the list of options for new customers. This means that occasionally, different customers will see a different number of Availability Zones in a particular Region. Both approaches aim to help customers avoid accidentally starting to build up their infrastructure in an Availability Zone where they might have less ability to expand.

The solution: if you need guaranteed resources in different zones, purchase Reserved Instances in each of those zones. A reservation assures capacity when you need it. There's no way to know in advance whether the instance types you are interested in will be available on demand in a given availability zone, so reserving instances is the only way to guarantee it.

Architecturally this is a pain, and it removes part of the win of the cloud. Having nailed-up instances is only a step away from dedicated machines in a colo. And if you can't count on on-demand instances, your architecture requires enough reserved machines to handle a disaster scenario. That pushes your fixed costs high enough that you really need to use those reserved instances all the time, unless you have the money to keep them idle as backup for failover. It's a much more complicated scenario, but I guess you have to run out of Pez eventually.
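To make the fixed-cost point concrete, here is a minimal back-of-the-envelope sketch of how many reservations a "survive the loss of a zone" design implies. It assumes load is spread evenly across zones and that every zone holds the same number of reservations; the function names and the example numbers are hypothetical, not anything Amazon publishes.

```python
import math


def reserved_per_zone(peak_instances: int, zones: int, zones_lost: int = 1) -> int:
    """Reservations needed in each zone so the surviving zones can carry peak load.

    Assumption: load can be rebalanced evenly across whichever zones survive.
    """
    surviving = zones - zones_lost
    if surviving < 1:
        raise ValueError("at least one zone must survive")
    return math.ceil(peak_instances / surviving)


def reservation_plan(peak_instances: int, zones: int, zones_lost: int = 1) -> dict:
    """Summarize the reserved footprint and how much of it sits idle at peak."""
    per_zone = reserved_per_zone(peak_instances, zones, zones_lost)
    total = per_zone * zones
    return {
        "per_zone": per_zone,
        "total": total,
        # Reservations beyond peak demand: the price of the failover guarantee.
        "idle_at_peak": total - peak_instances,
    }


if __name__ == "__main__":
    # Hypothetical workload: 30 instances at peak, spread over 3 zones.
    print(reservation_plan(peak_instances=30, zones=3))
```

With these made-up numbers the plan calls for 15 reservations in each of 3 zones, 45 in total, of which 15 are idle even at peak unless something else is found to run on them.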

Reader Comments (3)

This is the strategy that Netflix uses to reduce costs and get higher availability during AWS outages/shortages. The key extra point to make is that the reservations should only be made for the production account, but any unused reservations count towards other accounts such as test or offline batch oriented work that isn't customer visible. When the accounts roll up to one bill at the end of the month, our "extra" reservations are used up so we don't waste them.

December 28, 2011 | Unregistered Commenter adrianco
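The roll-up described in the comment above can be sketched as simple arithmetic. This is a simplified model, not Amazon's actual billing algorithm: it assumes all reservations are bought by one account, that accounts are served in a fixed priority order, and that all usage is of the same instance type. The account names and numbers are hypothetical.

```python
from typing import Dict, List, Tuple


def apply_reservations(
    reserved: int,
    usage_by_account: Dict[str, int],
    priority: List[str],
) -> Tuple[Dict[str, Dict[str, int]], int]:
    """Spread a pool of reservations over accounts in priority order.

    Returns a per-account split of usage into reserved and on-demand
    instances, plus the number of reservations left unused.
    """
    remaining = reserved
    split: Dict[str, Dict[str, int]] = {}
    for account in priority:
        used = usage_by_account.get(account, 0)
        covered = min(used, remaining)  # cannot cover more than is left in the pool
        remaining -= covered
        split[account] = {"reserved": covered, "on_demand": used - covered}
    return split, remaining


if __name__ == "__main__":
    # Hypothetical month: production owns 45 reservations but runs only 30 instances.
    split, unused = apply_reservations(
        reserved=45,
        usage_by_account={"production": 30, "test": 10, "batch": 12},
        priority=["production", "test", "batch"],
    )
    for account, parts in split.items():
        print(account, parts)
    print("unused reservations:", unused)
```

In this example the 15 reservations production does not use cover all of test and part of batch, so nothing is wasted and only 7 batch instances are billed at on-demand rates.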

I honestly don't understand how they ever thought that they could do without reserved machines to handle a disaster scenario. As you said yourself, you have to run out of Pez eventually.

January 4, 2012 | Unregistered Commenter Andrew

The important thing to note here is why Amazon removes Availability Zones for new customers, or doesn't let you launch machines in Availability Zones where you don't already have machines (if they're getting full): it is so existing customers still have "reasonable" resources available to launch new machines. Eric Hammond first reported this at http://alestic.com/2011/08/ec2-unavailabiilty-zones

January 6, 2012 | Unregistered Commenter Sean Bannister
