
Strategy: Order Two Mediums Instead of Two Smalls and the EC2 Buffet

Vaibhav Puranik in Web serving in the cloud – our experiences with nginx and instance sizes describes their experience trying to maximize traffic and minimize their web serving costs on EC2. Initially they tested with two m1.small instances, and then they switched to two c1.medium instances. The m1s are the standard instance types and the c1s are the high-CPU instance types. Obviously the mediums have greater capability, but the cost difference was interesting:

  • In the long term they will save money using the larger instances and not autoscaling. With the small instances, traffic bursts caused autoscaling to kick in: new instances were started in response to load, stayed up for a short period of time, and then spun down again. This constant churn costs a lot of money. Selecting larger instance sizes that can handle the load without autoscaling turns out to save money even though they are more expensive. Starting new instances also takes a few minutes, and they don't want to lose that traffic.
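A back-of-the-envelope sketch of the churn argument. The prices are assumed circa-2010 on-demand rates (m1.small ~$0.085/hr, c1.medium ~$0.17/hr) and the burst pattern is entirely hypothetical; EC2 at the time billed each partial instance-hour as a full hour, which is what makes short-lived burst instances expensive:

```python
import math

# Assumed circa-2010 on-demand prices ($/hr); illustrative only.
M1_SMALL = 0.085
C1_MEDIUM = 0.17
HOURS_PER_MONTH = 730

def fixed_cost(rate_per_hr, instances, hours):
    """Cost of keeping `instances` running continuously for `hours`."""
    return rate_per_hr * instances * hours

def churn_cost(rate_per_hr, bursts, minutes_per_burst, instances_per_burst):
    """Cost of short-lived burst instances. Each burst is billed as
    ceil(minutes/60) full hours per instance (2010-era EC2 billing)."""
    hours_billed = math.ceil(minutes_per_burst / 60) * instances_per_burst
    return rate_per_hr * bursts * hours_billed

# Two always-on smalls, plus a hypothetical 24 bursts/day, each
# spinning up 2 extra smalls for ~15 minutes (billed as a full hour):
base  = fixed_cost(M1_SMALL, 2, HOURS_PER_MONTH)
burst = churn_cost(M1_SMALL, 24 * 30, 15, 2)
mediums = fixed_cost(C1_MEDIUM, 2, HOURS_PER_MONTH)

print(f"2 smalls + churn: ${base + burst:.2f}/mo")   # ~$246.50
print(f"2 mediums fixed:  ${mediums:.2f}/mo")        # ~$248.20
```

Under these made-up burst numbers the two options cost about the same, so for the same money you get rid of the churn and the minutes of spin-up lag during which traffic is lost. With less frequent bursts the small-instance setup wins on price; the crossover depends entirely on the traffic pattern.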

This went against my intuition, so I thought it might be a useful test for others to try.


Reader Comments (10)

This is true. I ended up using 2 XL instances instead of 2 smalls with autoscaling, and it lowered the cost significantly.

April 19, 2010 | Unregistered Commenter leonidas tsementzis

What a surprising result! This strategy is so clever!
Buying more expensive plans with more CPU power makes your CPU resource problems go away. Film at 11.

It is not about buying bigger instances instead of small ones with autoscaling. It's about buying instances that have more CPU power but less RAM than the general-purpose ones with average CPU and memory, when you have CPU-intensive tasks.
And it's a no-brainer.

Sorry for being sarcastic, but more and more people publish their "findings" when most of the time it's really a trivial decision.

April 19, 2010 | Unregistered Commenter frost

Don't be so cold, frost. I personally found it interesting that the overhead of starting instances to handle load overwhelms the expense of a significant upgrade in instance type. That's not what I would have expected. I would think you'd go for two smalls for availability and because it's cheaper, then bring on more resources as needed. That this doesn't work may be a deliberate feature of Amazon's pricing strategy. In any case it's a great example of how costs affect architecture.

April 19, 2010 | Registered Commenter Todd Hoff

This is a rather strange result.
It sounds like their load was exactly at the threshold at which a new instance should be spun up. Perhaps adjusting the autoscaling parameters so it doesn't constantly start/stop instances would improve things?
Perhaps 3 small ones would be better than 2 mediums?

April 19, 2010 | Registered Commenter mxx

Todd: I think what you are overlooking here is that the type of instance he upgraded to was not the same as the small one.
The small one was a default 1:1 CPU:RAM instance. Average CPU and average RAM.
But the type he upgraded to was a CPU centric instance with more CPU but not more RAM.

So 2 m1.small instances have 2 compute units (one each). Bringing two more m1.small instances up at peak time makes this 4 compute units.
Now one c1.medium has 5 compute units, more than he had even when two extra m1.smalls were running at peak.
And he bought two c1.medium so 10 compute units in total. He would need 10 m1.small for that.
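The arithmetic in the paragraphs above can be checked directly. The ECU figures (m1.small = 1, c1.medium = 5) are the ones frost quotes from Amazon's instance-type specs of the era:

```python
# EC2 Compute Units (ECU) per instance type, per the comment above.
ECU = {"m1.small": 1, "c1.medium": 5}

def total_ecu(instance_type, count):
    """Total compute units of a fleet of `count` identical instances."""
    return ECU[instance_type] * count

baseline = total_ecu("m1.small", 2)    # 2 ECU, steady state
at_peak  = total_ecu("m1.small", 4)    # 4 ECU with two burst instances
upgraded = total_ecu("c1.medium", 2)   # 10 ECU all the time

print(baseline, at_peak, upgraded)     # 2 4 10

# Matching the upgraded fleet's 10 ECU with smalls would take:
smalls_needed = upgraded // ECU["m1.small"]
print(smalls_needed)                   # 10
```

Which is frost's point: the c1.medium pair delivers 2.5x the peak capacity of the autoscaled small setup, so the two configurations were never capacity-equivalent.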

It's an unfair comparison and hence a wrong conclusion.

The notion that bringing up new small nodes instead of having one big one running all the time is more expensive is simply false, because if you stay within the same instance family you will see that, for example, an m1.large gives you 4x as much compute power as an m1.small at the price of 4 m1.smalls. So there is no difference in terms of CPU power per dollar between four small ones and one large.
If you put your latency threshold low enough to scale up in time, it should not be a problem.

Now if he could show that running a c1.xlarge all the time is better than having one or two c1.mediums autoscale, that would be something noteworthy. But that's simply not the case! It would reduce the whole cloud autoscaling idea to absurdity.

Hope this makes clear why I think this post leads to wrong conclusions and/or decisions.

Leonidas: can you give an example calculation? I don't see how this is possible.

April 19, 2010 | Unregistered Commenter frost

I concur with frost here; it seems premature to conclude anything. Their problem is likely due to a non-optimized scaling parameter.

- Sebastian

April 19, 2010 | Unregistered Commenter Sebastian Stadil

A lot of good points, frost. And I'm not quite sure what kind of scaling parameters Sebastian is talking about. But consider why you might move to a larger instance size: to solve a problem too big for a smaller instance, or to handle more load of a kind that could be handled by a smaller instance. In this case the smaller instance is sufficient to carry out the work, so a larger instance is only needed to handle more load. The cost of starting and using smaller instances for short periods turned out to be greater than using a larger instance full time. So it's more expensive because of the overhead involved in instance setup/teardown. I found this surprising, and it dictates architecture choices.

Compare to Google App Engine, which is a sort of degenerate form of this approach, where GAE transparently starts applications somewhere in a cluster in response to load. If the app isn't already warm it takes quite a while to spin up, yet the GAE cost of spinning up applications doesn't seem to have the same spiky nature as Amazon's.

April 19, 2010 | Registered Commenter Todd Hoff

Todd: no, the cost of bringing up a bigger node is not a problem if you set your scaling threshold low enough to start it soon enough, not waiting until 5 minutes before disaster. And that's the parameter Sebastian is referring to.
They waited for the latency to hit 0.75 seconds. That's already a pretty damn bad response time.
In my projects every response taking longer than 100 milliseconds is a warning sign. Usually a response should be calculated in less than 10 ms.

Now if you set your scaling parameter to, let's say, 100 ms, then you will get new instances soon enough to avoid a bottleneck situation, and you will be able to handle the peak traffic.
Don't be afraid of setting the threshold too low; it only means you will have more of the smaller instances running more often. But if they cost the same as one bigger instance and give the same amount of computing power, then you don't lose anything by having them running compared to having the big one running all the time.
You can only save money by having fewer instances running in low-traffic periods.
And that is the whole point of virtualization/clouds/whateverfancyword.
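The threshold argument can be illustrated with a toy model. The latency curve below is made up (a crude queueing-style "blows up near saturation" function), purely to show the mechanism: a low scale-up threshold adds capacity while there is still headroom, instead of waiting for 750 ms latency:

```python
def latency_ms(load_rps, instances, per_instance_capacity=100):
    """Crude hypothetical latency model: 10 ms at idle, growing
    without bound as fleet utilization approaches 1."""
    utilization = load_rps / (instances * per_instance_capacity)
    if utilization >= 1:
        return float("inf")
    return 10 / (1 - utilization)

def instances_needed(load_rps, threshold_ms):
    """Smallest fleet size keeping modeled latency under the threshold."""
    n = 1
    while latency_ms(load_rps, n) > threshold_ms:
        n += 1
    return n

# At 290 requests/sec, a 100 ms threshold scales to 4 instances,
# while waiting for 750 ms leaves you at 3, running near saturation:
for threshold in (100, 750):
    print(threshold, instances_needed(290, threshold))  # 100 4 / 750 3
```

The aggressive threshold runs one extra small instance, but by the per-ECU pricing argument above that costs no more than the equivalent big instance, and you never serve traffic from a fleet on the edge of collapse.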

So to summarize:

"The cost of starting and using a the smaller instance for a smaller period of than using a larger full time instance is significant. So it's more expensive because of the overhead involved in the instance setup/teardown."

... is simply wrong if you set your scaling threshold correctly. There is no "setup" cost of bringing a new instance up. It is *not* more expensive!

April 20, 2010 | Unregistered Commenter frost

Todd: by the way, I'd love to see the article corrected. I think that's needed if you want to keep up the quality.

April 21, 2010 | Unregistered Commenter frost

If this becomes accepted as a general truth by means of experience over time, it would seem that the best route would be to rebaseline the sizing models.

May 3, 2010 | Unregistered Commenter JohnW
