
Zynga's Z Cloud - Scale Fast or Fail Fast by Merging Private and Public Clouds

Release early and often. A/B testing. Creating a landing page and buying ads on AdSense. All are ways of providing quick feedback in order to validate an idea. If you are like Zynga, with 250 million active users a month, how do you cost-effectively prove out a game that could flop or get 90 million users (like CityVille) in an instant?

Zynga handles this problem in an innovative way, by inverting the typical cloud burst scenario. Instead of excess traffic flowing from a datacenter to a cloud, a game starts in the cloud and then moves to the datacenter once it has proved popular enough to keep.

This process is nicely described by Charles Babcock in Lessons From FarmVille: How Zynga Uses The Cloud, in an interview with Allan Leinwand, CTO of infrastructure engineering at Zynga.

When pared down to its essence, Zynga's strategy goes something like this:

  • Games are risky. Even with all their experience, Zynga can't know how many people will play a game or how fast the adoption rate will be. This makes capacity planning more like throwing a dart at a distant dart board while blindfolded, drunk, and after having been spun around three times. Spending on a large infrastructure build out while in this condition could be costly. 90 million users have come from a new game, CityVille. Did they know CityVille would be so successful? With enough certainty that they would buy equipment for 90 million users? And for a spiking growth curve, could they even requisition infrastructure fast enough to respond to demand?
  • Zynga mitigates this risk by launching games using Amazon's EC2. While EC2 is more expensive than Zynga's own datacenter, it's less expensive than doing a large infrastructure build out that sits unused, either because a game isn't popular or a game has a long ramp up time. Amazon is used as a small scale experimental apparatus that allows a mad scientist to pay only for the capacity used, yet it can still handle growth if a game becomes successful.
  • After a game matures it is brought in house to their private Z Cloud. Zynga leases datacenter space on the West and East coasts. They've built something they call the Z Cloud as their game infrastructure. When they can plot the slope of the growth curve for a game they can start capacity planning and buying datacenter space. The game is then moved into their Z Cloud.
  • The public and private cloud are one. Even when a game is moved into their private Z Cloud, parts of the game may reside on Amazon. The key is Zynga sees this as a hybrid environment. It's not a public cloud, or a private cloud, but one system to architect and manage. The strengths of each system are taken advantage of based on product lifecycle, costs, and feature requirements.
  • Like attracts like. The Z Cloud works something like Amazon internally, so it's not a disruptive jump between the clouds. Like Amazon their cloud is based on virtualization and automation. Thousands of physical servers can easily be deployed in a day. Servers are highly standardized to make this process simpler. 
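
To make the trade-off in the list above concrete, here is a minimal sketch of the arithmetic, not Zynga's actual method. It estimates the slope of a game's growth curve from daily server demand, then searches for the owned base that minimizes total cost when overflow is rented. The function names, the demand series, and both per-server-day rates are hypothetical, chosen only for illustration.

```python
from statistics import mean


def growth_slope(daily_servers):
    """Least-squares slope of demand, in servers per day, for equally spaced days."""
    xs = range(len(daily_servers))
    x_bar, y_bar = mean(xs), mean(daily_servers)
    num = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, daily_servers))
    den = sum((x - x_bar) ** 2 for x in xs)
    return num / den


def hybrid_cost(daily_servers, owned, own_rate, rent_rate):
    """Total cost of owning `owned` servers and renting whatever demand exceeds them.

    Owned servers are paid for every day, used or not; rented servers
    are paid for only on the days they are needed.
    """
    own_cost = owned * own_rate * len(daily_servers)
    rent_cost = sum(max(0, d - owned) for d in daily_servers) * rent_rate
    return own_cost + rent_cost


def best_owned_base(daily_servers, own_rate, rent_rate):
    """Brute-force the owned-server count with the lowest total cost."""
    candidates = range(max(daily_servers) + 1)
    return min(
        candidates,
        key=lambda k: hybrid_cost(daily_servers, k, own_rate, rent_rate),
    )


if __name__ == "__main__":
    # Hypothetical demand: a steady base of 10 servers with a one-day spike to 50.
    demand = [10, 10, 10, 10, 50]
    # Hypothetical rates: renting costs 3x owning per server-day.
    base = best_owned_base(demand, own_rate=1.0, rent_rate=3.0)
    print("owned base:", base)
    print("total cost:", hybrid_cost(demand, base, own_rate=1.0, rent_rate=3.0))
    print("slope:", growth_slope(demand))
```

With these made-up numbers the cheapest plan is to own the steady base and rent the spike. Before a game has a track record there is no demand series to feed into a model like this, which is the case for starting entirely on rented capacity.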

Zynga is, as ever, tight-lipped about the details of their infrastructure, but that doesn't really matter here. The idea of this easy bidirectional flow between clouds is a powerful one. Also, Zynga's proving ground approach doesn't preclude cloud bursting when appropriate. The flexibility for managing costs and risks is enormous. If only there were a cloud platform that could make this easier. Oh wait, there's...OpenStack.

Experiment Like MythBusters

I love the MythBusters as an example of going from small scale to large scale experiments. I talk about them a bit in What Should I Do? Choosing SQL, NoSQL or Both for Scalable Web Applications.


Reader Comments (2)

Zynga's strategy might seem to contradict the usual lifecycle, but it is guided by the same principles as "classic" cloud-bursting: "own the base, rent the spike". In Zynga's case, the base is unknown until the game has been in the field for a while. In other words, everything until the capacity planning has occurred is considered to be a spike.

May 22, 2011 | Unregistered CommenterShlomo Swidler

So now we know that Zynga was using CloudStack all along....

Zynga created one of the most famous hybrid clouds in existence, although there aren't really that many of them, and certainly not many at this scale. Zynga seems to be a real innovator when it comes to cloud computing and scalability. Zynga's hybrid cloud is called Z Cloud. Zynga uses CloudStack (Cloud.com) and RightScale. If their instances are at capacity they can also spin up servers running on Amazon WS (EC2) to take some of the load. Zynga can save money by hosting apps on their own infrastructure, finding a baseline and buying to that baseline, but if/when games go viral and they need more capacity, they can move servers to Amazon EC2 to handle the extra load. RightScale provides the single pane of glass interface for both Zynga's public EC2 resources and private CloudStack (Cloud.com) resources. This is possible because of CloudStack's CloudBridge, which supports Amazon WS REST interfaces.

CloudStack vs. OpenStack: Real World versus Emerging

What are your thoughts on CloudStack versus OpenStack?

November 25, 2011 | Unregistered CommenterBill Digman
