End-To-End Performance Study of Cloud Services

Cloud computing promises a number of advantages for the deployment of data-intensive applications. Most prominently, these include reduced cost through a pay-as-you-go pricing model and (virtually) unlimited throughput by adding servers as the workload increases. At the Systems Group, ETH Zurich, we conducted an extensive end-to-end performance study to compare the major cloud offerings with respect to their ability to fulfill these promises and the cost they imply.

The focus of the work is on transaction processing (i.e., read and update workloads) rather than on analytical workloads. We used TPC-W, a standardized benchmark that simulates a web shop, as the baseline for our comparison. TPC-W specifies that users are simulated through emulated browsers (EBs), which issue page requests, called web interactions (WIs), against the system. As a major modification to the benchmark, we continuously increase the load from 1 to 9000 simultaneous users to measure the scalability and cost variance of each system. Figure 1 shows an overview of the different combinations of services we tested in the benchmark.
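To make the setup concrete, the following sketch shows how such a load ramp could be driven in principle. It is not the benchmark driver we used; the URL, ramp-up schedule, and think time are illustrative assumptions, and a real TPC-W emulated browser follows the benchmark's full page mix and think-time distribution.

```python
import time
import threading
import urllib.request

BASE_URL = "http://shop.example.com/tpcw"  # placeholder, not the real system under test
THINK_TIME_S = 7.0                         # illustrative think time between web interactions
completed_wi = 0                           # completed web interactions (WIs)
lock = threading.Lock()

def emulated_browser(stop_event):
    """One emulated browser (EB): issue web interactions in a loop with think time."""
    global completed_wi
    while not stop_event.is_set():
        try:
            urllib.request.urlopen(BASE_URL, timeout=30).read()
            with lock:
                completed_wi += 1
        except OSError:
            pass  # failed or dropped requests do not count as completed WIs
        time.sleep(THINK_TIME_S)

def run_ramp(max_ebs=9000, step=100, step_duration_s=60):
    """Increase the number of concurrent EBs step by step and report WIPS per step."""
    global completed_wi
    stop = threading.Event()
    browsers = []
    for target in range(step, max_ebs + 1, step):
        while len(browsers) < target:
            t = threading.Thread(target=emulated_browser, args=(stop,), daemon=True)
            t.start()
            browsers.append(t)
        with lock:
            completed_wi = 0
        time.sleep(step_duration_s)
        with lock:
            wips = completed_wi / step_duration_s
        print(f"{target} EBs -> {wips:.1f} WIPS")
    stop.set()

if __name__ == "__main__":
    run_ramp()
```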

Figure 1: Systems Under Test

The main results are shown in Figure 2 and Tables 1 and 2, and they are surprising in several ways. Most importantly, it appears that all major vendors have adopted different architectures for their cloud services (e.g., master-slave replication, partitioning, distributed control, and various combinations thereof). As a result, the cost and performance of the services vary significantly depending on the workload. A detailed description of the architectures is provided in the paper. Furthermore, only two architectures, the one implemented on top of Amazon S3 and the one using MS Azure with SQL Azure as the database, were able to scale to and sustain our maximum workload of 9000 EBs, resulting in over 1200 web interactions per second (WIPS). MySQL installed on EC2 and Amazon RDS are able to sustain a maximum load of approximately 3500 EBs. MySQL Replication performed similarly to standalone MySQL with EBS, so we omitted it from the figure. Figure 2 shows that the WIPS of Amazon's SimpleDB grow up to about 3000 EBs and more than 200 WIPS. In fact, SimpleDB was already overloaded at about 1000 EBs and 128 WIPS in our experiments; at this point, all write requests to hot spots failed. Google AppEngine already dropped out at 500 emulated browsers and 49 WIPS, mainly because Google's transaction model is not built for such write-heavy workloads.

When implementing the benchmark, our policy was to always use the highest consistency guarantees offered, i.e., those that come closest to the TPC-W requirements. Thus, in the case of AppEngine, we used the transaction model offered inside an entity group. However, it turned out that this significantly slows down overall performance. We are now in the process of re-running the experiment without transaction guarantees and are curious about the new performance results.
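The slow-down can be understood with a simple model: datastore-style transactions serialize writes within an entity group, so a hot group processes roughly one commit at a time. The following back-of-the-envelope sketch is our own simplification with an assumed commit latency, not measured AppEngine behavior; it only illustrates how the achievable write throughput depends on how many entity groups the writes spread across.

```python
def max_write_throughput(num_hot_groups, commit_latency_s=0.1):
    """Upper bound on sustainable writes/s if each entity group commits serially.

    commit_latency_s is an assumed per-transaction commit time, not a measured value.
    """
    per_group = 1.0 / commit_latency_s      # one commit at a time per entity group
    return num_hot_groups * per_group

# A workload that funnels most writes into one hot entity group is capped at
# ~10 writes/s under this assumption, regardless of how many EBs issue requests.
for groups in (1, 10, 100):
    print(f"{groups:>3} hot entity group(s): <= {max_write_throughput(groups):.0f} writes/s")
```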

Figure 2: Comparison of Architectures [WIPS]

Table 1 shows the total cost per web interaction in milli-dollars (m$) for the alternative approaches under varying load (EBs). Google AE is the cheapest for low workloads (below 100 EBs), whereas Azure is the cheapest for medium to large workloads (more than 100 EBs). The three MySQL variants (MySQL, MySQL/R, and RDS) have (almost) the same cost as Azure for medium workloads (EB = 100 and EB = 3000), but they are not able to sustain large workloads.

Table 1: Cost per WI [m$], Vary EB

The success of Google AE for small loads has two reasons. First, Google AE is the only variant that has no fixed costs; there is only a negligible monthly fee for storing the database. Second, at the time these experiments were carried out, Google gave a quota of six CPU hours per day for free. That is, applications that are below or only slightly above this daily quota are particularly cheap.
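As a rough illustration (the CPU-hour price below is a placeholder, not Google's actual rate at the time), the daily CPU bill only kicks in once usage exceeds the free quota:

```python
def appengine_cpu_cost_per_day(cpu_hours_used, free_quota_hours=6.0, price_per_cpu_hour=0.10):
    """Daily CPU cost with a free quota; price_per_cpu_hour is an assumed placeholder."""
    billable = max(0.0, cpu_hours_used - free_quota_hours)
    return billable * price_per_cpu_hour

for hours in (2, 6, 8, 24):
    print(f"{hours:>2} CPU hours/day -> ${appengine_cpu_cost_per_day(hours):.2f}")
```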

Azure and the MySQL variants win for medium and large workloads because all these approaches can amortize their fixed costs at these workloads. SQL Azure has a fixed cost of USD 100 per month for a database of up to 10 GB, independent of the number of requests the database needs to process. For MySQL and MySQL/R, EC2 instances must be rented in order to keep the database online. Likewise, RDS involves an hourly fixed fee, so the cost per WI decreases as the load increases. It should be noted that network traffic is cheaper with Google than with both Amazon and Microsoft.
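The amortization effect is easy to see with a toy cost model. The numbers below are illustrative placeholders, not the measured prices behind Tables 1 and 2: a flat fee (for a rented instance or database) is divided over however many web interactions are actually served, while a per-request fee stays constant per WI.

```python
def cost_per_wi_milli_dollar(wips, fixed_cost_per_day, variable_cost_per_wi):
    """Cost per web interaction in m$ for a service with fixed plus per-request cost."""
    wi_per_day = wips * 24 * 3600
    total = fixed_cost_per_day + variable_cost_per_wi * wi_per_day
    return 1000.0 * total / wi_per_day

# Hypothetical service: $8/day in fixed instance/database fees, $0.0001 per WI.
# The cost per WI converges towards the variable cost as the load grows.
for wips in (1, 10, 100, 1000):
    print(f"{wips:>4} WIPS -> {cost_per_wi_milli_dollar(wips, 8.0, 0.0001):.3f} m$/WI")
```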

Table 2 shows the total cost per day for the alternative approaches under varying load (EBs). (A "-" indicates that the variant was not able to sustain the load.) These results confirm the observations made previously: Google wins for small workloads and Azure wins for medium and large workloads. All the other variants lie somewhere in between. The three MySQL variants come close to Azure over the range of workloads that they can sustain; Azure and the three MySQL variants roughly share the same architectural principles (replication with a master-copy architecture). SimpleDB is an outlier in this experiment: with the current pricing scheme, SimpleDB is an exceptionally expensive service. For a large number of EBs, the high cost of SimpleDB is particularly annoying because users must pay even though SimpleDB drops many requests and is not able to sustain the workload.

Table 2: Total Cost per Day [$], Vary EB

Turning to the S3 cost in Table 2, the total cost grows linearly with the workload. This behavior is exactly what one would expect from a pay-as-you-go model. For S3, the high cost is matched by high throughput, so the high cost of S3 at high workloads is tolerable. This observation is in line with the good cost/WI metric for S3 at high workloads (Table 1). Nevertheless, S3 is indeed more expensive than all the other approaches (except for SimpleDB) for most workloads. This phenomenon can be explained by Amazon's pricing model for EBS and S3: for instance, a write operation to S3 is a hundred times more expensive than a write operation to EBS, which is used in the MySQL variants. Amazon can justify this difference because S3 supports concurrent updates with an eventual consistency policy, whereas EBS only supports a single writer (and reader) at a time.
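A small calculation makes both points visible: under per-request pricing the daily write cost grows linearly with WIPS, and a 100x higher per-write price translates directly into a 100x higher bill. The prices and the write fraction below are hypothetical values chosen only to mirror the ratio mentioned above; they are not Amazon's actual EBS or S3 rates.

```python
def daily_write_cost(wips, write_fraction, price_per_write):
    """Daily cost attributable to write operations under per-request pricing."""
    writes_per_day = wips * 24 * 3600 * write_fraction
    return writes_per_day * price_per_write

# Assumed values: 30% of web interactions trigger a write; S3 write priced 100x an EBS write.
EBS_WRITE, S3_WRITE = 0.00000001, 0.000001
for wips in (100, 600, 1200):
    print(f"{wips:>4} WIPS: EBS writes ~${daily_write_cost(wips, 0.3, EBS_WRITE):.2f}/day, "
          f"S3 writes ~${daily_write_cost(wips, 0.3, S3_WRITE):.2f}/day")
```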

In addition to the results presented here, the paper also compares the overload behavior of the services and presents the individual cost factors that lead to these numbers. If you are interested in these results and in additional information about the test setup, the paper will be presented at this year's SIGMOD conference and can also be downloaded here.