NuoDB's First Experience: Google Compute Engine - 1.8 Million Transactions Per Second

This is a repost of the blog entry written by NuoDB's Tommy Reilly.

We at NuoDB were recently given the opportunity to kick the tires on the Google Compute Engine by our friends over at Google. You can watch the entire Google Developer Live Session by clicking here. In order to assess the capabilities of GCE we decided to run the same YCSB-based benchmark we ran at our General Availability Launch back in January. For those of you who missed it, we demonstrated running the YCSB benchmark on a 24-machine cluster running on our private cloud in the NuoDB datacenter. The salient results were 1.7 million transactions per second with sub-millisecond latencies.

Public cloud environments typically mean virtualization, inconsistent network performance, and potentially slow or low-bandwidth disk access. It just so happens that NuoDB was designed to work well in such harsh environments (we don’t call it a cloud database for nothing). Still, the faster the CPU, network, and disk, the faster the database can operate (thank you, captain obvious!). But before we dig into the numbers, let’s talk about what it’s like to run NuoDB in the GCE.

Pretty much everything you’d want to do in the GCE is done with a tool called gcutil. First you authenticate with gcutil auth, and through the magic of OAuth your GMail account is used to generate a password that you can cut & paste into gcutil. Then you can say goodbye to dealing with access control. gcutil remembers who you are and sets up the necessary ssh credentials, so that you can log into any of your instances automatically from that machine. And better yet, when you ssh to an instance you haven’t logged into before, a home directory is created for you automatically with your ssh credentials installed so you can then ssh from that machine into any of your other instances. In some ways GCE is easier to deal with than our private cloud (but don’t tell our sys admin I said that).
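In practice that whole handshake is just a couple of commands; here's a minimal sketch (listinstances is one of the standard gcutil subcommands, not something specific to our demo):

gcutil auth            # one-time OAuth dance; paste the generated code back into gcutil
gcutil listinstances   # list the instances in your project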

Another thing you won’t have to deal with is IP addresses: gcutil knows your instance names, so instead of having to go hunting for addresses you just do:

gcutil ssh instance_name

It’s that easy. The gcutil command does all the things you’d expect: you can add, list, and delete instances, images, firewalls, networks, and disks. For those who like to see real code, here’s a snippet of our script to create a NuoDB GCE image:

gcutil addinstance --machine_type=n1-highmem-8 \
  --image=projects/google/global/images/gcel-12-04-v20130225 \
  --service_account_scopes=https://www.googleapis.com/auth/devstorage.full_control nuodb-stage
gcutil push nuodb-stage nuodb-1.0.1.linux.x64.deb /tmp/
gcutil ssh nuodb-stage << EOF
sudo apt-get -y install openjdk-7-jre-headless
sudo dpkg -i /tmp/nuodb-1.0.1.linux.x64.deb
sudo python /usr/share/imagebundle/image_bundle.py -r / -o /tmp --output_file_name=nuodb-1.0.1.image.tar.gz
gsutil cp /tmp/nuodb-1.0.1.image.tar.gz gs://nuodb_images/
EOF
gcutil addimage --preferred_kernel=projects/google/global/kernels/gce-v20130225 \
  nuodb-101 gs://nuodb_images/nuodb-1.0.1.image.tar.gz

We’re working on making a pre-built image public so that last line works for everyone. In the meantime, check out our Developer Download Page for details on how to get the product. Doing an install is really easy!

What this does is create an instance, push (scp under the covers) the NuoDB .deb to it (GCE supports a couple of different Ubuntu- and CentOS-based distros), install the JRE, and then install NuoDB. The image is then saved for reuse by bundling the raw disk image and uploading it to Google Storage (this is why the --service_account_scopes argument to addinstance is there; it lets me use gsutil from the instance without having to authenticate). For more details see the great gcutil docs.
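With the image registered, bringing up a node is a one-liner per instance. Something roughly like this (the instance names are placeholders; the image name comes from the addimage call above) is enough to stage the fleet described next:

for i in $(seq 1 32); do
  gcutil addinstance --machine_type=n1-highmem-8 --image=nuodb-101 nuodb-$i
done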

After getting an image together we staged our benchmark by creating 32 instances. The domain has to be formed manually by adding the following lines to default.properties on instances 2-32:

broker = false
agent = nuodb-1

We’re not supposed to talk about unimplemented features, but my spidey sense tells me this step will go away in the future, most likely by exploiting the GCE metadata mechanism.
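Until then, the edit is easy enough to script with the same gcutil plumbing used above. A rough sketch, assuming instances named nuodb-1 through nuodb-32 and the stock .deb's config location (the /opt/nuodb/etc path is our assumption, not something shown above):

for i in $(seq 2 32); do
gcutil ssh nuodb-$i << EOF
# append the domain settings; path assumed from the package layout
echo 'broker = false' | sudo tee -a /opt/nuodb/etc/default.properties
echo 'agent = nuodb-1' | sudo tee -a /opt/nuodb/etc/default.properties
EOF
done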

From there we exploited GCE’s shared read-only disk feature to get our benchmark deployed to every instance, and set up one Storage Manager (with two disks attached, for atom storage and the journal) and 32 Transaction Engines.
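For completeness, provisioning the Storage Manager's disks looks roughly like this. The disk names, sizes, and zone are placeholders, and the --disk attachment syntax is our reading of the gcutil docs rather than a copy of our actual script:

gcutil adddisk --size_gb=500 --zone=us-central1-a nuodb-archive   # atom storage
gcutil adddisk --size_gb=100 --zone=us-central1-a nuodb-journal   # journal
gcutil addinstance --machine_type=n1-highmem-8 --image=nuodb-101 \
  --disk=nuodb-archive --disk=nuodb-journal nuodb-sm              # the lone Storage Manager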

So without further ado, we can report that NuoDB and GCE achieved 1.8 million transactions per second using our multi-client YCSB setup on workload B (5% updates) on 32 nodes. Read and update latencies were in the 1-millisecond range. We did not run the test with journaling enabled because we’re still working on getting high I/O performance against a GCE virtual disk, but we’ll publish those numbers soon. In the meantime, let us know what your database needs look like (workload mix, record sizes, database size, etc.) so we can run more benchmarks covering the right use cases.
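For reference, workload B is YCSB's 95% read / 5% update mix. A single-client run against NuoDB through YCSB's generic JDBC binding looks something like the sketch below; the host, database name, credentials, and thread count are illustrative (our harness drove many clients in parallel), and the JDBC binding is an assumption rather than the exact driver setup we used:

./bin/ycsb load jdbc -P workloads/workloadb \
  -p db.driver=com.nuodb.jdbc.Driver \
  -p db.url=jdbc:com.nuodb://nuodb-1/test \
  -p db.user=dba -p db.passwd=secret
./bin/ycsb run jdbc -P workloads/workloadb -threads 64 \
  -p db.driver=com.nuodb.jdbc.Driver \
  -p db.url=jdbc:com.nuodb://nuodb-1/test \
  -p db.user=dba -p db.passwd=secret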

These results demonstrate clearly that NuoDB and GCE work very well together and we’re excited about adding further support for GCE in future releases.