
Scaling PostgreSQL using CUDA

Combining GPU power with PostgreSQL

PostgreSQL is one of the world's leading Open Source databases, and it provides enormous flexibility as well as extensibility. One of its key features is that users can define their own procedures and functions in practically any popular programming language. These functions make it easy to run almost arbitrary code on the server side.

None of this extensibility is new, of course. So what does it have to do with scaling? Well, imagine a world where the data in your database and enormous computing power are tightly integrated – a world where the data inside your database has direct access to hundreds of FPUs. Welcome to the world of CUDA, NVIDIA's way of making the power of graphics cards available to ordinary high-performance applications.

When it comes to complex computations, the database can very well turn out to be the bottleneck. Depending on your application, adding more CPU power may not improve overall performance at all – simply because moving data from the database to the units that actually do the computation is far too slow (perhaps because of remote calls and so on). Especially when data flows over a network, copying large amounts of it is limited by latency or plain bandwidth. What if this bottleneck could be avoided?

CUDA is C / C++

Basically, a CUDA program is simply a C program with some small extensions. The CUDA toolchain transforms your CUDA code into normal C code, which can then be compiled and linked nicely with existing code. This also means that CUDA code can easily be used inside a PostgreSQL stored procedure. The advantages of this mechanism are obvious:

- GPUs can perform matrix and floating-point operations hundreds of times faster than any CPU
- the GPU is used inside the database, so no data has to be transported over slow links
- basically any NVIDIA graphics card can be used
- you get enormous computing power at virtually zero cost
- you can even build functional indexes on top of CUDA stored procedures
- fewer boxes are needed, because a single box is far faster
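To make the "C with small extensions" point concrete, here is a minimal, self-contained CUDA program – my own sketch, not from the article. A kernel squares every element of an array, and the host code copies data over, launches it, and copies the result back; the kernel name and launch geometry are arbitrary choices.

```cuda
#include <stdio.h>
#include <cuda_runtime.h>

/* each thread squares exactly one array element */
__global__ void square(float *data, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;

    if (i < n)
        data[i] = data[i] * data[i];
}

int main(void)
{
    const int n = 256;
    float host[256], *dev;

    for (int i = 0; i < n; i++)
        host[i] = (float) i;

    /* copy to the device, run the kernel, copy the result back */
    cudaMalloc((void **) &dev, n * sizeof(float));
    cudaMemcpy(dev, host, n * sizeof(float), cudaMemcpyHostToDevice);
    square<<<1, n>>>(dev, n);
    cudaMemcpy(host, dev, n * sizeof(float), cudaMemcpyDeviceToHost);
    cudaFree(dev);

    printf("host[3] = %.1f\n", host[3]);
    return 0;
}
```

Compiling with `nvcc square.cu -o square` shows the transformation described above in action: nvcc compiles the device part itself and hands the host part to the system C compiler.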

How to make it work?

How can all of this be made to work? The goal of this simplistic example is to generate a set of random numbers on the CPU, copy it to the GPU, and make the code callable from PostgreSQL.

Here is the function to generate random numbers and to copy them to the GPU:

    /* implement random generator and copy to CUDA */
    nn_precision *
    generate_random_numbers(int number_of_values)
    {
        nn_precision *cuda_float_p;

        /* allocate host memory and CUDA memory */
        nn_precision *host_p = (nn_precision *) palloc(
            sizeof(nn_precision) * number_of_values);
        CUDATOOLS_SAFE_CALL( cudaMalloc( (void **) &cuda_float_p,
            sizeof(nn_precision) * number_of_values) );

        /* create random numbers */
        for (int i = 0; i < number_of_values; i++)
            host_p[i] = (nn_precision) drand48();

        /* copy data to CUDA and return pointer to CUDA structure */
        CUDATOOLS_SAFE_CALL( cudaMemcpy(cuda_float_p, host_p,
            sizeof(nn_precision) * number_of_values,
            cudaMemcpyHostToDevice) );

        pfree(host_p);
        return cuda_float_p;
    }

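The article stops once the numbers are on the device. As a hypothetical next step – none of this is from the original – a kernel could now process them in place. The standalone sketch below assumes `nn_precision` is a typedef for `float` (as `cuda_tools.h` presumably defines it), substitutes `malloc` for the server's `palloc`, and simply doubles every value:

```cuda
#include <stdio.h>
#include <stdlib.h>
#include <cuda_runtime.h>

typedef float nn_precision;    /* assumption: what cuda_tools.h defines */

/* hypothetical kernel: double every random value in place */
__global__ void scale_values(nn_precision *data, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;

    if (i < n)
        data[i] = data[i] * 2.0f;
}

int main(void)
{
    const int n = 1000;
    nn_precision *host_p = (nn_precision *) malloc(n * sizeof(nn_precision));
    nn_precision *cuda_float_p;

    /* host-side analogue of generate_random_numbers() above */
    for (int i = 0; i < n; i++)
        host_p[i] = (nn_precision) drand48();

    cudaMalloc((void **) &cuda_float_p, n * sizeof(nn_precision));
    cudaMemcpy(cuda_float_p, host_p, n * sizeof(nn_precision),
               cudaMemcpyHostToDevice);

    /* enough 256-thread blocks to cover all n values */
    scale_values<<<(n + 255) / 256, 256>>>(cuda_float_p, n);

    /* fetch one value back to verify the kernel ran */
    nn_precision check;
    cudaMemcpy(&check, cuda_float_p, sizeof(nn_precision),
               cudaMemcpyDeviceToHost);
    printf("first value %f scaled to %f\n", host_p[0], check);

    cudaFree(cuda_float_p);
    free(host_p);
    return 0;
}
```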
Now we can go and call this function from a PostgreSQL stored procedure:

/* import postgres internal stuff */

#include "postgres.h"

#include "fmgr.h"

#include "funcapi.h"

#include "utils/memutils.h"

#include "utils/elog.h"

#include "cuda_tools.h"


/* prototypes to silence compiler */

extern Datum test_random(PG_FUNCTION_ARGS);

    /* define function to allocate N random values (0 - 1.0)
       and put them onto the CUDA device */
    PG_FUNCTION_INFO_V1(test_random);

    Datum
    test_random(PG_FUNCTION_ARGS)
    {
        int             number = PG_GETARG_INT32(0);
        nn_precision   *p = generate_random_numbers(number);

        PG_RETURN_POINTER(p);
    }
This code can now be compiled nicely, just like any other PostgreSQL C extension. The test_random function can then be called like this:

SELECT test_random(1000);
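Before that SELECT can work, the C function has to be registered with the server. The exact declaration depends on how the extension was built; assuming the shared library is called test_random and keeping the return type deliberately vague, the registration could look roughly like this:

```sql
-- hypothetical registration; library name and return type depend on your build
CREATE OR REPLACE FUNCTION test_random(integer) RETURNS integer
    AS 'test_random', 'test_random'
    LANGUAGE C STRICT;
```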

Of course, this is just a brief introduction to show how things can be done in practice. A more realistic application will need more thought and can be integrated even more closely into the database.

More information:
Professional CUDA programming
Professional PostgreSQL services
The official PostgreSQL Website
The official CUDA site

Reader Comments (6)

This is really interesting. I wonder how this can be used to speed up pgsql databases with TBs of data. I wonder if the Hadoop project has talked about integrating with CUDA to improve performance?

November 29, 1990 | Unregistered CommenterSteve C.

This looks really cool! I wonder how big the real-life effect will be though. Our apps are generally I/O and not CPU bound...

November 29, 1990 | Unregistered CommenterJan

Unless the Hadoop jobs are prepping huge data sets for post-processing by a GPU cluster, most of the time the Hadoop jobs will be I/O bound.

In principle this sounds like a great idea but IRL I'm not so sure it will be useful.

November 29, 1990 | Unregistered CommenterAaron deMello

hey man, your blog should indent code examples :D
it is hard to read. and colours could be nice too (but for me, indent is more important than fancy colours)

otherwise nice examples

November 29, 1990 | Unregistered CommenterAnonymous

CUDA is pretty much dead. OpenCL will replace CUDA and is cross vendor even though Nvidia does command GPGPU market pretty much.

It is probably better to think about implementing Database GPU Acceleration in OpenCL rather than CUDA for future code reuse

November 29, 1990 | Unregistered CommenterTS

We tried to implement a continuous map reduce framework on CUDA. Applications like these can use this framework to take advantage of the CUDA architecture without learning CUDA.


November 29, 1990 | Unregistered CommenterAnonymous
