nosql

SQL + NoSQL = Yes !

High Scalability

21 Dec 2010 — 11 min read

This is a guest post by Frédéric Faure (architect at Ysance), you can follow him on twitter.

Data storage has always been one of the most difficult problems to address, especially as the quantity of stored data is constantly increasing. This is not simply due to the growing numbers of people regularly using the Internet, particularly with all the social networks, games and gizmos now available. Companies are also amassing more and more meticulous information relevant to their business, in order to optimize productivity and ROI (Return On Investment). I find the positioning of SQL and NoSQL (Not Only SQL) as opposites rather a shame: it’s true that the marketing wave of NoSQL has enabled the renewed promotion of a system that’s been around for quite a while, but which was only rarely considered in most cases, as after all, everything could be fitted into the « good old SQL model ». The reverse trend of wanting to make everything fit the NoSQL model is not very profitable either.

So, what’s new … and what isn’t?
A so-called SQL database is a structured relational storage system:

“structured” means that an ensemble of attributes (columns) which will contain data (values) will correspond to a single or composite key.
“relational” means that the key of a table can itself be the value of one of the attributes. Attributes in another table correspond to this key. A relationship is therefore established between these two tables and repetition of the same information in all the lines of the table is avoided. Functional features are structured and modeled in the database…

NoSQL is a structured database which enables access to stored data via a simple key. It’s somewhat like an extreme version of a de-standardized model (which is mentioned in SQL, by purposely repeating information in a table in order to limit the number of relationships to be taken into account in a request and therefore optimize the response time – as long as the act of repeating the information doesn’t overly increase the size of a line and therefore of the table).

Remember too that in NoSQL bases, the notion of indexing is not taken into account (hashtable’s mechanism) and that it is not possible to make conditional requests (with WHERE clauses): the values are recovered by the key and that’s it! There are some exceptions, such as Tokyo Cabinet’s API « Table », MongoDB, or even AWS’ SimpleDB,… which do enable such requests and deal with indexing… But pay attention to performance in those cases.

NoSQL OR/AND SQL?
As I said in the introduction, the Not Only SQL or quite simply key-value(s) bases are not a new idea.

All that was needed to re-launch the recipe was to find a great marketing name. It’s a little like the POJOs (Plain Old Java Objects). This acronym neatly refers to the ease of use of a Java Object, which does not implement a framework’s specific interface. Since this term was launched, there’s been a return to taking pleasure in the simple things in life (well, perhaps not all the time ;ob), a bit like it is with REST… It just needed good marketing.

Beware of the exaggerations of NoSQL salesmen who are riding the wave!

Each data-storage family has its use. I will return to the example I gave in a previous article concerning an architecture used in the casual gaming industry, and which uses both systems:

Storage of what could be called the meta-model, which is not intended for data-sharding (yes, the relational model makes it difficult to shard data on several servers), based on a structured, indexed relational base, containing general information about each user (with the following information in the case of a social application: name, top score, previous score, etc.). It is therefore possible to execute SQL-type conditional requests and thus recover information or compile statistics by using WHERE clauses.
Storage of more volatile data (such as gaming data, which implies a strong write:read ratio) and for which you much choose a real, structured, non-relational storage solution of the key-value(s) type, thus enabling easy data-sharding on X servers. Be careful in this case, as it is the ease of data-sharding in particular which encourages you to use this model, rather than the high write:read ratio. The write performances will be unitarily better with a NoSQL tool (as opposed to a SQL tool with an equivalent data set on a key-value(s) model), but it doesn’t do everything, as we will see later.

I’m deliberately not talking about cache systems at this point, even if the cache would be genuinely useful in this case for the elements which tend to be accessed in read, therefore for the SQL part. I will come back to the notion of cache at the end of the article.

It is clear that not all cases are suitable candidates for using key-value(s) storage systems, such as a company’s IT department which uses databases integrated into business workflows with a great many EAI and ETL processes. If it doesn’t have relational databases with integrated business / functional notions (object modelisation), I wouldn’t even attempt to touch it with a barge pole.

On the other hand, it is possible to operate a « key-value(s) » usage from relational databases (think SQL). However, can’t we opt for a NoSQL base which will enable us to tackle a simple problem (one key – x values) without getting into such considerations as tablespaces, indexing and other parameters, sometimes rather complex?

It’s precisely the need for functional features (you know… functional specifications’ documents, which are drawn up in response to a study of the clients’ requirements, and which tend to disappear in the wake of agile methodologies simplified in the extreme) which leads us to choose one or the other data-storage family.

And what about performance?

NoSQL? NoRead (Not Only Read)!
Most of the cases presented which emphasise « out of the box » NoSQL tools are based on cases with a low write:read ratio. That is, applications where high volumes of sharded data on multiple servers are made available to users in read, with few updates.

I recently read a very interesting blog post: Using MySQL as a NoSQL – A story for exceeding 750,000 qps on a commodity server. The author, working for a casual gaming company, DeNA, explains how they worked on MySQL by perfecting a very handy plugin called HandlerSocket:

HandlerSocket is a MySQL daemon plugin so that applications can use MySQL like NoSQL

Thus, the InnoDB engine can be requested in two different ways: by using the standard MySQL layer and thus carrying out complex requests, or else by bypassing the layer in order to execute recurring requests on indexed columns, doing away with SQL Parsing, Open Table, Query Plan and Close Table, not to mention security:

Like other NoSQL databases, HandlerSocket does not provide any security feature. HandlerSocket’s worker threads run with system user privileges, so applications can access to all tables through HandlerSocket protocols. Of course you can use firewalls to filter packets, like other NoSQL products.

In the end, you obtain a MySQL, or rather an InnoDB engine, which can respond just as well in standard (SQL) as a true NoSQL database. The performances presented here are very interesting (even a bonus, compared to a Memcached). This enables Memcached to be deleted and you therefore no longer have to manage the problem of data inconsistency between Memcached and the MySQL database, and you can make the best use of the RAM by only using the InnoDB buffer pool. Moreoever, you can also make the most of the quality of the InnoDB engine, and in the event of a crash, you don’t have to worry about volatile data. Even by playing around with the innodb_flush_log_at_trx_commit parameter to improve the write performances, a failure can still be acceptable.

But it is not a miracle solution! The limitation which caught my eye is the following:

No benefit for HDD bound workloads
For HDD i/o bound workloads, a database instance can not execute thousands of queries per second, which normally results in only 1-10% CPU usage. In such cases, SQL execution layer does not become bottleneck, so there is no benefit to use HandlerSocket. We use HandlerSocket on servers that almost all data fit in memory.

This means, therefore, that if the data don’t all (or nearly all) fit into the memory and that you must therefore create a reasonable number of disk accesses (read, or, even worse… write!) … It just doesn’t work, as for any kind of key-value(s) system, NoSQL or Not Only SQL. Take a look at a Redis which puts everything into memory and regularly dumps onto the disk, or even other key-value(s) tools such as Tokyo Tyrant / Tokyo Cabinet, which operate very much in memory.

In any case it no longer works « out of the box » and as for « pure player » (no, I’m not making fun of the marketing, I’m just teasing), well get ready to stay up all night! :o)

Moreover, in these cases, given that data remains relational, even if potentially accessible in NoSQL mode, it cannot be easily sharded among several servers.

There’s no real secret to it, as long as there are plenty of disks IOs (which you were not reading from the RAM, or, particularly that you were writing – because of the necessity of synchronizing the writes via mutex or semaphores), you will have to fine-tune the following at all levels:

at the network level (this point is not directly linked to disk IOs, but it is a potential bottleneck in general, including with the use of data in RAM): recycling of connexions, buffer management etc., in the /etc/sysctl.conf. You know the things in net.ipv4.tcp_fin_timeout, net.ipv4.tcp_tw_recycle, etc.
at the filesystem level (select a type: xfs, ext4, etc.) and mount options (noatime, nodiratime, etc.),
at the scheduler level: noop, cfq, etc.
remember to increase file descriptors (cf. ulimit),
etc.

And of course, don’t forget to properly configure the NoSQL database itself. It is usually delivered with a user guide (more or less effective, depending on the maturity of the tool) which explains the tuning which is not as stripped down as you might think. So configure the tool itself by following the supplied recommendations and by adding the fruits of your own experience: number of threads in the connexion pool, master/slave management, etc.

Don’t forget your bandwith either. Well … you will have to tinker with the engine. And once you have optimized all the levels, you’ll have to think of sharding again.

Don’t underestimate the importance of the functional analysis of the data that you will put in the NoSQL database either. Well-structured and well-researched data will economize your resources (especially the network resources). Functional specifications are often a major focus of optimization.

I already encountered these problems when using Tokyo Tyrant / Tokyo Cabinet with a high write:read ratio. Ultimately, the tool is very efficient on reads, manage a lot of things in memory, even in the case where you choose disk storage (you can choose an on-memory storage, like a Memcached), however, when it comes to concurrent and intensive writes, there are limits (acknowledged by the creator of the tool himself, who produces concurrently an optimized version in this sense: Kyoto Cabinet – storage API – and Kyoto Tycoon – network interface). I haven’t looked at how the Kyoto Team works, but the Tokyo One writes with only a single thread at a time, for synchronization problems. The previously mentioned aspects should therefore be optimized to obtain the best results.

For further information on Tokyo Tyrant / Tokyo Cabinet, you can refer to the following article: Tokyo Tyrant / Tokyo Cabinet, un key-value store à la Japonaise (in French). Think lightweight, think Lua! :o)

And … what about performance?
To summarize the previous point, it is simple to observe that a key-value(s)-type database will provide superior performances because it has fewer functionalities and therefore fewer stages to satisfy: authentication for example, which is not managed in key-value(s) databases, or even SQL Parsing, Open Table, Query Plan and Close Table management, which are eliminated. No need for managing those hard-to-reconstruct indexes either. In general, a key-value(s) database is based on a HashTable-type storage mechanism, whose complexity function is O(1), a function representing the complexity of access to the data as a function of the number of occurrences N stored in the system: for example, for a B+tree the complexity function is O(log N). Therefore O(1) represents a constant access time, no matter what the number of occurrences N is. It is obvious that, based on this principle, the NoSQL base will perform better because it will restrict itself to the specific task attributed to it: take key, send back value(s).

However, it must be remembered that, whatever the database family used, you won’t go any faster than the system! There is no miracle solution! The RAM is faster than the disk and you won’t write faster than the filesystem… what a surprise! Hey! Maybe, you could try to directly access the raw device, or even better: have a filesystem based on a key-value(s) system… What? The filesystem idea has already been taken? Well ok… Take a look at this article: Pomegranate – Storing Billions And Billions Of Tiny Little Files.

And what about Memcached?!
A cache will always be useful, no matter what type of database you’re using (SQL or NoSQL) and will always be used in the case of data accessed essentially in read.

NoSQL databases often work in memory and even offer full memory modes (like a cache but without TTL and with the possibility of replication on a slave which writes to disk). The SQL databases also manage their request caches in RAM, like InnoDB Buffer Pool of the InnoDB MySQL engine. However, even with a cache that is sufficiently dimensioned, the requests still cost them processing time (CPU), bandwith, etc.

The purpose of a pure cache (with TTL hit management) in front will be to relieve the database, no matter which kind, of the burden of any read possible, in order to let it concentrate on the write requests or sending back the more dynamic/volatile data in read.

So, a cache can be placed equally in front of an SQL or a NoSQL base in order to economize resources. They are however more often found in front of SQL-type databases, because, in this case, as well as saving resources, the cache based on the model of a NoSQL database enables improved performance (even if the whole dataset fits into the SQL server’s RAM) because you get rid of authentication, parsing of the SQL request, etc.

Conclusion
There are several elements to bear in mind from this comparison of the two models:

NoSQL databases are in general faster than SQL, as they implement only the mechanisms that they need and leave out all that’s necessary for an SQL model to function (we must thereby accept a certain number of functional limits, particularly with regard to conditional requests with nice WHERE clauses).
Special mention, while we’re on the subject, goes to the security of NoSQL databases where, in most cases, it is assured by network filtering and there is no authentication management.
As the key-value(s) model is not relational, it enables easy data-sharding among several servers. It is possible to use an SQL-type database on a « key-value(s) » model.
Whether SQL or NoSQL is the selected tool, if the dataset fits in memory, performance will be much enhanced.
As soon as disk accesses become necessary, whether because the dataset in read does not fit in memory or because there is a high number of writes, performance will be poorer, and a bottleneck will occur at the OS/hardware level via the disk IOs.
Configuration of the OS is indispensable for optimizing the disk IOs and the network IOs aspects.
Don’t underestimate the importance of the functional specifications of the data that you will load into the NoSQL database, either. Well-structured and well-researched data will economize your resources (particularly network resources). These specifications are often a major focus of optimization.
Data integrity: consider the reliability of the SQL database engines which have been tested for quite some time now. There is not yet much feedback available regarding the reliability of the NoSQL engines (which does not mean that it is not satisfactory, but experience counts in this area).
In the HandlerSocket example, the point is above all to unify the data cache in a single location (the InnoDB buffer pool) in order to ensure the consistency of the data without necessarily compromising on the performance provided by a Memcached.
There is an interesting variety of NoSQL databases with specific functionalities, but be careful: having extra functionalities means less efficient response times than with the standard key-value(s). Take for example the API « Table » of Tokyo Cabinet, which enables management of conditional requests, or even the geospatial functionalities of MongoDB.
An amusing equation: NoSQL database = Memcached – TTL + synchronization of writes on disk.

SQL + NoSQL = Yes !

High Scalability

Read more

Kafka 101

Capturing A Billion Emo(j)i-ons

Brief History of Scaling Uber

Behind AWS S3’s Massive Scale