Monday, June 6, 2011

NoSQL Pain? Learn How to Read/write Scale Without a Complete Re-write

Lately I've been reading about more cases where people have started to realize the limits of NoSQL's promise of database scalability. Note the references below:

Take MongoDB for example. It's damn fast, but it doesn't really know how to save data reliably to disk. I've had it set up in a replica pair to mitigate that risk. Guess what - both servers in the pair failed and corrupted their data files at the same day.

It appears that for many, the switch to NoSQL can be rather painful. IMO that doesn't necessarily mean that NoSQL is wrong in general, but it's a combination of 1) lack of maturity 2) not the right tool for the job.

That raises the question: what's the alternative?

In the following post I tried to summarize the lessons from a presentation by Ronnie Bodinger (Head of IT at Avanza Bank AB) on how they turned their current read-mostly scale architecture into a full read/write scale architecture without completely rewriting their existing application, and while keeping the database as is.

The lessons learned:

  • Minimize the change by clearly identifying the scalability hotspots
  • Keep the database as is
  • Put an In Memory Data Grid as a front end to the database
  • Use write-behind to reduce the synchronization overhead
  • Use O/R mapping to map the data back into its original format
  • Use standard Java APIs and frameworks to leverage the existing skill set
  • Use two parallel (old/new) sites to enable gradual transition
  • Use RAM for high performance access and disk for long term storage
  • Use a commodity database and hardware
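
To make the "in-memory front end plus write-behind" lessons a bit more concrete, here is a minimal sketch in plain Java. It is an illustration under assumptions, not the Avanza implementation and not any data grid product's API: `BackingStore` and `WriteBehindCache` are hypothetical names, and the "database" is whatever you plug in behind the interface. Reads and writes hit RAM only; the database sees one batched write per flush.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.LinkedBlockingQueue;

/** Hypothetical stand-in for the existing database, which stays as is. */
interface BackingStore {
    void saveAll(Map<String, String> batch);
}

/** Hypothetical in-memory front end that defers database writes. */
class WriteBehindCache {
    private final Map<String, String> memory = new ConcurrentHashMap<>();
    private final BlockingQueue<String> dirtyKeys = new LinkedBlockingQueue<>();
    private final BackingStore store;

    WriteBehindCache(BackingStore store) {
        this.store = store;
    }

    /** Writes go to RAM and are only queued for the database. */
    void put(String key, String value) {
        memory.put(key, value);
        dirtyKeys.add(key);
    }

    /** Reads are served from RAM. */
    String get(String key) {
        return memory.get(key);
    }

    /**
     * Drains the queued keys and persists their latest values in one batch.
     * Repeated writes to the same key collapse into a single row.
     * Returns the number of rows sent to the store.
     */
    int flush() {
        List<String> keys = new ArrayList<>();
        dirtyKeys.drainTo(keys);
        Map<String, String> batch = new LinkedHashMap<>();
        for (String key : keys) {
            batch.put(key, memory.get(key));
        }
        if (!batch.isEmpty()) {
            store.saveAll(batch);
        }
        return batch.size();
    }
}
```

In a real deployment `flush()` would run on a background schedule (for example via a `ScheduledExecutorService`), and the pending-write queue would be replicated to a backup node so that a crash does not lose unflushed updates. This sketch leaves both out to keep the write-behind idea itself visible.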

For a more detailed explanation read more here.

Reader Comments (6)

Finally, some discussion about when NoSQL breaks.

Great that it's fast, but how do I fix it when it falls over?

June 6, 2011 | Unregistered Commentersime

This is a strange conflation of mongodb with "nosql."

A single mongodb bug does not justify any broad conclusions about anything, except perhaps the quality of mongodb.

June 6, 2011 | Unregistered CommenterDave R

@Dave, 100% agreed. Mongo != NoSQL. NoSQL = {Mongo, Redis, Couch, Cassandra, etc...}

I've been a MySQL fan since mSQL and MySQL back in 1997; stable, reliable, actually pretty fast when architected properly, and somewhat read-scalable with replication and write-scalable with NDB. However, maintenance can get to be a chore (as with any SQL solution) with more than five or six nodes. I've also been a NoSQL convert for many years, but even fantastic solutions like Redis are still pretty immature. (For instance, Redis' Virtual Memory is now something that Salvatore is backing away from, and with good reason, and running out of memory is not dealt with gracefully.)

(Most) NoSQL solutions are terrific, as is MySQL. They both have a very real place in the data center. Keep in mind that things like MySQL are actually built ultimately on top of a NoSQL solution. For instance, MySQL is built on top of Berkeley DB (Sleepycat), which was the key-value de facto standard DB for years before the term NoSQL was even invented.

If I had to choose just one, I'd choose MySQL but that's because it's kind of a swiss army knife and can do a lot. Fortunately, I don't have to choose just one. ;-)

June 7, 2011 | Unregistered CommenterJamieson Becker

"doesn't really know how to save data reliably to disk"
This obviously refers to a very old MongoDB Version. This (what you call) "bug" is long gone.

"Guess what - both servers in the pair failed and corrupted their data files at the same day"
This is not NoSQL specific. Replica sets in different environments would be a good idea..

June 8, 2011 | Unregistered CommenterStefan Edlich

To trash NoSql for selling GigaSpaces is so uncool!
Now imagine the data grid failing when write-behind is way behind :(
I rather go with write to disks solution every day.

June 9, 2011 | Unregistered CommenterAdi

Adi

Do you call that trashing?

"It appears that for many, the switch to NoSQL can be rather painful. IMO that doesn't necessarily mean that NoSQL is wrong in general, but it's a combination of 1) lack of maturity 2) not the right tool for the job."

"Now imagine the data grid failing when write-behind is way behind :("

If you had ever used a data grid, you'd know that a data grid failure doesn't lose the state of the log to the database, as the log is synchronously replicated to a backup node, which will continue the sync to the database.

"I rather go with write to disks solution every day."

Good luck! One thing you'll notice is that, to get the performance you expect, a write to disk isn't actually synced to disk. Reliability is guaranteed through replication to another backup node, in just the same way as with a data grid.

March 22, 2012 | Registered CommenterNati Shalom
