10 Reasons to Consider a Multi-Model Database
This is a guest post by Nikhil Palekar, Solutions Architect, FoundationDB.
The proliferation of NoSQL databases is a response to the needs of modern applications. Still, not all data can be shoehorned into a particular NoSQL model, which is why so many different database options exist in the market. As a result, organizations are now facing serious database bloat within their infrastructure.
But a new class of database engine has recently emerged that can address the business needs of each of those applications and use cases without requiring the enterprise to maintain separate systems, software licenses, developers, and administrators.
These multi-model databases can provide a single backend that exposes multiple data models to the applications it supports. In that way, multi-model databases eliminate fragmentation and provide a consistent, well-understood backend that supports many different products and applications. The benefits to the organization are extensive; some of the most significant include:
1. Consolidation
In the NoSQL space in particular, engineers face many choices when deciding how to model and store data. Those choices are complicated by the fact that most database management systems tightly couple all the different “levels” of their technology stacks, such as the storage engine, data model, and query language. A multi-model database supports different types of data for different use cases and consolidates them on one platform. So you gain flexibility in the query language and data model but simultaneously benefit from a common storage engine technology. Minimizing the components that have to be maintained at that low level within the backend allows infrastructure to be more commoditized, leading to lower total costs of ownership and increased flexibility as infrastructure needs change.
2. Performance scaling
As the use of an application grows, the need for database performance grows too. But the exact performance needs of an application may change, and with many database systems, a user’s only option is to scale the system “vertically”--using a single, larger machine that has increased performance or capacity. Multi-model systems that decouple the query language and data model from the underlying data store allow different components within the architecture to be scaled independently as needs change. So various parts of the backend system can be scaled out horizontally in response to increased throughput or storage requirements, whether that’s because a new application comes online or an existing application workload changes. And when performance demands decrease, it’s just as simple to scale the backend system down to save on hardware costs and operational effort.
3. Operational complexity
The fragmented environments caused by running different databases increase the complexity of both operations and development. The goal of polyglot persistence is to use the best component for the job, but in practice it means you may end up with multiple databases, each with its own storage and operational requirements. Merely connecting those systems is a difficult operational challenge, and integrating them into a cohesive, larger system that applications can use--especially while maintaining data consistency and fault tolerance--may be nearly impossible.
4. Flexibility
It’s often awkward and inefficient to shoehorn diverse data into a single data model. A multi-model approach instead maps multiple data models onto a single underlying storage engine that can support different use cases and applications. This approach provides flexible data modeling without the complexity of operating multiple data stores.
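To make this concrete, below is a minimal sketch of how several data models can share one key-value store. The store here is just a Python dict standing in for a real storage engine, and the key layouts and helper names (put_document, put_event, put_edge, events_between) are illustrative assumptions rather than any particular product’s API.

```python
import json
import time

# Illustrative stand-in for the shared storage engine: a plain dict of
# tuple keys to encoded values.
kv = {}

def put_document(collection, doc_id, doc):
    # Document model: one key per document, JSON-encoded value.
    kv[("doc", collection, doc_id)] = json.dumps(doc)

def put_event(series, timestamp, event):
    # Time-series model: keys laid out as (series, timestamp), so a time
    # window corresponds to a contiguous key range.
    kv[("ts", series, timestamp)] = json.dumps(event)

def put_edge(src, label, dst):
    # Graph model: one key per edge; the value only marks its presence, and
    # scanning the ("graph", src) prefix enumerates a node's outgoing edges.
    kv[("graph", src, label, dst)] = json.dumps(True)

def events_between(series, start, end):
    # With an ordered storage engine this would be a single range scan;
    # the dict stand-in just filters and sorts the keys.
    keys = sorted(k for k in kv
                  if k[:2] == ("ts", series) and start <= k[2] < end)
    return [json.loads(kv[k]) for k in keys]

# The same underlying store now serves three different models.
put_document("users", "u42", {"name": "Ada", "plan": "pro"})
put_event("logins", time.time(), {"user": "u42"})
put_edge("u42", "follows", "u7")
print(events_between("logins", 0, time.time() + 1))
```

A production storage engine adds ordering, durability, and distribution underneath, but the model-to-key mapping itself stays about this simple.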
5. Reliability
Reliability is also an issue when running multiple databases, since each system can become a single point of failure for the larger system and application. With some systems, recovering from machine failures may require hours of coordination and processing before full application connectivity and functionality can be restored to users and customers. The cost of that downtime, whether expected or unexpected, can be tremendous, both in money and in user engagement and experience with the application.
6. Data consistency
Without higher-level transaction functionality built into your application, there is no support for transactions across different database systems, and consequently no good way to maintain consistency among different models. Suppose your application receives a stream of data on user activity, and you decide to store related data elements in time-series, graph, and document formats. You usually require these components to reflect a consistent state, but without ACID transactions that work across those data models, that requirement can be difficult if not impossible to satisfy. A single backend system that supports multiple data models based on application requirements, however, can achieve this goal.
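As a sketch of how that can look in practice, the snippet below uses FoundationDB’s Python bindings as one example of a transactional key-value backend; the key layout, value encoding, and the record_activity function are illustrative assumptions rather than a prescribed schema, and the code assumes a locally configured cluster.

```python
import json
import time

import fdb

fdb.api_version(300)  # any API version supported by the installed client
db = fdb.open()       # assumes a default, locally configured cluster file

@fdb.transactional
def record_activity(tr, user_id, activity):
    # One ACID transaction writes all three representations of the event:
    # either every key below commits, or none of them does.
    payload = json.dumps(activity).encode()
    ts = int(time.time() * 1000)

    # Time-series representation, keyed by timestamp.
    tr[fdb.tuple.pack(("ts", "activity", ts, user_id))] = payload

    # Document representation: the user's latest activity.
    tr[fdb.tuple.pack(("doc", "users", user_id, "last_activity"))] = payload

    # Graph representation: an edge from the user to the page they touched.
    tr[fdb.tuple.pack(("graph", user_id, "viewed", activity["page"]))] = b""

record_activity(db, "u42", {"page": "/pricing", "action": "view"})
```

Readers of any of the three representations never observe the event half-written, because the writes commit together or not at all.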
7. Fault tolerance
Ensuring that a system with many components of any kind is fault-tolerant is, to say the least, not an easy task. And integrating multiple systems that were designed to run independently so they provide fault tolerance across the system as a whole imposes significant engineering and operational costs. Deployments of heterogeneous systems require that your team have expertise with each component so the overall system keeps running well, and because each system is different and has different requirements, building that expertise is time-consuming and expensive. Even then, the fault tolerance of your whole system depends on the weakest subsystem in the backend.
8. Cost
Using multiple distinct database systems increases the hardware, software, and operational costs associated with each system. Although each tool may have been adopted to solve a specific business problem, in the aggregate, the costs of this piecemeal approach can add up very quickly and will likely only increase over time. Each component requires ongoing maintenance, including the stream of updates, patches, bug fixes, and other software modifications delivered by each vendor. In addition, each change to each component requires the organization to test the new component(s), make any necessary changes to applications and products, and then execute a process to release all these changes into the production environment.
9. Transactions
Relational database systems, typically deployed on a single machine, generally provide strong transactional guarantees for database operations, which allow applications and their developers to know with strong certainty the current state of the database at a given point in time. Providing transactions across multiple machines is much harder, however, and most NoSQL databases do not provide transactional guarantees because of their architectural designs. Because a true multi-model system requires transactions to ensure that data is stored consistently, all your applications inherit this strong contract of how data is stored. Although this benefit may not seem significant, since a single-machine relational database can provide transactions, the benefits of transactions as part of a distributed, fault-tolerant system that you can flexibly scale are tremendous.
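To illustrate what that contract buys you, here is a small sketch of a read-modify-write, again using FoundationDB’s Python bindings with an illustrative key and value encoding. The transactional decorator runs the function body as a single transaction and retries it if a conflicting write occurs, so the balance check below is never made against stale data, even with many concurrent clients across many machines.

```python
import fdb

fdb.api_version(300)
db = fdb.open()

def balance_key(account):
    # Illustrative key layout for this sketch.
    return fdb.tuple.pack(("balance", account))

@fdb.transactional
def set_balance(tr, account, amount):
    tr[balance_key(account)] = str(amount).encode()

@fdb.transactional
def transfer(tr, src, dst, amount):
    # The reads and writes below execute as one atomic unit; on conflict
    # with a concurrent writer the whole function is retried, so no update
    # is lost and the invariant (no negative balances) always holds.
    src_balance = int(tr[balance_key(src)])
    dst_balance = int(tr[balance_key(dst)])
    if src_balance < amount:
        raise ValueError("insufficient funds")
    tr[balance_key(src)] = str(src_balance - amount).encode()
    tr[balance_key(dst)] = str(dst_balance + amount).encode()

set_balance(db, "alice", 100)
set_balance(db, "bob", 0)
transfer(db, "alice", "bob", 25)
```

The same code runs unchanged whether the backend is a single machine or a scaled-out cluster, which is what makes transactions in a distributed, fault-tolerant system so valuable.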
10. Better applications
Trying to run different databases to power an application can be an operational and development nightmare. In contrast, an application supported by a multi-model database gets the benefits of scalability, fault tolerance, and, in a well-engineered system, high performance built into the product. With less extra logic needed at the application level to handle database interactions and potential failure conditions, developers can focus on building better applications. Because of these benefits, multi-model systems are where the database market is heading--ACID-compliant transactions, multi-model APIs, and shared, powerful storage engines that can better meet the requirements of demanding applications.