Database

As a DBA Expert, which database would you choose?

High Scalability

22 Jan 2015 — 4 min read

This is a guest post by Jenny Richards, a professional database administrator who is currently employed at Remote DBA.

In the world of databases, there is no single silver bullet fitting for every gun. How you select the database to use is very dependent on every other factor of your work:

Who are you and what do you do?
What is your end goal – what are you working to achieve?
How much data do you intend to store?
On what language and OS platforms do your applications run?
What is your budget?
Will you also require data warehousing, decision support systems and/or BI?

Background information

The point is this, before you can answer the question of which database provider would be the best, you have to have answers to a dozen more questions, among them the ones listed above. If you have tried to implement a database solution and sought guidance from a professional, you have probably heard him or her ask a million questions before offering you any concrete advice.

On the flip side, database professionals such as developers and administrators know that at many times, clients just come looking for database solutions. They have no clue as to what a relational database is, why they need it or the differences between RDBMSs and normal Access databases or spreadsheets.

This is the 21st century and business Intelligence (BI) is here to stay. In fact, some of the most flexible and readily maintainable yet complex systems are data-driven. The aim of this article is therefore to highlight some of the most popular RDBMS platforms available and to shed some light for the less knowledgeable regarding this messy and complex world that is IT.

Comparing database systems is no different from comparing OS and/or developmental platforms e.g. Windows to UNIX/Linux or Java to VB.NET or C#.NET/.

For databases that are not pivotal in the critical operations of front office systems and applications e.g. order entry, telecommunication systems, banking etc., any RDBMS that can support a distributed system would suit perfectly. These include organizations or individuals that need data storage centers as backups to their regular files systems or after data manipulation and configuration for reference.

Obviously, each RDBMS provides its unique features – different SQL language flavors and expansions, even though you will find that nearly all of them claim to be in adherence to the ANSI SQL-92 standards.

Top-performance/ mission-critical systems

On the other hand, where the database will be necessary and even critical to the organization’s main operations i.e. forms part of the business-critical or mission-critical system, then everything else an RDBMS can offer becomes fair game for consideration. You will want the highest level of availability, data volume and performance scalability, no down-times, point-in-time recovery option, high reliability, high speeds with low latency and replication features for reporting and disaster recovery.

Obviously, such RDBMSs will require a large amount of resources, as they are costlier, increasing in cost with an improvement in performance and functionality of the various features mentioned above. This will be the focus of this article, with discussion of a few features below. If you still have any questions, contact your remote DBA guy.

HA Clustering

Considered in terms of clustering ability, the various RDBMSs features are as listed below:

Oracle RAC has the largest HA clustering capability with both scalability and load balancing ability.
The SQL server Cluster from Microsoft offers HA failover without load balancing ability.
ASE 15 Cluster from Sybase has capabilities similar to Oracle RAC.
TeraData is a distributed/networked database platform.
MySQL cluster stands superior to Oracle RAC in terms of TPC-C by performance benchmarks.
DB2/UDB from IBM had good operational capability on the pSeries from IBM or a Power Server with AIX HACMP or VCS/Sun Cluster by VERITAS.

Low latency, In-memory high performance databases

Oracle – comes with TimesTen in-memory DB
MySQL – has a cluster than runs an in-memory storage engine
ASE by Sybase also uses in-memory DB
RAMSAN/SSD acts as storage of MS SQL DBs. In this case, disk-related I/O operations will occur on the solid state disk (SSD) instead of the conventional physical disks. In so doing, it drastically reduces I/O disk latency, increasing I/O output and subsequently improving DB performance.
SQLite and eXtreme DB are other examples of in-memory DBs, two of many, but most of the remaining can support embedded systems only or can only handle one connection/user at a time.

Scalability and performance

ASE 15 Cluster and Oracle RAC both allow for addition of bigger space to the cluster where existing hardware is insufficient to meet the performance demands. All other nodes gradually get replacement, thereby safeguarding investment made towards the existing system.
MS SQL Clusters have no scalability on the instance level. This is because these clusters only support instances related to Active/Passive modes. The entire cluster (alternatively pair by pair) would therefore need replacement where the existent hardware no longer suffices.
The MySQL Cluster also allows for addition of new nodes to the cluster while the existent nodes remain active. However, it is a no-share cluster, which means that additional nodes translate to additional DB copies, more storage and a larger volume of data for replication throughout existent nodes.

System hybridization/mixing

On average database belonging to back-office and middle office operations in an organization make up storage sites for huge volumes of data. Any performance issue arising will only be detectable where data retrieval is necessary for whatever reason – sorting, grouping, making summaries or calculations. Such processes would run slower where there is a performance issue, having a direct effect on the process of data writing.

Consequently, such applications would require DBs that have seamless handling of large data volumes, as well as capability to support reader-writer blocking in an efficient manner.

Oracle - these DBs are the best since writers and readers do not block one another in Oracle. The reader does not retrieve a dirty read, rather, consistency is upheld through the entire session/transaction/process.
MS SQL – the 2005 version applies row versioning to achieve almost the same functionality as Oracle, but incurs more I/O in the tempdb.
Sybase ASE – the reader and writer still block each other, but this issue has been resolved on the Sybase IQ through application of snapshot versioning.
Other databases – there may be some databases out there without the blocking challenge, but most may not be ACID (atomicity, consistency, isolation, and durability) compliant.