In-memory noSQL DBMS Client in Big Data Cluster

This is a guest post by Sergei Sheinin, creator of the 2DX Web UI Database Cluster Framework, a low-latency big data cluster with an in-memory noSQL DBMS web browser client.

When I began working in the field of data management, the disconnect between the rigid structure of relational database tables and the free form of documents managed by end users and their businesses stood out as a technical and managerial hurdle. On the one hand there were the strict definitions of normalized relational database models; on the other, unstructured document formats. Often the users in charge of changing document structures held organizational responsibilities far removed from database modeling or programming. On one occasion I was involved in a project where call center operators made on-the-fly decisions to update a document structure based on phone conversations with customers. Such updates had to be streamed into a relational back-end, wreaking havoc on the database structure and the makeup of table columns.

In seeking a permanent solution I researched the merits of the Entity-Attribute-Value (EAV) database schema and its applications. This technique proved successful in enabling front-end users to modify relational-bound documents by updating the structure described in their metadata. However, applying EAV raised its own issues: accommodating updated document metadata at times required changes to the definitions of the relational tables; the complexity of the application layer in client-server interoperability demanded developer attention; fact tables grew rapidly; and select queries with multiple joins performed poorly.
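To make the EAV idea concrete, here is a minimal sketch in JavaScript. It is not the 2DX schema: the `rows` array, the function names and the sample call record are all hypothetical, and an in-memory array stands in for the fact table.

```javascript
// Hypothetical EAV store: every fact is one (entity, attribute, value) row.
const rows = [];

// Insert or update a single attribute of an entity.
function setAttribute(entity, attribute, value) {
  const existing = rows.find(
    r => r.entity === entity && r.attribute === attribute
  );
  if (existing) existing.value = value;
  else rows.push({ entity, attribute, value });
}

// Reassemble a document by pivoting its rows back into one object.
function loadDocument(entity) {
  const doc = {};
  for (const r of rows) {
    if (r.entity === entity) doc[r.attribute] = r.value;
  }
  return doc;
}

setAttribute("call-1001", "customer", "Acme");
setAttribute("call-1001", "status", "open");
// An operator adds a new field on the fly: no table column is added.
setAttribute("call-1001", "callbackRequested", true);
```

Calling `loadDocument("call-1001")` returns the three attributes as one object. The sketch also hints at the costs listed above: every new attribute adds a row to the fact table, and reading a document back means pivoting many rows, which in SQL becomes a chain of joins.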

I set out to address these problems by devising an EAV schema that accommodates not only the master data but also the metadata of document descriptions in a combined database schema, while precluding storage of duplicate and null values. For example, when there is a database table column "A", a variable "A" in application code and a value "A" in a table row, only one "A" is stored in the relational database. The resulting schema went further by enabling an object-oriented interpreter within the relational database (OORDBMS) that allows clients to update a document's structure by submitting a metadata descriptor as an input parameter. When fully developed, the technique of submitting document structure metadata to the OORDBMS interpreter is intended to facilitate development of pattern recognition, predictive querying and data visualization applications.
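The "only one A" rule can be pictured as a shared value dictionary. The sketch below is an illustration under that assumption, not the actual 2DX schema; `ValueDictionary` and its methods are hypothetical names.

```javascript
// Hypothetical value dictionary: each distinct value is stored once and
// referenced by a numeric id wherever it is used.
class ValueDictionary {
  constructor() {
    this.idByValue = new Map(); // value -> id
    this.values = [];           // id -> value
  }

  // Return the id of a value, storing the value only if it is new.
  intern(value) {
    if (!this.idByValue.has(value)) {
      this.idByValue.set(value, this.values.length);
      this.values.push(value);
    }
    return this.idByValue.get(value);
  }

  lookup(id) {
    return this.values[id];
  }
}

const dictionary = new ValueDictionary();
const columnNameId = dictionary.intern("A"); // "A" used as metadata
const cellValueId = dictionary.intern("A");  // "A" used as stored data
const otherId = dictionary.intern("B");
```

Here `columnNameId` and `cellValueId` are the same id, so the string "A" exists once no matter how many roles it plays. Null values need no entry at all, because an absent attribute is simply a row that was never written.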

To interface with this back-end OORDBMS, a structure-agnostic in-memory noSQL DBMS was introduced, forming a complete 2-tier client-server platform. The noSQL DBMS is a strongly typed database that maps the data types supported by the relational database to the variable types available in the web browser's JavaScript engine. Direct correlation of client and server data types brings benefits such as elimination of type mismatch errors, prevention of SQL injection, and low-latency commits to master tables, where the I/O cost of index recalculation would otherwise be significant enough to cause delays stemming from write locks.
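A client-side type check of this kind might look like the following sketch. The type names, the `typeChecks` table and the `validate` function are assumptions made for illustration; the source does not list the types 2DX supports.

```javascript
// Hypothetical mapping from relational column types to JavaScript checks.
const typeChecks = {
  INTEGER: v => Number.isInteger(v),
  DECIMAL: v => typeof v === "number" && Number.isFinite(v),
  VARCHAR: v => typeof v === "string",
  BOOLEAN: v => typeof v === "boolean",
  TIMESTAMP: v => v instanceof Date && !Number.isNaN(v.getTime()),
};

// Return a list of type errors for a record; an empty list means it passes.
function validate(schema, record) {
  const errors = [];
  for (const [field, type] of Object.entries(schema)) {
    const check = typeChecks[type];
    if (!check) errors.push(`${field}: unknown type ${type}`);
    else if (!(field in record)) errors.push(`${field}: missing`);
    else if (!check(record[field])) errors.push(`${field}: expected ${type}`);
  }
  return errors;
}

const orderSchema = { id: "INTEGER", total: "DECIMAL", note: "VARCHAR" };
```

With this schema, a record whose `id` is the string `"7; DROP TABLE orders"` is rejected before it leaves the browser, because the field is declared INTEGER. String fields still have to reach the database as bound parameters, since a type check alone says nothing about their content.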

The in-memory client DBMS is a lightweight JavaScript singleton object designed for storage, management and querying of document-oriented data sets, which are indexed to reduce the time complexity of operations. The JS object is integrated with high-performance DOM tree rendering for instant display of memory content on HTML web pages, and it supplies an API for popular JavaScript frameworks such as Angular or jQuery through respective drivers. Support for third-party frameworks extends the object’s UI functionality to numerous widely used web development tools. Its HTML rendering functionality is a memory-persisted Single Page Application environment capable of displaying web pages with over two million DOM elements; keeping the web page content in memory accounts for its rendering speed.
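The general shape of an indexed in-memory singleton can be sketched as follows. This is not the 2DX client API: `DocumentStore`, `createIndex`, `insert` and `find` are hypothetical names, and DOM rendering is left out.

```javascript
// Hypothetical in-memory document store exposed as one shared object.
const DocumentStore = (() => {
  const documents = new Map(); // id -> document
  const indexes = new Map();   // field name -> Map(value -> Set of ids)

  function addToIndex(index, value, id) {
    if (!index.has(value)) index.set(value, new Set());
    index.get(value).add(id);
  }

  return {
    // Build an index over one field, covering documents already stored.
    createIndex(field) {
      const index = new Map();
      for (const [id, doc] of documents) addToIndex(index, doc[field], id);
      indexes.set(field, index);
    },

    insert(id, doc) {
      if (documents.has(id)) throw new Error(`duplicate id: ${id}`);
      documents.set(id, doc);
      for (const [field, index] of indexes) addToIndex(index, doc[field], id);
    },

    // Use the index when one exists; otherwise fall back to a full scan.
    find(field, value) {
      const index = indexes.get(field);
      if (index) {
        const ids = index.get(value) || new Set();
        return [...ids].map(id => documents.get(id));
      }
      return [...documents.values()].filter(doc => doc[field] === value);
    },
  };
})();

DocumentStore.createIndex("city");
DocumentStore.insert(1, { name: "Ann", city: "Oslo" });
DocumentStore.insert(2, { name: "Bob", city: "Rome" });
DocumentStore.insert(3, { name: "Cy", city: "Oslo" });
```

A query on the indexed field, such as `DocumentStore.find("city", "Oslo")`, goes straight to the matching ids, while a query on `name` has to visit every document. That difference is what indexing for lower complexity refers to.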

The 2DX client and server follow a database model designed to accommodate document-oriented data sets in normal form, developed by reducing relational data models with respect to the format of JavaScript objects. Sharing a unified model across all client instances as well as the back-end server makes the recording of client-initiated data updates more transparent: updates are represented as time-stamped transaction log records in the format of serialized strings. 2DX enables rewind and replay of transactions for analysis of data changes taking place across the cluster during specific time intervals of its operation. The client instances may be regarded as transaction log viewers, with each instance assigned its slice of the cumulative transaction volume according to log-in credentials and application settings.
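Rewind and replay over a log of serialized, time-stamped records can be sketched as below. The record layout (`at`, `op`, `id`, `field`, `value`) and the use of JSON as the serialization are assumptions; the source does not specify the 2DX log format.

```javascript
// Hypothetical transaction log: each entry is a serialized, time-stamped string.
const log = [];

function record(at, op, id, field, value) {
  log.push(JSON.stringify({ at, op, id, field, value }));
}

// Rebuild the state as of a given time by replaying records in order.
function replay(upTo) {
  const state = {};
  for (const line of log) {
    const entry = JSON.parse(line);
    if (entry.at > upTo) break; // records are appended in time order
    if (entry.op === "set") {
      state[entry.id] = {
        ...(state[entry.id] || {}),
        [entry.field]: entry.value,
      };
    } else if (entry.op === "delete") {
      delete state[entry.id];
    }
  }
  return state;
}

record(100, "set", "doc1", "status", "open");
record(200, "set", "doc1", "owner", "ann");
record(300, "set", "doc1", "status", "closed");
record(400, "delete", "doc1");
```

Replaying up to time 250 shows `doc1` as open and owned by `ann`; replaying up to 300 shows it closed; replaying up to 400 shows it gone. A client acting as a log viewer would apply the same loop to the slice of records it is allowed to see.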

The 2DX server is an RDBMS hosting an object-oriented interpreter implemented with stored procedures. Its purpose is to convert serialized data to and from normalized form and to execute application logic. The server introduces a lazy transaction processing scheme in which write requests undergo a series of validity checks that consume few system resources. Once a pending write request is validated, it is committed to a write buffer, which is emptied after the updates are transferred to the master tables in an asynchronous batch. The cluster mirrors all data updates to a noSQL server for faster update validation and data retrieval. The updates are made visible to the rest of the cluster instances immediately following the commit to the write buffer, eliminating the delay of recalculating master table indexes.
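The lazy write path described above can be illustrated in JavaScript, even though the actual server is built from stored procedures. The class `LazyWriter`, its methods and the two sample validators are hypothetical, and a `Map` stands in for the indexed master tables.

```javascript
// Hypothetical lazy write path: validate cheaply, buffer, flush in batches.
class LazyWriter {
  constructor(validators) {
    this.validators = validators; // cheap checks run before buffering
    this.buffer = [];             // validated writes awaiting the batch
    this.master = new Map();      // stands in for the master tables
  }

  // Run every validity check; buffer the request only if all pass.
  write(request) {
    for (const check of this.validators) {
      const error = check(request);
      if (error) return { ok: false, error };
    }
    this.buffer.push(request);
    return { ok: true };
  }

  // Readers consult the buffer first, so updates are visible before flush.
  read(key) {
    for (let i = this.buffer.length - 1; i >= 0; i--) {
      if (this.buffer[i].key === key) return this.buffer[i].value;
    }
    return this.master.get(key);
  }

  // Move all buffered writes to the master store and empty the buffer.
  flush() {
    const batch = this.buffer;
    this.buffer = [];
    for (const { key, value } of batch) this.master.set(key, value);
    return batch.length;
  }
}

const writer = new LazyWriter([
  r => (typeof r.key === "string" && r.key !== "" ? null : "key must be a non-empty string"),
  r => (r.value !== undefined ? null : "value is required"),
]);
```

After `writer.write({ key: "a", value: 1 })`, a call to `writer.read("a")` already returns 1 even though the master store is untouched until `flush()` runs. In a real deployment the flush would be triggered asynchronously, for example on a timer or when the buffer reaches a size threshold, rather than called by hand.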