KeySQL Server

The key to Big Data productivity.


In 1970, E.F. Codd published an 11-page paper introducing the relational model of data, which brought about a multibillion-dollar industry of SQL databases. This happened because SQL (“Structured Query Language”) created a disruptive jump in data management productivity. With pre-relational systems, business users needed programmers to write programs for data manipulation and for even simple business reports. With SQL, thousands of business analysts and other non-programmers were empowered to assemble data and create reports and dashboards on their own.

The main advantage of the relational model underlying SQL is what Codd called a “data structure of Spartan simplicity”. It forces data into the confines of simple tables and employs the operations of relational algebra to produce new tables from the tables stored in a database.

In the Big Data era, the advantage of “Spartan simplicity” turned into a critical shortcoming: businesses started generating massive amounts of richly structured data that SQL databases could not handle efficiently because of the limitations of their data model and scalability. This gave rise to the NoSQL movement, which has produced a multitude of data stores that employ proprietary data models offering greater flexibility than Codd’s relational model for handling the variety of WWW-era data. However, these data stores suffer from a number of drawbacks, including the lack of a common high-level NoSQL language and the lack of a solid mathematical basis and operational closure. Therefore, while they are more scalable and more efficient at supporting business operations on richly structured data, NoSQL data stores are less friendly to non-programmers, including business analysts.

We have now come full circle, returning to the pre-relational age when business users needed the assistance of programmers to perform analytical data processing. In place of “programmers”, businesses now depend on more expensive and difficult-to-hire “data scientists”, who are expected to have the advanced technical skills needed to extract information from complex NoSQL data structures. In any case, where a single business analyst using SQL once sufficed, two people are now often needed: a business analyst and a data scientist proficient in Java, Scala, Python, or another programming language required to speak natively to NoSQL data stores.

The motivation behind the KeySQL project is to enable an order-of-magnitude productivity breakthrough in the handling of WWW-era data, comparable to the breakthrough achieved by Codd’s relational data model and SQL. To accomplish this, the bulk of NoSQL data must be made accessible to non-programmers through a data model that is still Spartan but far more flexible and capacious, and through a next-generation SQL language that is easy to learn yet fully expressive for handling both “flat” relational data and rich NoSQL data.

The white paper explains how these capabilities, including the next level of scalability, are achieved by the KeySQL language and the first build of KeySQL Server.