How Do SQL Database Engines Work?

 



INTRODUCTION

SQL (Structured Query Language) is a language used to communicate with relational databases.

Wait, what is this relational database that we are talking about?

 relational databasea collection of data items with pre-defined relationships between them.

The objective of SQL is to manage large amounts of data, especially if there’s lots of data being written simultaneously and too many data transactions.

The data management begins when the SQL client communicates with a database — and thus comes the ability to create/drop/collect/store data, update or delete that data, extract that data, and manage user permissions to that data. This system is called the relational database management system.

SQL itself is a “spec” — a generalized language that supports the basics. 

Its a cool definition, but how SQL works really?

There are some steps to understand the process of a SQL work:

1.Where does the data come from?

There are different methodologies for collecting and organizing data, but the fundamentals are the same. Data comes to a server and a data storage system such as from an Apache or Nginx platform can then process that data into tables and store it in a data warehouse server for SQL to use. 

This is usually done by converting the data to a format that the database can use (such as a JSON file) because we don’t want to write directly to the database itself. Do you remember (serialization/deserialization)?

The database warehouse is the core database engine that allows a SQL client to connect and communicate with it. To retrieve data, the database warehouse forwards the SQL request to an application server.

The application server then processes it and sends it back to a web server which turns it into presentable content to the user — SQL data tables.

Relation between SQL and client or user.

2. The SQL Database Engine — Stages of Compiling

Let’s now take a closer look at how that database engine works. We now know that the data warehouse is the RDBMS that allows a SQL client to communicate with it. 

Wait again, what´s a RDBMS?

The software used to store, manage, query, and retrieve data stored in a relational database is called a relational database management system (RDBMS). The RDBMS provides an interface between users and applications and the database, as well as administrative functions for managing data storage, access, and performance.

SQL has its own stages for how a query compiles and executes. The taxonomy for these stages of processing in SQL varies depending on the client but it’s generally called a SQL driver or SQL statement

As Microsoft says,  there are some steps to process the SQL statement:

1. Compiling (Parsing): Tokenizes the statement into individual words with valid syntax and clauses.

2. Compiling (Checks semantics): Validates the statement by checking the SQL statement against the system’s catalog and seeing it these databases, tables, and columns that the user wants to exist and if the user has privileges to execute the SQL query.

3. Compiling (Binding): Generates a query plan for the statement which is the binary representation of the steps required to carry out the statement. In almost all SQL server engines, it will be in byte code. What has now been compiled is a command line shell — a program that reads SQL statements and now sends them to the database server for optimization and execution.

4. Optimizing: Optimizes the query plan and chooses the best algorithms such as for searching and sorting. This feature is called the Query Optimizer or Relational Engine. Once this is done, we now have a prepared SQL statement.

5. Executing: The RDBMS executes the SQL statement by running the query plan.

From: https://docs.microsoft.com/en-us/sql/odbc/reference/processing-a-sql-statement?view=sql-server-2017

3. The SQL Database Engine — Storage Engine, Relational Engine, and Execution Engine

The Storage Engine and the Query Optimizer (Relational Engine) are essentially the two core components of a SQL Database Engine.

The SQL Storage Engine is a software module used to create, read, and update data between the disk and memory while still maintaining data integrity (rollback journals and write-ahead logs).

You can use this command to find out what storage engines are available:

mysql> SHOW ENGINES\G

A cool feature of SQL is that all SQL clients have a SQL Query Optimizer.

This optimizer chooses the algorithm to use depending on the query that was written. It’s similar to choosing an algorithm in programming languages based on time complexity, but instead the Query Optimizer chooses an algorithm by making an access plan / query plan as a cost-based optimizer.

Once it chooses the best plan, it sends that query plan over to the Execution Engine to run the SQL statement.

4. Data -> Tables

SQL is usually written from a low level programming language such as C. Because of this, the SQL database engine can turn incoming data into tables of columns and records. It does this with B-treesa type of binary search tree with several branches that keeps pointing to new data elements. These B-tree structures are what allows the data to be turned into a table of columns with keys being used as “pointers” to other keys.



From: https://www.red-gate.com/simple-talk/sql/sql-training/the-sql-server-query-optimizer/


From: https://www.youtube.com/watch?v=Z_cX3bzkExE

5. Conclusion

The SQL database engine is a very important tool to have in almost any database environment. It is built with low-level programming languages and has its own stages for compiling and executing. It selects the best algorithms by comparing it to alternatives and creates tables by using binary trees. People can use different platforms all with different code and have them all connected to a SQL database engine. SQL allows for sharing and taking care of the data with a well-defined schema, automatic algorithm selection, automatic index maintenance, and fixing performance problems without have to recode and retest.


Bibliography and credits:

https://www.red-gate.com/simple-talk/databases/sql-server/performance-sql-server/the-sql-server-query-optimizer/






















Comentarios

Entradas populares de este blog

C static libraries

Linux Basics: Static Libraries vs. Dynamic Libraries