MySQL Inner Architecture

Rajput Gaurav Singh
8 min readJul 26, 2023

MySQL is world’s most popular open source database software.It is RDBMS(Relational Database Management System). It run a server, provide multi-user access to multiple database.

It is fast, scalable, and simple to use. MySQL is compatible with a variety of operating systems, including Windows, Linux, and macOS. MySQL is a Structured Query Language (SQL) that allows you to manipulate, manage, and extract data using various Queries. It is supported by Oracle Company.

Client/Server Model

MySQL is based on Client/Server architecture where server is central program that manages database content and client programs/utilities connect to server to retrieve or modify data.It also includes non-client utility program and scripts.

Client-Server Architecture of MySQL

With the support of the Client Layer, the Client sends request instructions to the Serve. The client submits a request using valid MYSQL commands and expressions via Command Prompt or GUI screen.

General RDBMS Architecture

It is viewed as Gartan and shaw layered Architecture at the highest level of abstraction.

It has 3 main components :-

  1. Application Layer
  2. Logical Layer
  3. Physical Layer

Application Layer: Users and Clients interact with MySQL in this layer.It consists of 3 components:-

  1. Administrator Utilities/Program : Administrators use various administrative interface and utilities like mysqladmin which performs tasks like shutting down the server and creating or dropping databases, mysqldump for backing up the database or copying databases to another server.

2. Client : Clients communicate with MySQL through various interfaces and utilities like MySQL API’s. The MySQL API sends the query to the server as a series of tokens.

3. Query User : The query users interact with MySQL RDBMS through a query interface that is MySQL.

Logical Layer: The logical layer of MySQL architecture is divided into various subsystems.

  1. Query Processor.

2. Transaction Management.

3. Recovery Management.

4. Storage Management.

These subsystems worked together to processed the requests issued to the MySQL database Server.

Logical Layer

MySQL’s Logical Architecture

MySQL is very different from other database servers, and its architectural characteristics make it useful for a wide range of purposes. MySQL is not perfect, but it is flexible enough to work well in very demanding environments, such as web applications, data warehouses, content indexing, highly available redundant systems, online transaction processing (OLTP), and much more.

Execution Path of the Query.

What happens when you send MySQL a query?

  1. The client sends the SQL statement to the server.
  2. The server checks the query cache. If there’s a hit, it returns the stored result from the cache; otherwise, it passes the SQL statement to the next step.
  3. The server parses, preprocesses, and optimizes the SQL into a query execution plan.
  4. The query execution engine executes the plan by making calls to the storage engine API.

The following are the different sub-components of the MYSQL server:

  1. Thread Handling: When a client sends a request to the server, the server accepts it and connects the client. When a client connects to the server at that time, the link is given its own thread. The thread management of the Server Layer provides this thread. The Thread Handling module also takes care of client-side queries that are run by the thread.

2. Parser and Preprocessor: MySQL server needs to understand what the query is trying to do. Is it trying to read some data, update data, or delete data?

Once a query is received, it first needs to to be parsed, which involves translating it from what is essentially a textual format into a combination of internal binary structures that can be easily manipulated by the optimizer.After the data is accessible in smaller components, the parser performs Syntax and Semantics analysis and then generates a parse tree as an output.

Parse Tree after parsing SQL Query.

3. Query Optimizer: Before MySQL executes the query, it determines how to fulfill the query, i.e. what is the best approach.

Let’s look at a simple SQL query:

SELECT name FROM student_table WHERE student_id = 1;

Assume the student_table has 200k employee records. There are at least two approaches (or two plans using formal terminology):

  • Plan 1: scan all the names in the name column, for each name, check if the student_id is 1, and only return the name if its student_id = 1.
  • Plan 2: find the record with student_id = 1 using the primary index, and return the name.

Approach 2 is almost always faster. And optimizer decides to use approach 2. The next step is actually execute the plan.

MySQL uses a cost-based optimizer, which means it tries to predict the cost of various execution plans and choose the least expensive. The unit of cost was originally a single random 4 KB data page read, but it has become more sophisticated and now includes factors such as the estimated cost of executing a WHERE clause comparison.

Note: The optimizer might not always choose the best plan, for many reasons:

• The statistics could be wrong. The server relies on storage engines to provide statistics, and they can range from exactly correct to wildly inaccurate. For example, the InnoDB storage engine doesn’t maintain accurate statistics about the number of rows in a table because of its MVCC architecture.
• The cost metric is not exactly equivalent to the true cost of running the query, so even when the statistics are accurate, the query might be more or less expensive than MySQL’s approximation. A plan that reads more pages might actually be cheaper in some cases, such as when the reads are sequential so the disk I/O is faster, or when the pages are already cached in memory. MySQL also doesn’t understand which pages are in memory and which pages are on disk, so it doesn’t really know how much I/O the query will cause.
• MySQL’s idea of “optimal” might not match yours. You probably want the fastest execution time, but MySQL doesn’t really try to make queries fast; it tries to minimize their cost, and as we’ve seen, determining cost is not an exact science.

•MySQL doesn’t consider other queries that are running concurrently, which can affect how quickly the query runs.
• MySQL doesn’t always do cost-based optimization. Sometimes it just follows the rules, such as “if there’s a full-text MATCH() clause, use a FULLTEXT index if one exists.” It will do this even when it would be faster to use a different index and a non-FULLTEXT query with a WHERE clause.
• The optimizer doesn’t take into account the cost of operations not under its control, such as executing stored functions or user-defined functions.
Concluding the optimizer — can’t always estimate every possible execution plan, so it might miss an optimal plan.

4. Query Exection Engine: MySQL simply follows the instructions given in the query execution plan. To execute the query, the server just repeats the instructions until there are no more rows to examine. Query execution engine communicates with storage engine through API call’s. Functions performed by the query execution are:

- It acts as a dispatcher for all commands in the execution plan.

  • It iterates through all the commands in the plan until the batch is complete And it interacts with the storage engine to retrieve and update data from tables and indexes.

Transaction Management:

  1. It facillitates concurrent data access
  2. Provides locking facility
  3. Ensure multiple users/sessions access data simultaneously in a consistent way.
  4. Prevents data corruption or data damage.
  5. Lock manager is the sub component name that handles locking.
  6. It does so through the aid of the log manager and the concurrency-control manager. It is also responsible for issuing the COMMIT and the ROLLBACK SQL commands.

Recovery Management:

Log Manager: Logs every operation executed in the database. It stores the operation logs as MySQL commands. In case of system crash, executing these commands will bring back the database to its last stable state.

Recovery Manager: It Responsible for recovering the database to its last stable state. Uses the logs creayted by the log manager

Memory Management

Storage Management has buffer manager which allocates memory resources. It accepts the request from the execution engine, requests the details from buffer manager, receives references to data with memory from buffer manager. Returns this data to the upper layer.

SQL Server performs some operations in memory (In-Memory) for performance.

There are 2 types of memory management in SQL Server:-

Dynamic Memory Management: SQL Server does not specify any upper limit, and SQL Server occupies and uses as much memory space as it can use. however, when any other application needs memory, the partition occupied by SQL Server is allocated to that application. The main disadvantage of dynamic memory management is that SQL Server is busy during the allocation of memory to different applications.

Static Memory Management: The minimum and maximum amount of memory that SQL Server can use is determined and does not go beyond that. SQL Server does not exceed the specified maximum value, and the remaining amount of memory is available for other applications.

Now, let’s see a sample table about SQL Server Memory Configuration of a 64-bit machine. However, this sample data is only required for the SQL Server Database Engine. Additional memory will be required for additional applications, such as SQL Server Agent, Integration Services.

To read more you can refer: https://learn.microsoft.com/en-us/sql/relational-databases/memory-management-architecture-guide?view=sql-server-ver16

Physical Layer

This Storage Engine Layer(Physical Layer) of MySQL Architecture make it’s unique and most preferable for developer’s. Due to this Layer, MySQL layer is counted as the mostly used RDBMS and is widely used. In MySQL server, for different situations and requirement’s different types of storage engines are used which are InnoDB ,MyISAM , NDB ,Memory etc. These storage engines are used as pluggable storage engineer where tables created by user are plugged with them.

MySQL stores each database (also called a schema) as a subdirectory of its data directory in the underlying filesystem. Every database has a corresponding data directory. When you create a table, MySQL stores the table definition in a .frm file with the same name as the table. Thus, when you create a table named Orders, MySQL stores the table definition in Orders.frm. These .frm files does not store data but only have the format that contains the description of the table structure.

Thus, when you create a table named MyTable, MySQL stores the table definition in MyTable.frm. Because MySQL uses the filesystem to store database names and table definitions, case sensitivity depends on the platform. You can use the SHOW TABLE STATUS command to display information about tables.

mysql>SHOW ENGINES;

Show engines command will list all the storage engines supported by your server. I hope this post will give you an overall understanding of the MySQL architecture. In next post let us discuss about the different storage engines and their features.

and The output of the executed query is returned to the caller.

Thank you and keep supporting :)

--

--