
【MySQL】What are the uses of common MySQL logs?

2024-07-12


MySQL logs are an important topic that comes up frequently in interviews. Mastering them also helps us understand MySQL's underlying mechanisms and makes it easier to troubleshoot and solve problems when they arise.
Common log types in MySQL are mainly the following (for InnoDB storage engine):

  • Error log: records the startup, operation, and shutdown process of MySQL.
  • Binary log (binlog): mainly records SQL statements that change database data.
  • General query log: records all SQL statements received by the MySQL server from connected clients. Because the volume of SQL is large, it is disabled by default and enabling it is generally not recommended.
  • Slow query log: records queries that take longer than long_query_time seconds to execute. This is used when troubleshooting slow SQL.
  • Transaction logs (redo log and undo log): the redo log records page changes for crash recovery, and the undo log records how to undo changes for transaction rollback.
  • Relay log: Relay log is a log generated during the replication process, which is similar to binary log in many aspects. However, relay log is for slave databases in master-slave replication.
  • DDL log (metadata log): records metadata operations performed by DDL statements.

The binary log (binlog) and the transaction logs (redo log and undo log) are the most important ones and deserve special attention.

1. Slow query log

The slow query log records all statements whose execution time exceeds long_query_time (10 seconds by default; commonly set to 1 second in practice). It is mainly used to troubleshoot slow SQL, i.e. statements that take too long to execute.
Finding the slow SQL is the first step to optimize the performance of SQL statements. Then use the EXPLAIN command to analyze the slow SQL and obtain relevant information about the execution plan.
You can use the show variables like "slow_query_log" command to check whether the slow query log is enabled. It is disabled by default.

You can enable it with SET GLOBAL slow_query_log=ON;

The long_query_time parameter defines how long a query must take to be classified as a slow query. The default value is 10 seconds. You can view it with the SHOW VARIABLES LIKE '%long_query_time%'; command.

You can also modify it: set global long_query_time = 12;
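Putting these together, a minimal session for enabling and tuning the slow query log might look like this (the 1-second threshold is just an illustrative choice):

    -- Check whether the slow query log is enabled (OFF by default)
    SHOW VARIABLES LIKE 'slow_query_log';
    -- Enable it for the running server (not persisted across restarts unless set in my.cnf)
    SET GLOBAL slow_query_log = ON;
    -- Lower the slow query threshold from the default 10 seconds to 1 second
    SET GLOBAL long_query_time = 1;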

In real projects the slow query log can be quite large, which makes it inconvenient to analyze directly. We can use mysqldumpslow, the official MySQL slow query log analysis tool. My blog also has a short introduction to the mysqldumpslow tool:

[MySQL] mysqldumpslow tool - Summarize slow query log files - CSDN blog

1.1 How to query the number of current slow query statements?

In MySQL, a status variable records the number of slow query statements. You can view it with the show global status like '%slow_queries%'; command.

1.2 How to optimize slow queries

MySQL provides the EXPLAIN command to obtain information about a statement's execution plan.
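As a quick sketch, assuming a hypothetical user table with an index on the name column:

    -- Show how MySQL plans to execute the query (table and index are hypothetical)
    EXPLAIN SELECT id, name FROM user WHERE name = 'Alice';

Key columns in the output include type (the access method), possible_keys (candidate indexes), key (the index actually chosen), rows (the estimated number of rows examined), and Extra (additional notes such as "Using index").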
An execution plan describes how MySQL will actually execute a SQL statement after it has been processed by the query optimizer. Execution plans are mainly used for SQL performance analysis and optimization. From the EXPLAIN output you can learn the join order of the tables, the type of each data access operation, which indexes could be used, which index is actually used, and an estimate of how many rows will be examined in each table. Specifically, we can optimize SQL through the following common methods:

1. Avoid using SELECT *

  • SELECT * consumes more CPU.
  • Useless columns increase network bandwidth consumption and data transfer time, especially for large fields (such as varchar, blob, and text).
  • SELECT * cannot use the MySQL optimizer's covering index optimization (a covering index answers a query entirely from the index, which is extremely fast and a widely recommended optimization technique).
  • SELECT <field list> reduces the impact of table structure changes.
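For example, assuming a hypothetical user table with a composite index on (name, age), listing only the needed columns lets the index cover the query:

    -- Reads the full row for every match; cannot be satisfied from the index alone
    SELECT * FROM user WHERE name = 'Alice';
    -- Can be answered entirely from the (name, age) index: a covering index
    SELECT name, age FROM user WHERE name = 'Alice';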

2. Paging optimization

Ordinary paging takes a relatively short time when the amount of data is small.

If the table grows to millions or even tens of millions of rows, ordinary paging becomes very slow.

How can this be optimized? The paging SQL can be rewritten as a subquery.

We first query the primary key value corresponding to the first parameter of LIMIT, and then filter and limit based on that primary key value, which is much more efficient; see the sketch below. Note, however, that this approach only works when the ids are in ascending order.
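A sketch of the rewrite, assuming a user table with an auto-increment primary key id:

    -- Ordinary deep paging: MySQL scans and discards 1,000,000 rows first
    SELECT * FROM user ORDER BY id LIMIT 1000000, 10;
    -- Subquery rewrite: locate the starting id through the primary key index first
    SELECT * FROM user
    WHERE id >= (SELECT id FROM user ORDER BY id LIMIT 1000000, 1)
    ORDER BY id
    LIMIT 10;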

However, the subquery produces a derived temporary table, which has a cost of its own, so heavy use of subqueries should be avoided. In addition, this method only applies when the ids are in ascending order. In complex paging scenarios, the qualifying ids are usually obtained through filter conditions first, and such ids are discrete and discontinuous.

3. Do fewer joins

The Alibaba Java Development Manual recommends limiting the number of joined tables (it forbids joining more than three tables).

You can take a look at the discussion on Zhihu:

https://www.zhihu.com/question/68258877

4. It is recommended not to use foreign keys and cascades

The Alibaba Java Development Manual also requires that foreign keys and cascades not be used; foreign-key relationships should be handled at the application layer.

5. Choose the appropriate field type

6. Try to use UNION ALL instead of UNION

  • UNION puts the data from both result sets into a temporary table and then deduplicates it, which takes more time and consumes more CPU.
  • UNION ALL skips deduplication, so the result set may contain duplicate rows.
  • If the business scenario does not allow duplicate data, UNION still has to be used.
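A minimal illustration (the table names are hypothetical):

    -- UNION deduplicates through a temporary table, costing extra time and CPU
    SELECT user_id FROM orders_2023
    UNION
    SELECT user_id FROM orders_2024;
    -- UNION ALL simply concatenates the result sets, duplicates included
    SELECT user_id FROM orders_2023
    UNION ALL
    SELECT user_id FROM orders_2024;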

7. Batch Operations

When updating data in a database, use batch operations if possible to reduce the number of database requests and improve performance.
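For example, a single multi-row INSERT replaces several single-row round trips (the table and column are hypothetical):

    -- Three separate requests: three round trips, parses, and commits
    INSERT INTO user (name) VALUES ('a');
    INSERT INTO user (name) VALUES ('b');
    INSERT INTO user (name) VALUES ('c');
    -- One batched request does the same work in a single round trip
    INSERT INTO user (name) VALUES ('a'), ('b'), ('c');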

8. Use indexes correctly

There is a lot to say about indexes; they will be covered in a separate blog post later.

2. Binary log (binlog)

The binlog (binary log) records all operations that change the MySQL database, i.e. all DDL and DML statements executed by the database, including table structure changes (CREATE, ALTER, DROP TABLE, etc.) and table data modifications (INSERT, UPDATE, DELETE, etc.). It does not include statements such as SELECT and SHOW that do not change the database.

You can use the show binary logs; command to view the list of all binary log files.

2.1 Binlog Format

Binlog supports three recording formats:

  • Statement mode: Every SQL statement that modifies data will be recorded in the binlog, such as inserts, updates, and deletes.
  • Row mode (recommended): The specific change events of each row will be recorded in the binlog.
  • Mixed mode: A mixture of Statement mode and Row mode. Statement mode is used by default, and it automatically switches to Row mode in a few special scenarios.

Compared with Row mode, Statement mode produces smaller log files, puts less pressure on disk I/O, and performs better. However, it is less accurate than Row mode, because some statements do not replay deterministically.

Before MySQL 5.1.5, the binlog supported only the STATEMENT format; ROW format support was added in 5.1.5, and MIXED format support in 5.1.8. Before MySQL 5.7.7, Statement mode was the default; starting with MySQL 5.7.7, Row mode is the default.

You can use show variables like '%binlog_format%'; to view the format binlog is using.
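For example:

    -- Check the current binlog format (ROW by default since MySQL 5.7.7)
    SHOW VARIABLES LIKE '%binlog_format%';
    -- Switch the format for the current session (requires sufficient privileges)
    SET SESSION binlog_format = 'ROW';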

2.2 Binlog Function

The main application scenario of binlog is master-slave replication. All replication topologies (master-slave, master-master, one master with multiple slaves, and so on) rely on binlog to synchronize data and keep it consistent.

The master-slave replication process works as follows:

1. The master writes data changes to its binlog.
2. The slave connects to the master.
3. The slave creates an I/O thread to request the updated binlog from the master.
4. The master creates a binlog dump thread to send binlog events, and the slave's I/O thread receives them.
5. The slave's I/O thread writes the received binlog events to the relay log.
6. The slave's SQL thread reads the relay log and replays the statements locally to synchronize the data.
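As a rough sketch, pointing a slave at a master typically involves commands like the following (the host, user, credentials, and binlog coordinates are placeholders; MySQL 8.0.22+ prefers the CHANGE REPLICATION SOURCE TO / START REPLICA syntax):

    -- On the slave: tell it where to read the master's binlog from
    CHANGE MASTER TO
      MASTER_HOST = '192.168.1.100',
      MASTER_USER = 'repl',
      MASTER_PASSWORD = '***',
      MASTER_LOG_FILE = 'binlog.000001',
      MASTER_LOG_POS = 4;
    -- Start the I/O thread and the SQL thread
    START SLAVE;
    -- Inspect the state of both replication threads
    SHOW SLAVE STATUS\G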

2.3 When is the binlog flushed to disk?

For the InnoDB storage engine, the binlog produced during a transaction is first written to the binlog cache; only when the transaction commits is the content of the binlog cache persisted to the binlog file on disk. Writing to memory first is faster and is done for efficiency.

Because the binlog of a transaction cannot be split, no matter how large the transaction is, its binlog must be written out in one piece, so the system allocates a block of memory to each thread as its binlog cache. The binlog_cache_size parameter controls the size of a single thread's binlog cache; if the content exceeds this size, it is temporarily spilled to disk.

So when is the binlog flushed to disk? The sync_binlog parameter controls the flushing timing. Its value ranges from 0 to N:

  • 0: no forced flush; the operating system decides when to write to disk.
  • 1: the binlog is flushed to disk every time a transaction commits.
  • N: the binlog is flushed to disk once every N transactions, so up to N transactions' binlog may be lost on a crash.

Before MySQL 5.7, the default value of sync_binlog was 0; from MySQL 5.7 on, the default is 1. Setting sync_binlog to 0 is generally not recommended. If performance requirements are high or disk I/O is a bottleneck, you can increase sync_binlog appropriately, at the cost of a higher risk of data loss.
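For example:

    -- Check the current setting (1 by default since MySQL 5.7)
    SHOW VARIABLES LIKE 'sync_binlog';
    -- Trade some durability for throughput: flush once every 100 commits
    SET GLOBAL sync_binlog = 100;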

2.4 Under what circumstances will binlog be regenerated?

In the following three situations, MySQL generates a new binlog file with the next sequence number.

  • Stop or restart the MySQL server
  • After using the flush logs command
  • After the binlog file size exceeds the threshold set by the max_binlog_size variable.
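For example, the second case can be triggered manually:

    -- Close the current binlog file and open a new one with the next number
    FLUSH LOGS;
    -- The new file now appears at the end of the list
    SHOW BINARY LOGS;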

3. redo log

We know that the InnoDB storage engine manages storage space in pages: the data we insert into MySQL ultimately lives in pages, or more precisely, data pages. To reduce disk I/O overhead, there is an in-memory area called the Buffer Pool. If the page our data belongs to is not in the Buffer Pool, MySQL first loads the page from disk into the Buffer Pool; subsequent operations then work directly on the in-memory page, which greatly improves read and write performance.

After a transaction commits, the changes it made to pages in the Buffer Pool may not yet have been persisted to disk. If MySQL crashes at this moment, do those changes simply vanish?
Obviously not; if they did, it would violate the durability property of transactions.

The MySQL InnoDB engine uses redo log to ensure transaction durability. The main function of redo log is to record page modifications, such as how many bytes are modified at a certain offset on a page and what the specific content is. Each record in the redo log contains the tablespace number, data page number, offset, specific modified data, and may even record the length of the modified data (depending on the redo log type).
When a transaction is committed, we will flush the redo log to disk according to the flushing strategy, so that even if MySQL crashes, the data that failed to be written to disk can be restored after restart, thus ensuring the durability of the transaction. In other words, the redo log enables MySQL to have crash recovery capabilities.

1. Ensure data durability

Redo Log records all modification operations on the database. When the database performs write operations (INSERT, UPDATE, DELETE), these operations are first recorded in Redo Log and then applied to the data file. In this way, even if the system fails before the data modification operation is completely written to the disk, Redo Log can ensure that the data is not lost. When the database is restored, these unfinished modification operations will be redone from Redo Log to ensure data consistency.

2. Data Recovery

Redo Log helps restore the database to a consistent state after a system crash or unexpected power outage. During the database recovery process, it checks the records in the Redo Log and reapplies all committed but not persisted data modifications to the data file to restore the data.

3. Improve write performance

To improve the performance of write operations, databases usually use a cache mechanism (such as a buffer pool) to temporarily store modification operations in memory instead of writing them to disk immediately. The existence of Redo Log makes this cache mechanism possible, because as long as the Redo Log is ensured to be persistent, there is no risk of data loss even if the data in the cache has not been written to disk.

When a transaction is committed, the redo log in the log buffer is flushed to disk, controlled by the innodb_flush_log_at_trx_commit parameter. We must take care to set this flushing strategy correctly: depending on how it is configured, MySQL may lose a small amount of data after a crash.

innodb_flush_log_at_trx_commit is an important configuration parameter of the MySQL InnoDB storage engine. It determines how the redo log is written and flushed when a transaction commits, and therefore affects both durability and performance. It has three values, 0, 1, and 2, each representing a different flushing strategy.

  • innodb_flush_log_at_trx_commit = 0

    • Description: when a transaction commits, the log is only written to the log buffer and is not written to disk immediately. A background task writes the log buffer to the log file and flushes it to disk once per second.
    • Advantage: the best performance, because disk I/O operations are minimized.
    • Disadvantage: if MySQL or the system crashes, up to the last second of transactions may be lost.
    • Applicable scenarios: applications with high performance requirements and loose durability requirements.
  • innodb_flush_log_at_trx_commit = 1

    • Description: every time a transaction commits, the log is written to the log file and flushed to disk immediately. This is the safest setting and guarantees transaction durability.
    • Advantage: the highest safety; every committed transaction is persisted, so committed transactions survive even a system crash.
    • Disadvantage: the lowest performance, because every commit triggers a disk I/O operation.
    • Applicable scenarios: applications with strict durability requirements, such as financial or e-commerce systems.
  • innodb_flush_log_at_trx_commit = 2

    • Description: every time a transaction commits, the log is written to the log file via the operating system's page cache, but it is not flushed to disk immediately; the flush to disk happens once per second.
    • Advantage: improves performance while limiting potential loss (at most the last second of transactions, and only if the operating system crashes).
    • Disadvantage: if the operating system crashes, transactions from the last second may be lost.
    • Applicable scenarios: applications that need a balance between performance and durability.

Summary of the flushing strategies

  • 0: the best performance, but the greatest risk; up to the last second of transactions may be lost.
  • 1: the safest; every committed transaction is persisted, at the cost of lower performance.
  • 2: a compromise between performance and safety.
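For example:

    -- Check the current redo log flushing strategy (1 by default)
    SHOW VARIABLES LIKE 'innodb_flush_log_at_trx_commit';
    -- Relax it to the compromise setting if losing up to a second of
    -- transactions on an operating system crash is acceptable
    SET GLOBAL innodb_flush_log_at_trx_commit = 2;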

4. When data loss can occur

1. The redo log has been written to the log buffer but not yet to the page cache when the database crashes, and data is lost (this can happen when the flushing strategy innodb_flush_log_at_trx_commit is 0).

2. The redo log has been written to the page cache but not yet to disk when the operating system crashes, which can also cause data loss (this can happen when innodb_flush_log_at_trx_commit is 2).

5. What is the difference between binlog and redo log?

  • The binlog is mainly used for database restoration, i.e. data-level recovery; master-slave replication is its most common application. The redo log is mainly used to guarantee transaction durability, i.e. transaction-level recovery.
  • The redo log is specific to the InnoDB engine, while the binlog is shared by all storage engines, because the binlog is implemented at the MySQL server layer.
  • The redo log is a physical log that mainly records modifications to pages; the binlog is a logical log that mainly records all DDL and DML statements executed by the database.
  • The binlog is written in an append-only manner with no size limit, while the redo log is written in a circular manner with a fixed size: when it reaches the end, it wraps around to the beginning.

4. Undo log

Undo Log is a log used to record data modification operations in the database system. It records the reverse operations (i.e., undo operations) of all modification operations on data during the transaction execution. Undo Log plays a key role in transaction rollback. Through Undo Log, data can be restored to the state before the transaction started, thus ensuring the atomicity of the transaction.

How does Undo Log ensure the atomicity of transactions?

The atomicity of a transaction means that all operations of a transaction are either executed completely or not executed at all. Undo Log ensures the atomicity of a transaction through the following mechanisms:

  1. Record undo operations: during transaction execution, before any modification is actually applied to the data, the corresponding undo operation is recorded in the Undo Log. For example, if a transaction updates the value of a row, the old value is recorded in the Undo Log before the update.

  2. Roll back the transaction: if a transaction fails for some reason (such as an error or an explicit rollback), the database system reads the Undo Log and restores the data to its state before the transaction started, following the recorded undo operations. This ensures that a failed transaction leaves no trace in the database, thereby guaranteeing atomicity, as the sketch below illustrates.
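A minimal illustration of this behavior (the table name and values are hypothetical):

    -- The old balance is recorded in the undo log before the row is changed
    START TRANSACTION;
    UPDATE account SET balance = balance - 100 WHERE id = 1;
    -- Something goes wrong: the recorded undo operations restore the old value
    ROLLBACK;
    -- The balance of account 1 is unchanged, as if the transaction never ran
    SELECT balance FROM account WHERE id = 1;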

Reference:

Detailed explanation of read-write separation and database and table sharding | JavaGuide