数据库
首页 > 数据库> > MySQL问题分析-是谁偷走了我的IO

MySQL问题分析-是谁偷走了我的IO

作者:互联网

前言

对于刚接触MySQL数据库不久的小伙伴来说,或多或少会遇到一些棘手的小问题;比如在数据库磁盘无损坏的情况下,我们经常会遇到数据库服务器的磁盘io压力很大,但由于服务器上只有MySQL程序,我们第一想法肯定是某个连接用户在连到数据库后,背着我们在瞎搞着什么;然而通过show processlist命令,却并没有看到不良的操作,可恶。下面模拟一个简单场景,让我们揪出这狗人的邪恶操作。

场景模拟,制造IO压力

[root@tidb-master ~]# /usr/share/sysbench/oltp_read_write.lua --mysql-host=10.186.61.36 --mysql-user=sun --mysql-password='xx' --mysql-port=3306 --mysql-db=testdb --tables=1 --table_size=10000000 --auto_inc=off --report-interval=2 --db-ps-mode=disable --time=100000 --threads=1 prepare
sysbench 1.0.17 (using system LuaJIT 2.0.4)

Creating table 'sbtest1'...
Inserting 10000000 records into 'sbtest1'
Creating a secondary index on 'sbtest1'...
[root@tidb-master ~]# /usr/share/sysbench/oltp_read_write.lua --mysql-host=10.186.61.36 --mysql-user=sun --mysql-password='xx' --mysql-port=3306 --mysql-db=testdb --tables=1 --table_size=10000000 --auto_inc=off --report-interval=2 --db-ps-mode=disable --time=100000 --threads=1 run
sysbench 1.0.17 (using system LuaJIT 2.0.4)

Running the test with following options:
Number of threads: 1
Report intermediate results every 2 second(s)
Initializing random number generator from current time


Initializing worker threads...

Threads started!

[ 2s ] thds: 1 tps: 19.48 qps: 399.05 (r/w/o: 279.69/79.91/39.46) lat (ms,95%): 80.03 err/s: 0.00 reconn/s: 0.00
[ 4s ] thds: 1 tps: 30.01 qps: 599.76 (r/w/o: 420.18/119.55/60.03) lat (ms,95%): 69.29 err/s: 0.00 reconn/s: 0.00
[ 6s ] thds: 1 tps: 38.00 qps: 751.49 (r/w/o: 524.99/150.50/76.00) lat (ms,95%): 53.85 err/s: 0.00 reconn/s: 0.00
[ 8s ] thds: 1 tps: 41.00 qps: 827.05 (r/w/o: 581.03/164.01/82.00) lat (ms,95%): 61.08 err/s: 0.00 reconn/s: 0.00
[ 10s ] thds: 1 tps: 35.50 qps: 709.43 (r/w/o: 496.45/141.99/70.99) lat (ms,95%): 64.47 err/s: 0.00 reconn/s: 0.00

问题分析,捉出狗人

1、通过top命令发现io wait比较高

top - 16:46:49 up 176 days,  6:46,  1 user,  load average: 3.72, 2.39, 3.04
Tasks: 182 total,   1 running, 181 sleeping,   0 stopped,   0 zombie
%Cpu0  :  4.8 us,  2.0 sy,  0.0 ni, 91.1 id,  1.7 wa,  0.0 hi,  0.0 si,  0.3 st
%Cpu1  : 53.2 us, 10.8 sy,  0.0 ni, 14.6 id, 13.9 wa,  0.0 hi,  7.1 si,  0.3 st
%Cpu2  : 11.7 us,  4.8 sy,  0.0 ni, 73.8 id,  9.3 wa,  0.0 hi,  0.0 si,  0.3 st
%Cpu3  : 11.4 us,  3.1 sy,  0.0 ni, 78.9 id,  6.6 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu4  :  3.8 us,  1.4 sy,  0.0 ni, 93.4 id,  1.0 wa,  0.0 hi,  0.0 si,  0.3 st
%Cpu5  :  5.2 us,  2.8 sy,  0.0 ni, 89.6 id,  2.4 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu6  :  6.2 us,  2.4 sy,  0.0 ni, 83.3 id,  7.6 wa,  0.0 hi,  0.0 si,  0.3 st
%Cpu7  :  6.8 us,  2.1 sy,  0.0 ni, 86.0 id,  4.8 wa,  0.0 hi,  0.3 si,  0.0 st
%Cpu8  :  1.7 us,  1.4 sy,  0.0 ni, 94.3 id,  2.4 wa,  0.0 hi,  0.0 si,  0.3 st
%Cpu9  : 16.1 us,  5.1 sy,  0.0 ni, 70.9 id,  7.9 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu10 : 10.3 us,  2.7 sy,  0.0 ni, 82.5 id,  4.1 wa,  0.0 hi,  0.0 si,  0.3 st
%Cpu11 : 10.1 us,  3.1 sy,  0.0 ni, 75.0 id, 11.8 wa,  0.0 hi,  0.0 si,  0.0 st
KiB Mem : 12137024 total,  5308908 free,  3601408 used,  3226708 buff/cache
KiB Swap:        0 total,        0 free,        0 used.  7555800 avail Mem 

  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND                                                                                                             
 8858 actiont+  20   0 4867792 697536  15840 S 172.4  5.7   6:12.06 mysqld                                                                                                              
26447 tidb      20   0 4000364   1.0g   5384 S   6.0  9.0   9785:41 TiFlashMain                                                                                                         
22761 actiont+  20   0 2526348  37548  10980 S   4.3  0.3   1031:05 ustats                                                                                                              
25894 tidb      20   0   11.2g 159668  13560 S   2.3  1.3   3682:45 pd-server                                                                                                           
25975 tidb      20   0 2637164   1.0g   7392 S   2.0  8.8   4121:29 tikv-server                                                                                                         
25967 actiont+  20   0 1879304  16292   6112 S   1.3  0.1 166:08.91 uguard-agent

2、通过iostat命令进一步分析,我们看到磁盘比较繁忙,忙于读写中(%uitl值比较大,一般来说,值大于75是表明磁盘比较繁忙了)

[root@tidb-tikv01 ~]# iostat -x 1
Linux 3.10.0-862.14.4.el7.x86_64 (tidb-tikv01)  04/19/2022      _x86_64_        (12 CPU)

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           1.20    0.00    0.44    0.17    0.10   98.09

Device:         rrqm/s   wrqm/s     r/s     w/s    rkB/s    wkB/s avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
vda               0.01     0.18    3.15   21.14    80.79   232.37    25.79     0.19    7.69    6.15    7.92   0.75   1.82
scd0              0.00     0.00    0.00    0.00     0.00     0.00     6.00     0.00    0.50    0.50    0.00   0.50   0.00

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           9.60    0.00    3.43    9.43    0.09   77.46

Device:         rrqm/s   wrqm/s     r/s     w/s    rkB/s    wkB/s avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
vda               0.00     0.00 1328.00 3237.00 21248.00 39091.50    26.44     2.71    0.59    1.09    0.39   0.17  79.50
scd0              0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00    0.00    0.00   0.00   0.00

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
          11.27    0.00    3.76    9.31    0.00   75.66

Device:         rrqm/s   wrqm/s     r/s     w/s    rkB/s    wkB/s avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
vda               0.00     0.00 1304.00 3561.00 20864.00 45302.00    27.20     2.85    0.58    1.16    0.37   0.16  79.60
scd0              0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00    0.00    0.00   0.00   0.00

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           8.06    0.00    2.66   19.04    0.17   70.07

Device:         rrqm/s   wrqm/s     r/s     w/s    rkB/s    wkB/s avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
vda               0.00     0.00  917.00 3091.00 14672.00 43048.50    28.80     8.35    2.09    3.19    1.76   0.21  82.50
scd0              0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00    0.00    0.00   0.00   0.00

3、通过iop命令可以定位是哪个线程消耗IO比较厉害,根据以下显示,发现线程值为22056的MySQL线程ID占用IO高达50.62%

Total DISK READ :       5.78 M/s | Total DISK WRITE :      16.91 M/s
Actual DISK READ:       6.01 M/s | Actual DISK WRITE:      23.69 M/s
  TID  PRIO  USER     DISK READ  DISK WRITE  SWAPIN     IO>    COMMAND                                                                                                                   
22056 be/4 actionte    4.83 M/s  233.93 K/s  0.00 % 50.62 % mysqld --defaults-file=/opt/mysql/etc/3306/my.cnf --daemonize ~ch-mysql --socket=/opt/mysql/data/3306/mysqld.sock --port=3306
 8887 be/4 actionte    0.00 B/s    0.00 B/s  0.00 %  6.31 % mysqld --defaults-file=/opt/mysql/etc/3306/my.cnf --daemonize ~ch-mysql --socket=/opt/mysql/data/3306/mysqld.sock --port=3306
 8911 be/4 actionte  211.29 K/s    0.00 B/s  0.00 %  2.54 % mysqld --defaults-file=/opt/mysql/etc/3306/my.cnf --daemonize ~ch-mysql --socket=/opt/mysql/data/3306/mysqld.sock --port=3306
22771 be/4 actionte    0.00 B/s    3.77 K/s  0.00 %  1.72 % ustats

4、将上面获取的线程ID带入到MySQL中,便定位到是数据库中哪个连接搞的鬼了,抽死你丫的(performance_schama.threads中THREAD_OS_ID为系统线程ID)

[root@tidb-tikv01 ~]# /opt/mysql/base/8.0.21/bin/mysql -S /opt/mysql/data/3306/mysqld.sock -pxx
mysql: [Warning] Using a password on the command line interface can be insecure.
Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 13832
Server version: 8.0.21 MySQL Community Server - GPL

Copyright (c) 2000, 2020, Oracle and/or its affiliates. All rights reserved.

Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

mysql> show processlist;
+-------+-----------------+--------------------+--------+------------------+--------+---------------------------------------------------------------+------------------+
| Id    | User            | Host               | db     | Command          | Time   | State                                                         | Info             |
+-------+-----------------+--------------------+--------+------------------+--------+---------------------------------------------------------------+------------------+
|     5 | event_scheduler | localhost          | NULL   | Daemon           | 232299 | Waiting on empty queue                                        | NULL             |
|    38 | universe_op     | 127.0.0.1:58456    | NULL   | Sleep            |      1 |                                                               | NULL             |
|    39 | universe_op     | 127.0.0.1:58454    | NULL   | Sleep            |      1 |                                                               | NULL             |
|    40 | universe_op     | 127.0.0.1:58458    | NULL   | Sleep            |      1 |                                                               | NULL             |
|    41 | universe_op     | 127.0.0.1:58460    | NULL   | Sleep            |      1 |                                                               | NULL             |
|    42 | universe_op     | 127.0.0.1:58466    | NULL   | Sleep            |      1 |                                                               | NULL             |
|    43 | universe_op     | 127.0.0.1:58462    | NULL   | Sleep            |      1 |                                                               | NULL             |
|    44 | universe_op     | 127.0.0.1:58464    | NULL   | Sleep            |      1 |                                                               | NULL             |
|    47 | universe_op     | 127.0.0.1:58656    | NULL   | Sleep            |      7 |                                                               | NULL             |
|    48 | universe_op     | 127.0.0.1:58658    | NULL   | Sleep            |      7 |                                                               | NULL             |
|    49 | universe_op     | 127.0.0.1:58660    | NULL   | Sleep            |      7 |                                                               | NULL             |
|    50 | universe_op     | 127.0.0.1:58662    | NULL   | Sleep            |      7 |                                                               | NULL             |
|    51 | universe_op     | 127.0.0.1:58666    | NULL   | Sleep            |      7 |                                                               | NULL             |
|    52 | universe_op     | 127.0.0.1:58664    | NULL   | Sleep            |      7 |                                                               | NULL             |
|    53 | universe_op     | 127.0.0.1:58668    | NULL   | Sleep            |      7 |                                                               | NULL             |
|    54 | universe_op     | 127.0.0.1:58670    | NULL   | Sleep            |      7 |                                                               | NULL             |
|    56 | universe_op     | 10.186.61.45:55982 | NULL   | Binlog Dump GTID | 232282 | Master has sent all binlog to slave; waiting for more updates | NULL             |
|   231 | universe_op     | 127.0.0.1:41766    | NULL   | Sleep            |      1 |                                                               | NULL             |
| 11155 | sun             | 10.186.61.16:53874 | testdb | Sleep            |      0 |                                                               | NULL             |
| 13832 | root            | localhost          | NULL   | Query            |      0 | starting                                                      | show processlist |
+-------+-----------------+--------------------+--------+------------------+--------+---------------------------------------------------------------+------------------+
20 rows in set (0.00 sec)

mysql> select * from performance_schema.threads where THREAD_OS_ID=22056\G
*************************** 1. row ***************************
          THREAD_ID: 11200
               NAME: thread/sql/one_connection
               TYPE: FOREGROUND
     PROCESSLIST_ID: 11155
   PROCESSLIST_USER: sun
   PROCESSLIST_HOST: 10.186.61.16
     PROCESSLIST_DB: testdb
PROCESSLIST_COMMAND: Query
   PROCESSLIST_TIME: 0
  PROCESSLIST_STATE: statistics
   PROCESSLIST_INFO: SELECT c FROM sbtest1 WHERE id=6433600
   PARENT_THREAD_ID: NULL
               ROLE: NULL
       INSTRUMENTED: YES
            HISTORY: YES
    CONNECTION_TYPE: TCP/IP
       THREAD_OS_ID: 22056
     RESOURCE_GROUP: USR_default
1 row in set (0.01 sec)

mysql>

总结

 对于一些新入门的小可爱来说,以上方法可以循序渐进的帮助你定位到是哪些连接线程把你的服务器IO折腾高的,可以让你更加了解Linux系统与MySQL之间的联动关系。老鸟嘛,emmm...  咦,飞走了。

标签:universe,--,0.0,0.00,MySQL,mysql,IO,偷走,NULL
来源: https://www.cnblogs.com/sunxiaoqiao/p/16166762.html