分享一篇MYSQL 索引的实验结果,利于优化和认知

MYSQL 索引以及优化

MYSQL 索引的实用案例：
假设数据库中一个表有10^6条记录，DBMS的页面大小为4K，并存储100条记录。如果没有索引，查询将对整个表进行扫描，最坏的情况下，如果所有数据页都不在内存，需要读取10^4个页面，如果这10^4个页面在磁盘上随机分布，需要进行10^4次I/O，假设磁盘每次I/O时间为10ms(忽略数据传输时间)，则总共需要100s(但实际上要好很多很多)。如果对之建立B-Tree索引，则只需要进行log100(10^6)=3次页面读取，最坏情况下耗时30ms。这就是索引带来的效果，很多时候，当你的应用程序进行SQL查询速度很慢时，应该想想是否可以建索引

数据库表，有以下索引

KEY `s_a_c` (`status`,`answer_cnt`,`cid2`,`cid3`),
      KEY `s_c_c` (`status`,`coins`,`cid2`,`cid3`),
      KEY `s_c` (`status`,`cid2`,`cid3`),
      KEY `idx_status_add` (`status`,`add_coins`,`user_update_time`),
      KEY `o_t` (`top_time`),
      KEY `s_o` (`status`,`top_time`)
    ) ENGINE=InnoDB DEFAULT CHARSET=utf8 |

一、实验背景list_10这个表中有9百万

mysql> select count(1) from list_10 \G
    *************************** 1. row ***************************
    count(1): 9185382
    1 row in set (1.39 sec)

二、测试索引字段的顺序对查询结果的影响

2.1

mysql> explain select * from list_10 where status=10 and cid2=0 and cid3=0 ;
    +----+-------------+---------+------+------------------------------------+------+---------+-------------------+-------+-------+
    | id | select_type | table   | type | possible_keys                      | key  | key_len | ref               | rows  | Extra |
    +----+-------------+---------+------+------------------------------------+------+---------+-------------------+-------+-------+
    |  1 | SIMPLE      | list_10 | ref  | s_a_c,s_c_c,s_c,idx_status_add,s_o | s_c  | 5       | const,const,const | 21254 |       |
    +----+-------------+---------+------+------------------------------------+------+---------+-------------------+-------+-------+

2.2

mysql> explain select * from list_10 where cid2=0 and cid3=0 and status=10 ;
    +----+-------------+---------+------+------------------------------------+------+---------+-------------------+-------+-------+
    | id | select_type | table   | type | possible_keys                      | key  | key_len | ref               | rows  | Extra |
    +----+-------------+---------+------+------------------------------------+------+---------+-------------------+-------+-------+
    |  1 | SIMPLE      | list_10 | ref  | s_a_c,s_c_c,s_c,idx_status_add,s_o | s_c  | 5       | const,const,const | 21254 |       |
    +----+-------------+---------+------+------------------------------------+------+---------+-------------------+-------+-------+

2.1 和 2.2 故意把索引字段的顺序做了改变，发现结果type都是ref，ref 都是 const，结论索引字段顺序不会影响查询效果

三、测试缺失组合索引的字段来观察影响

3.1结论索引缺失后并没有影响索引的使用ref列可以看到，但是组合索引首列缺失是无法用到的

mysql> explain select * from list_10 where cid2=0 and status=10;
    +----+-------------+---------+------+------------------------------------+------+---------+-------------+-------+-------+
    | id | select_type | table   | type | possible_keys                      | key  | key_len | ref         | rows  | Extra |
    +----+-------------+---------+------+------------------------------------+------+---------+-------------+-------+-------+
    |  1 | SIMPLE      | list_10 | ref  | s_a_c,s_c_c,s_c,idx_status_add,s_o | s_c  | 3       | const,const | 21260 |       |
    +----+-------------+---------+------+------------------------------------+------+---------+-------------+-------+-------+

首列缺失

3.2 以下两个图可以看出在组合索引中只要有字段采用范围符号，显示用到索引，但是实际上用不到索引列

四：在组合索引外增加个order by 主键索引和无索引字段. 真是发现无索引的字段毁所有啊. 当然用了 force index 强制一下效果能好多了.

mysql> explain select * from list_10 where cid2=0 and cid3=0 and status=10 order by ask_id desc limit 5;
    +----+-------------+---------+------+------------------------------------+------+---------+-------------------+-------+-------------+
    | id | select_type | table   | type | possible_keys                      | key  | key_len | ref               | rows  | Extra       |
    +----+-------------+---------+------+------------------------------------+------+---------+-------------------+-------+-------------+
    |  1 | SIMPLE      | list_10 | ref  | s_a_c,s_c_c,s_c,idx_status_add,s_o | s_c  | 5       | const,const,const | 21262 | Using where |
    +----+-------------+---------+------+------------------------------------+------+---------+-------------------+-------+-------------+
    1 row in set (0.00 sec)
    
    mysql> 
    mysql> explain select * from list_10 where cid2=0 and cid3=0 and status=10 order by top_time desc limit 5;
    +----+-------------+---------+-------+------------------------------------+------+---------+------+------+-------------+
    | id | select_type | table   | type  | possible_keys                      | key  | key_len | ref  | rows | Extra       |
    +----+-------------+---------+-------+------------------------------------+------+---------+------+------+-------------+
    |  1 | SIMPLE      | list_10 | index | s_a_c,s_c_c,s_c,idx_status_add,s_o | o_t  | 4       | NULL | 2203 | Using where |
    +----+-------------+---------+-------+------------------------------------+------+---------+------+------+-------------+
    1 row in set (0.00 sec)
    
    mysql> explain select * from list_10 FORCE INDEX (s_c) where cid2=0 and cid3=0 and status=10 order by top_time desc limit 5;
    +----+-------------+---------+------+---------------+------+---------+-------------------+-------+-----------------------------+
    | id | select_type | table   | type | possible_keys | key  | key_len | ref               | rows  | Extra                       |
    +----+-------------+---------+------+---------------+------+---------+-------------------+-------+-----------------------------+
    |  1 | SIMPLE      | list_10 | ref  | s_c           | s_c  | 5       | const,const,const | 21262 | Using where; Using filesort |
    +----+-------------+---------+------+---------------+------+---------+-------------------+-------+-----------------------------+

EXPLAIN列的解释：

table：显示这一行的数据是关于哪张表的

type：这是重要的列，显示连接使用了何种类型。从最好到最差的连接类型为const、eq_reg、ref、range、indexhe和ALL

possible_keys：显示可能应用在这张表中的索引。如果为空，没有可能的索引。可以为相关的域从WHERE语句中选择一个合适的语句

key： 实际使用的索引。如果为NULL，则没有使用索引。很少的情况下，MYSQL会选择优化不足的索引。这种情况下，可以在SELECT语句中使用USE INDEX（indexname）来强制使用一个索引或者用IGNORE INDEX（indexname）来强制MYSQL忽略索引

key_len：使用的索引的长度。在不损失精确性的情况下，长度越短越好

ref：显示索引的哪一列被使用了，如果可能的话，是一个常数

rows：MYSQL认为必须检查的用来返回请求数据的行数

Extra：这里可以看到的坏的例子是Using temporary和Using filesort，意思MYSQL根本不能使用索引，结果是检索会很慢
（1）using where 使用了where子句来过滤元组
（2）using temporary 用临时表保存中间结果，常用于GROUP BY 和 ORDER BY操作中
（3）using filesort 使用文件完成排序操作，这是可能是ordery by，group by语句的结果，这可能是一个CPU密集型的过程，可以通过选择合适的索引来改进性能，用索引来为查询结果排序。
（4）using index 不需要读取数据文件，从索引树（索引文件）中即可获得信息。如果同时出现using where，表明索引被用来执行索引键值的查找，没有using where，表明索引用来读取数据而非执行查找动作。
（5）using join buffer使用了连接缓存：Block Nested Loop，连接算法是块嵌套循环连接
（6）impossible where where子句的值总是false，不能用来获取任何元组
（7）select tables optimized away在没有GROUP BY子句的情况下，基于索引优化MIN/MAX操作，或者对于MyISAM存储引擎优化COUNT(*)操作，不必等到执行阶段再进行计算，查询执行计划生成的阶段即完成优化。
（8）distinct优化distinct操作，在找到第一匹配的元组后即停止找同样值的动作

BraveDu

分享一篇MYSQL 索引的实验结果,利于优化和认知

您可以选择一种方式赞助本站

发表评论