MySQL 提供 MyISAM 、 InnoDB 、 memory(heap) 等多種存儲引擎。每種存儲引擎對于索引的支持以及實現(xiàn)都不盡相同,
本文主要討論 InnoDB 引擎相關(guān)的索引應(yīng)用。
為何使用索引
索引用于快速找出在某個列中有一特定值的行。在查詢時如果沒有應(yīng)用索引,MySQL 將不得不掃描表以找出符合條件的數(shù)據(jù)項,我們知道,IO操作是非常耗時的。
建立索引的幾個原則
盡量使用唯一索引,對于有唯一值的列索引效果最好,對于像性別只有很少的值的列索引效果就不明顯。
索引長度盡量短,這樣做有幾個好處,首先短的索引可以節(jié)省索引空間,也會使查找的速度得到提升。對于較短的鍵值,索引高速緩存中的塊能容納更多的鍵值,MySQL也可以在內(nèi)存中容納更多的值。
太長的列,可以選擇只建立部分索引
更新非常頻繁的數(shù)據(jù)不適宜建索引
利用最左前綴原則,比如建立了一個聯(lián)合索引(a,b,c),那么其實我們可利用的索引就有(a), (a,b), (a,b,c)
不要過多創(chuàng)建索引,首先過多的索引會占用更多的空間,而且每次增、刪、改操作都會重建索引,而且過多索引也會增加之后的優(yōu)化復(fù)雜度
盡量擴展索引,比如現(xiàn)有索引(a),現(xiàn)在我又要對(a,b)進行索引,不需要再建一個索引(a,b)
請注意,一次查詢是不能應(yīng)用多個索引的
<,<=,=,>,>=,BETWEEN,IN 可用到索引,<>,not in ,!= 則不行
like "xxxx%" 是可以用到索引的,like "%xxxx" 則不行(like "%xxx%" 同理)
NULL會使索引的效果大打折扣
淺談聯(lián)合索引
首先因為 InnoDB 的數(shù)據(jù)文件本身要按主鍵聚集,所以InnoDB要求表必須有主鍵(MyISAM可以沒有),如果沒有顯式指定,則MySQL系統(tǒng)會自動選擇一個可以唯一標(biāo)識數(shù)據(jù)記錄的列作為主鍵,如果不存在這種列,則MySQL自動為InnoDB表生成一個隱含字段作為主鍵。主鍵對于 InnoDB 的索引結(jié)構(gòu)是十分重要的。
InnoDB 引擎的索引是使用 B+樹 實現(xiàn)索引結(jié)構(gòu)的,當(dāng)我們建立一個聯(lián)合索引(a, b, c)時,B+樹將按照從左至右來建立搜索樹,然后檢索時B+樹將先比較 a 然后再其基礎(chǔ)上比較 b 和 c 。不難看出如果我們的搜索條件中只有 a 和 c ,將不能使用完整的(a, b, c)索引,如果我們的搜索條件中沒有 a 那么這條查詢將不會用上索引,這其實就是最左前綴特性。
** 接下來我們來看下聯(lián)合索引應(yīng)用時的幾種情況: **
desc users;
+-------+-------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+-------+-------------+------+-----+---------+----------------+
| id | int(11) | NO | PRI | NULL | auto_increment |
| rname | varchar(50) | NO | | NULL | |
| rdesc | varchar(50) | NO | | NULL | |
| age | int(11) | NO | MUL | NULL | |
| card | int(5) | YES | | NULL | |
+-------+-------------+------+-----+---------+----------------+
show index from users;
+-------+------------+---------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment |
+-------+------------+---------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| users | 0 | PRIMARY | 1 | id | A | 15 | NULL | NULL | | BTREE | | |
| users | 1 | age_name_desc | 1 | age | A | 15 | NULL | NULL | | BTREE | | |
| users | 1 | age_name_desc | 2 | rname | A | 15 | 10 | NULL | | BTREE | | |
| users | 1 | age_name_desc | 3 | rdesc | A | 15 | 15 | NULL | | BTREE | | |
+-------+------------+---------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
- 全列匹配
explain select * from users where age = 11 and rname = "asd" and rdesc = "asd";
+----+-------------+-------+------+---------------+---------------+---------+-------------------+------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+------+---------------+---------------+---------+-------------------+------+-------------+
| 1 | SIMPLE | users | ref | age_name_desc | age_name_desc | 83 | const,const,const | 1 | Using where |
+----+-------------+-------+------+---------------+---------------+---------+-------------------+------+-------------+
以上 explain 的結(jié)果可以看出,當(dāng)對索引中所有列進行精確匹配的時候,可以用到完整索引。
explain select * from users where rname = "asd" and age = 11 and rdesc = "asd";
+----+-------------+-------+------+---------------+---------------+---------+-------------------+------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+------+---------------+---------------+---------+-------------------+------+-------------+
| 1 | SIMPLE | users | ref | age_name_desc | age_name_desc | 83 | const,const,const | 1 | Using where |
+----+-------------+-------+------+---------------+---------------+---------+-------------------+------+-------------+
按照B+樹的結(jié)構(gòu)聯(lián)合索引本是對順序十分敏感的,但是從以上結(jié)果可看出調(diào)整順序并沒有影響到索引的選用,這是因為MySQL的查詢優(yōu)化器會自動調(diào)整where子句的條件順序以使用適合的索引。
- 部分匹配(左前綴)
explain select * from users where age = 11;
+----+-------------+-------+------+---------------+---------------+---------+-------+------+-------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+------+---------------+---------------+---------+-------+------+-------+
| 1 | SIMPLE | users | ref | age_name_desc | age_name_desc | 4 | const | 1 | NULL |
+----+-------------+-------+------+---------------+---------------+---------+-------+------+-------+
從上述結(jié)果可看出當(dāng)精確匹配最左前綴的列時,是可以用到索引的,但是 key_len = 4,只用到了第一列前綴索引。
explain select * from users where age = 11 and rdesc = "asd";
+----+-------------+-------+------+---------------+---------------+---------+-------+------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+------+---------------+---------------+---------+-------+------+-------------+
| 1 | SIMPLE | users | ref | age_name_desc | age_name_desc | 4 | const | 1 | Using where |
+----+-------------+-------+------+---------------+---------------+---------+-------+------+-------------+
上述查詢?nèi)笔Я藃name列,可以看出當(dāng)缺失中間列時將不能使用完整的聯(lián)合索引。查詢只用到了最左部分索引,而rdesc由于沒有rname無法和左前綴銜接。
explain select * from users where rname = "asd";
+----+-------------+-------+------+---------------+------+---------+------+------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+------+---------------+------+---------+------+------+-------------+
| 1 | SIMPLE | users | ALL | NULL | NULL | NULL | NULL | 15 | Using where |
+----+-------------+-------+------+---------------+------+---------+------+------+-------------+
上述查詢由于查詢條件不是最左前綴,將不能使用聯(lián)合索引age_name_desc,建立聯(lián)合索引時順序是很重要的,必須在建索引前考慮清楚。
- 非精確查詢
explain select * from users where age = 11 and rname like 'sad%';
+----+-------------+-------+-------+---------------+---------------+---------+------+------+------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+-------+---------------+---------------+---------+------+------+------------------------------------+
| 1 | SIMPLE | users | range | age_name_desc | age_name_desc | 36 | NULL | 1 | Using index condition; Using where |
+----+-------------+-------+-------+---------------+---------------+---------+------+------+------------------------------------+
explain select * from users where age = 11 and rname like 'sad%' and rdesc = 'asd';
+----+-------------+-------+-------+---------------+---------------+---------+------+------+------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+-------+---------------+---------------+---------+------+------+------------------------------------+
| 1 | SIMPLE | users | range | age_name_desc | age_name_desc | 83 | NULL | 1 | Using index condition; Using where |
+----+-------------+-------+-------+---------------+---------------+---------+------+------+------------------------------------+
可以看出like查詢當(dāng)通配符'%'不出現(xiàn)在開頭時,是可以應(yīng)用到索引的。
explain select * from users where age = 11 and rname like '%sad%';
+----+-------------+-------+------+---------------+---------------+---------+-------+------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+------+---------------+---------------+---------+-------+------+-------------+
| 1 | SIMPLE | users | ref | age_name_desc | age_name_desc | 4 | const | 1 | Using where |
+----+-------------+-------+------+---------------+---------------+---------+-------+------+-------------+
通配符'%'出現(xiàn)在開頭則不行
接下來我們看看范圍查詢
explain select * from users where age = 61 and rname < 'fasd';
+----+-------------+-------+-------+---------------+---------------+---------+------+------+------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+-------+---------------+---------------+---------+------+------+------------------------------------+
| 1 | SIMPLE | users | range | age_name_desc | age_name_desc | 36 | NULL | 1 | Using index condition; Using where |
+----+-------------+-------+-------+---------------+---------------+---------+------+------+------------------------------------+
- 函數(shù)和表達式
explain select * from users where (age + 1) = 33;
+----+-------------+-------+------+---------------+------+---------+------+------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+------+---------------+------+---------+------+------+-------------+
| 1 | SIMPLE | users | ALL | NULL | NULL | NULL | NULL | 15 | Using where |
+----+-------------+-------+------+---------------+------+---------+------+------+-------------+
當(dāng)查詢條件中出現(xiàn)函數(shù)或表達式時,將不能應(yīng)用索引。寫查詢語句時要盡量避免表達式和函數(shù)的出現(xiàn)。
指導(dǎo)性建議
建議在選擇性高的列上建立索引,所謂索引的選擇性,是指不重復(fù)的索引值與表記錄數(shù)的比值:
Index Selectivity = Cardinality / Count
顯然選擇性的取值范圍為(0, 1],選擇性越高的索引價值越大。
比例越大我們掃描的記錄數(shù)越少,唯一鍵的區(qū)分度是1,而一些狀態(tài)、性別字段可能在大數(shù)據(jù)面前區(qū)分度就趨近于0
select count(distinct(rname))/count(*) as selectivity from users;
+-------------+
| selectivity |
+-------------+
| 0.9333 |
+-------------+
有一種與索引選擇性有關(guān)的索引優(yōu)化策略叫做前綴索引,就是用列的前綴代替整個列作為索引key,當(dāng)前綴長度合適時,可以做到既使得前綴索引的選擇性接近全列索引,同時因為索引key變短而減少了索引文件的大小和維護開銷。
select count(distinct(left(rname, 3)))/count(*) as selectivity from users;
+-------------+
| selectivity |
+-------------+
| 0.7333 |
+-------------+
select count(distinct(left(rname, 5)))/count(*) as selectivity from users;
+-------------+
| selectivity |
+-------------+
| 0.9333 |
+-------------+
可以看到,當(dāng)把前綴取到5時selectivity值就和完整的selectivity值一樣了,這樣可以大幅度減小索引所占用的空間,而且相應(yīng)的查詢速度也會有一定提升。