Why does an index make this query slower?

The table has 1,500,000 records, of which 1,250,000 have field = 'z'.
I need to select a random row whose field is not 'z'.

$random = mt_rand(1, 250000); 
$query = "SELECT field FROM table WHERE field != 'z' LIMIT $random, 1";

It works fine.

Then I decided to optimize it and added an index on field.

The result was strange: the query became about 3 times slower. I tested it.
Why is it slower? Isn't an index supposed to make it faster?

The engine is MyISAM.

EXPLAIN with index:
id  select_type  table  type   possible_keys  key    key_len  ref   rows     Extra
1   SIMPLE       table  range  field          field  758      NULL  1139287  Using

EXPLAIN without index:
id  select_type  table  type  possible_keys  key   key_len  ref   rows     Extra
1   SIMPLE       table  ALL   NULL           NULL  NULL     NULL  1484672  Using where

2010-07-30 Qiao

What engine is it? Please show the EXPLAIN output. – Mchl 2010-07-30 22:41:05

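Summary

The problem is that field is not a good candidate for an index, because of the nature of B-trees.

Explanation

Suppose you have a table holding the results of 500,000 coin tosses, where each toss is either 1 (heads) or 0 (tails):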

CREATE TABLE toss (
    id int NOT NULL AUTO_INCREMENT, 
    result int NOT NULL DEFAULT '0', 
    PRIMARY KEY (id) 
);

select result, count(*) from toss group by result order by result; 
+--------+----------+ 
| result | count(*) | 
+--------+----------+ 
|  0 | 250290 | 
|  1 | 249710 | 
+--------+----------+ 
2 rows in set (0.40 sec)

If you want to select one random toss that came up tails, you need to search the table, starting from a random offset.

select * from toss where result != 1 limit 123456, 1; 
+--------+--------+ 
| id  | result | 
+--------+--------+ 
| 246700 |  0 | 
+--------+--------+ 
1 row in set (0.06 sec) 

explain select * from toss where result != 1 limit 123456, 1; 
+----+-------------+-------+------+---------------+------+---------+------+--------+-------------+ 
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra  | 
+----+-------------+-------+------+---------------+------+---------+------+--------+-------------+ 
| 1 | SIMPLE  | toss | ALL | NULL   | NULL | NULL | NULL | 500000 | Using where | 
+----+-------------+-------+------+---------------+------+---------+------+--------+-------------+

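You can see that you are basically scanning all rows sequentially to find a match.

If you create an index on the result column of toss, the index will contain two distinct values, each with roughly 250,000 entries.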

create index foo on toss (result); 
Query OK, 500000 rows affected (2.48 sec) 
Records: 500000 Duplicates: 0 Warnings: 0 

select * from toss where result != 1 limit 123456, 1; 
+--------+--------+ 
| id  | result | 
+--------+--------+ 
| 246700 |  0 | 
+--------+--------+ 
1 row in set (0.25 sec) 

explain select * from toss where result != 1 limit 123456, 1; 
+----+-------------+-------+-------+---------------+------+---------+------+--------+-------------+ 
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra  | 
+----+-------------+-------+-------+---------------+------+---------+------+--------+-------------+ 
| 1 | SIMPLE  | toss | range | foo   | foo | 4  | NULL | 154565 | Using where | 
+----+-------------+-------+-------+---------------+------+---------+------+--------+-------------+

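Now you are examining fewer records, but the search time went from 0.06 seconds to 0.25 seconds. Why? Because sequentially scanning an index is actually less efficient than sequentially scanning the table when the index has a large number of rows for a given key.

Let's look at the indexes on this table: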
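One way to compare the two access paths without dropping anything is MySQL's IGNORE INDEX hint. This is only a sketch against the toss table and foo index from this answer; the timings you get will depend on your data, engine, and server version.

```sql
-- Plan chosen by the optimizer (may pick the range scan on foo).
EXPLAIN SELECT * FROM toss WHERE result != 1 LIMIT 123456, 1;

-- Same query, but the optimizer is told not to consider foo,
-- so it should fall back to a full table scan.
EXPLAIN SELECT * FROM toss IGNORE INDEX (foo) WHERE result != 1 LIMIT 123456, 1;

-- Run both variants and compare the reported execution times.
SELECT * FROM toss WHERE result != 1 LIMIT 123456, 1;
SELECT * FROM toss IGNORE INDEX (foo) WHERE result != 1 LIMIT 123456, 1;
```

If the hinted query is consistently faster, that suggests the index is hurting this particular query.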

show index from toss; 
+-------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+ 
| Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | 
+-------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+ 
| toss |   0 | PRIMARY |   1 | id   | A   |  500000 |  NULL | NULL |  | BTREE  |   | 
| toss |   1 | foo  |   1 | result  | A   |   2 |  NULL | NULL |  | BTREE  |   | 
+-------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+

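The PRIMARY index is a good index: there are 500,000 rows and 500,000 distinct values. Arranged in a BTREE, it lets you quickly identify a single row by its id.

The foo index is a bad index: there are 500,000 rows but only 2 possible values. This is pretty much the worst case for a BTREE: you pay all the overhead of searching the index, and you still have to search through the results.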
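A rough way to judge a column before indexing it is to count how many rows share each distinct value. The query below is a sketch of that check for the toss table; the alias names are made up for illustration.

```sql
-- Average number of rows per distinct value of result.
-- A value close to 1 means the column is highly selective (like id);
-- a very large value means each key matches a big slice of the table.
SELECT COUNT(*)                          AS total_rows,
       COUNT(DISTINCT result)            AS distinct_values,
       COUNT(*) / COUNT(DISTINCT result) AS avg_rows_per_value
FROM toss;
```

For the toss table, with 500,000 rows and 2 values, this comes to about 250,000 rows per value, which matches the Cardinality of 2 shown by SHOW INDEX above.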

2010-07-31 09:33:52

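+1 for the detailed explanation with an example! – Incognito 2012-06-13 05:06:22

So we should only use an index if the values are highly varied (like a primary key, with one value per row). What would you suggest the OP do in his case? Simply drop the index, fin? – Blauhirn 2016-08-04 17:59:28

@Blauhirn If the index hurts rather than helps, then yes, drop the index. Beyond that, you may need to restructure (denormalize) the data so that the interesting rows are easier to select, but that really depends on details that are not in the question. – 2016-08-05 19:49:23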
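As an illustration of the two options mentioned in this comment, here is a sketch using the toss example. The side table toss_tails and its columns are hypothetical names, not something from the question. The sketch assumes the side table is rebuilt or kept in sync whenever toss changes, and that seq ends up contiguous from 1 to the row count.

```sql
-- Option 1: drop the index that is not helping.
DROP INDEX foo ON toss;

-- Option 2 (hypothetical denormalization): keep the interesting rows
-- in a small side table numbered by seq.
CREATE TABLE toss_tails (
    seq int NOT NULL AUTO_INCREMENT,
    toss_id int NOT NULL,
    PRIMARY KEY (seq)
);

INSERT INTO toss_tails (toss_id)
SELECT id FROM toss WHERE result != 1 ORDER BY id;

-- The application picks a random seq between 1 and the row count of
-- toss_tails, then fetches the row through two primary key lookups.
SELECT t.*
FROM toss_tails AS tt
JOIN toss AS t ON t.id = tt.toss_id
WHERE tt.seq = 123457;
```

The lookup by seq replaces the LIMIT offset scan, at the cost of maintaining a second table.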

Without an ORDER BY clause, the LIMIT starts at some indeterminate place.

And according to your EXPLAIN output, the index isn't even being used.
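To make the offset refer to a well-defined row, the query needs an explicit ordering. A minimal sketch on the toss example table follows; ordering by id is an assumption about which order is wanted.

```sql
-- With ORDER BY, offset 123456 always refers to the same row
-- for the same data, regardless of the access path chosen.
SELECT * FROM toss WHERE result != 1 ORDER BY id LIMIT 123456, 1;

-- Check which index, if any, the optimizer picks for this form.
EXPLAIN SELECT * FROM toss WHERE result != 1 ORDER BY id LIMIT 123456, 1;
```

Whether this form is faster or slower than the unordered one is something only EXPLAIN and a timing run on the real data can show.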

2010-07-30 22:43:38 tpdi

...output, so if I add 'ORDER BY id', will the index be used? – Qiao 2010-07-30 22:46:27
