2012-06-12 14 views
1

How to make joins on well-indexed MySQL tables efficient

This is the first table, tbl1:

+---------+---------------------+------+-----+---------+----------------+ 
| Field | Type    | Null | Key | Default | Extra   | 
+---------+---------------------+------+-----+---------+----------------+ 
| val  | varchar(45)   | YES | MUL | NULL |    | 
| id  | bigint(20) unsigned | NO | PRI | NULL | auto_increment | 
+---------+---------------------+------+-----+---------+----------------+ 

With its indexes:

+-------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+ 
| Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | 
+-------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+ 
| tbl1 |   0 | PRIMARY |   1 | id   | A   | 201826018 |  NULL | NULL |  | BTREE  |   | 
| tbl1 |   1 | val  |   1 | val   | A   |  2147085 |  NULL | NULL | YES | BTREE  |   | 
| tbl1 |   1 | id_val |   1 | id   | A   | 201826018 |  NULL | NULL |  | BTREE  |   | 
| tbl1 |   1 | id_val |   2 | val   | A   | 201826018 |  NULL | NULL | YES | BTREE  |   | 
| tbl1 |   1 | val_id |   1 | val   | A   |  2147085 |  NULL | NULL | YES | BTREE  |   | 
| tbl1 |   1 | val_id |   2 | id   | A   | 201826018 |  NULL | NULL |  | BTREE  |   | 
+-------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+ 

(The reason for some of the extra indexes is explained here: http://bit.ly/KWx1Xz)

The second table is roughly the same. Here are its indexes; note that the cardinality of val differs:

+--------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+ 
| Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | 
+--------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+ 
| tbl2 |   0 | PRIMARY |   1 | id   | A   | 201826018 |  NULL | NULL |  | BTREE  |   | 
| tbl2 |   1 | val  |   1 | val   | A   |  881336 |  NULL | NULL | YES | BTREE  |   | 
| tbl2 |   1 | id_val |   1 | id   | A   | 201826018 |  NULL | NULL |  | BTREE  |   | 
| tbl2 |   1 | id_val |   2 | val   | A   | 201826018 |  NULL | NULL | YES | BTREE  |   | 
| tbl2 |   1 | val_id |   1 | val   | A   |  881336 |  NULL | NULL | YES | BTREE  |   | 
| tbl2 |   1 | val_id |   2 | id   | A   | 201826018 |  NULL | NULL |  | BTREE  |   | 
+--------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+ 

The task is to inner join the two tables on their val column and get the list of ids (and to do that in under 1 second).

Here is the 'join' approach:

SELECT tbl1.id FROM tbl1 JOIN tbl2 ON tbl1.val = 'iii' AND tbl2.val = 'iii' AND tbl1.id = tbl2.id; 

Result: 10831 rows in set (55.15 sec)

EXPLAIN output for the query:

+----+-------------+--------+--------+----------------------------------+---------+---------+---------------------------+------+--------------------------+ 
| id | select_type | table | type | possible_keys     | key  | key_len | ref      | rows | Extra     | 
+----+-------------+--------+--------+----------------------------------+---------+---------+---------------------------+------+--------------------------+ 
| 1 | SIMPLE  | tbl1 | ref | PRIMARY,val,id_val,val_id  | val_id | 138  | const      | 5160 | Using where; Using index | 
| 1 | SIMPLE  | tbl2 | eq_ref | PRIMARY,val,id_val,val_id  | PRIMARY | 8  | search_test.tbl1.id  | 1 | Using where    | 
+----+-------------+--------+--------+----------------------------------+---------+---------+---------------------------+------+--------------------------+ 

Here is the 'in' approach:

SELECT id FROM tbl1 WHERE val = 'iii' and id IN (SELECT id FROM tbl2 WHERE val = 'iii'); 

Result: 10831 rows in set (1 min 10.15 sec)

EXPLAIN output:

+----+--------------------+--------+-----------------+---------------------------------+---------+---------+-------+------+--------------------------+ 
| id | select_type  | table | type   | possible_keys     | key  | key_len | ref | rows | Extra     | 
+----+--------------------+--------+-----------------+---------------------------------+---------+---------+-------+------+--------------------------+ 
| 1 | PRIMARY   | tbl1 | ref    | val,val_id      | val_id | 138  | const | 8553 | Using where; Using index | 
| 2 | DEPENDENT SUBQUERY | tbl2 | unique_subquery | PRIMARY,val,id_val,val_id  | PRIMARY | 8  | func | 1 | Using where    | 
+----+--------------------+--------+-----------------+---------------------------------+---------+---------+-------+------+--------------------------+ 

So here is the question: how can this query be tuned so that MySQL completes it within a second?


What a work of art! – Har

Answers

2

OK, I have tested this with 30,000+ records in each table, and it runs very fast.

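As it stands, you are performing the join across two huge tables at once. If you first scan each table for matches on 'val', the set of rows to be joined becomes much smaller.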

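I originally posted this answer as a set of subqueries, but I did not realize that MySQL is slow with nested subqueries because it executes them from the outside in. If you define the subqueries as views, however, it executes them from the inside out.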

So, first create the views.

CREATE VIEW tbl1_iii AS (
SELECT * FROM tbl1 WHERE val='iii' 
); 
CREATE VIEW tbl2_iii AS (
SELECT * FROM tbl2 WHERE val='iii' 
); 

Then run the query.

SELECT tbl1_iii.id from tbl1_iii,tbl2_iii 
WHERE tbl1_iii.id = tbl2_iii.id; 

Lightning fast.
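Whether MySQL really evaluates each view first depends on the view's algorithm: a view created without an explicit algorithm may be merged into the outer query instead of being materialized. The following is a minimal sketch, assuming the goal is to filter each table on val before joining. The `ALGORITHM = TEMPTABLE` clause and the narrowed column list are assumptions added here, not part of the original answer, and `EXPLAIN` is included only to check which plan the server actually picks.

```sql
-- Assumption: ask MySQL to materialize each view into a temporary table
-- rather than merging its definition into the outer query.
CREATE OR REPLACE ALGORITHM = TEMPTABLE VIEW tbl1_iii AS
  SELECT id FROM tbl1 WHERE val = 'iii';

CREATE OR REPLACE ALGORITHM = TEMPTABLE VIEW tbl2_iii AS
  SELECT id FROM tbl2 WHERE val = 'iii';

-- Inspect the plan before trusting any timing.
EXPLAIN
SELECT tbl1_iii.id
FROM tbl1_iii
JOIN tbl2_iii ON tbl1_iii.id = tbl2_iii.id;
```

Selecting only id keeps the temporary tables small. Materialized views may not carry indexes on older server versions, so the join between them is worth checking in the plan rather than assuming it is fast.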


Nested queries are probably a bad approach. –


Oh, because it's MySQL? – matchdav


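Hmm. I just read up on this, and it seems there are two ways to force MySQL to execute the inner query first: 1. store the subquery as a view, or 2. give the whole subquery an alias (using AS). – matchdav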
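The second option in the comment above, aliasing the subquery, could look roughly like this. It is a sketch under the assumption that each aliased subquery (a derived table) is evaluated before the join; the aliases `t1` and `t2` are made up for illustration, and newer MySQL versions may merge derived tables into the outer query, so the plan should be confirmed with `EXPLAIN`.

```sql
-- Each aliased subquery filters its table on val first;
-- only the surviving ids take part in the join.
SELECT t1.id
FROM (SELECT id FROM tbl1 WHERE val = 'iii') AS t1
JOIN (SELECT id FROM tbl2 WHERE val = 'iii') AS t2
  ON t1.id = t2.id;
```

Unlike the view approach, this leaves no extra objects behind in the schema, and the literal 'iii' can be changed per query.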

2
SELECT tbl1.id FROM tbl1 JOIN tbl2 ON tbl1.id = tbl2.id and tbl1.val = tbl2.val 
where tbl1.val = 'iii'; 

Thanks, I will test it soon! But can you explain how it differs from my join? –


This is good, but mine is faster :) – matchdav


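SHL, you are spending time on 'iii' twice, which means you are scanning the table over and over (find a row in tbl1 and tbl2 where tbl1.id = tbl2.id, now check tbl1.val = 'iii', now check tbl2.val = 'iii'). Srini's solution works (and so does mine) because MySQL processes the join from the outside in. First it finds all matches for 'iii' in tbl1, which greatly reduces the number of records to sort through, and then it compares the PKs and looks for a match on 'val'. That is a table scan on tbl1, followed by a scan of tbl1 (the reduced set) and of tbl2 to match id and val: 3 scans in total. – matchdav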
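The EXPLAIN output in the question shows tbl2 being read through PRIMARY, with one lookup per matching tbl1 row. One thing that may be worth trying, as an assumption that nobody in this thread proposed, is to hint the composite val_id index on tbl2 so that both val and id can be matched from the index alone. The sketch below only asks for the plan; whether it is actually faster has to be measured.

```sql
-- Assumption: the val_id (val, id) index on tbl2 covers both join columns,
-- so the hint may let MySQL avoid fetching full rows through PRIMARY.
EXPLAIN
SELECT tbl1.id
FROM tbl1
JOIN tbl2 FORCE INDEX (val_id)
  ON tbl2.val = tbl1.val
 AND tbl2.id = tbl1.id
WHERE tbl1.val = 'iii';
```

If the Extra column reports "Using index" for both tables, the query is being answered from the indexes without touching the table rows.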