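MySQL: speeding up a LEFT OUTER JOIN / IS NULL check query
The goal of my query is to get all rows from table a where gender = f and the username does not exist in table b with campid = xxxx. Here is the query I am using successfully: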
SELECT `id`
FROM pool
LEFT JOIN sent
ON pool.username = sent.username
AND sent.campid = 'YA1LGfh9'
WHERE sent.username IS NULL
AND pool.gender = 'f'
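The problem is that the query takes about 9 minutes to complete. The pool table contains over 10 million rows, and the sent table will eventually grow even larger than that. I have created indexes on many columns, including username and gender. However, MySQL refuses to use any of my indexes for this query. I even tried FORCE INDEX. Here are my indexes on pool and the EXPLAIN output for my query: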
+-------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+
| Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment |
+-------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+
| pool | 0 | PRIMARY | 1 | id | A | 9326880 | NULL | NULL | | BTREE | |
| pool | 1 | username | 1 | username | A | 9326880 | NULL | NULL | | BTREE | |
| pool | 1 | source | 1 | source | A | 6 | NULL | NULL | | BTREE | |
| pool | 1 | gender | 1 | gender | A | 9 | NULL | NULL | | BTREE | |
| pool | 1 | location | 1 | location | A | 59030 | NULL | NULL | | BTREE | |
+-------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+
6 rows in set (0.00 sec)
mysql> explain SELECT `id` FROM pool FORCE INDEX (username) LEFT JOIN sent ON pool.username = sent.username AND sent.campid = 'YA1LGfh9' WHERE sent.username IS NULL AND pool.gender = 'f';
+----+-------------+-------+------+---------------+------+---------+------+---------+-------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+------+---------------+------+---------+------+---------+-------------------------+
| 1 | SIMPLE | pool | ALL | NULL | NULL | NULL | NULL | 9326881 | Using where |
| 1 | SIMPLE | sent | ALL | NULL | NULL | NULL | NULL | 351 | Using where; Not exists |
+----+-------------+-------+------+---------------+------+---------+------+---------+-------------------------+
2 rows in set (0.00 sec)
Also, here are the indexes on my sent table:
+-------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+
| Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment |
+-------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+
| sent | 0 | PRIMARY | 1 | primary_key | A | 351 | NULL | NULL | | BTREE | |
| sent | 1 | username | 1 | username | A | 351 | NULL | NULL | | BTREE | |
+-------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+
2 rows in set (0.00 sec)
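As you can see, no indexes are being used, so my query takes a very long time. If anyone has a solution that involves rewriting the query, please give me an example that uses my data structure, so that there is no confusion about how to implement and test it. Thanks.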
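For reference, the same anti-join can also be written with NOT EXISTS against the pool and sent tables shown above. This is only a sketch of an equivalent formulation: the aliases p and s are introduced here for illustration, and whether it runs any faster than the LEFT JOIN version depends on the MySQL version and on which indexes the optimizer can actually use.

```sql
-- Same result as the LEFT JOIN ... IS NULL query:
-- female users in pool with no row in sent for this campaign.
SELECT p.id
FROM pool AS p
WHERE p.gender = 'f'
  AND NOT EXISTS (
        SELECT 1
        FROM sent AS s
        WHERE s.username = p.username
          AND s.campid = 'YA1LGfh9'
      );
```

Prefixing the statement with EXPLAIN shows whether the optimizer picks an index for either table.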
I would prefer `(gender, username, id)` –
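@ypercube, good point... keeping username in the second position should keep that index from bouncing around for the join to the sent table, and it will also be in the proper order. I'll change it. – DRapp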
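The composite index discussed in these comments could be created roughly as follows. This is a sketch under assumptions: the index names are made up for illustration, and the second index on sent is not something the thread states. It is added here only because the query filters sent by campid and joins on username.

```sql
-- Composite index on pool in the column order suggested above:
-- gender for the WHERE filter, username for the join, id for the SELECT list.
ALTER TABLE pool
  ADD INDEX idx_gender_username_id (gender, username, id);

-- Assumption, not from the thread: a matching index on sent so the
-- campid filter and the username lookup can both be served by one index.
ALTER TABLE sent
  ADD INDEX idx_campid_username (campid, username);

-- Re-check the plan afterwards; the key column should no longer be NULL
-- if the optimizer decides the new indexes are usable.
EXPLAIN
SELECT pool.id
FROM pool
LEFT JOIN sent
  ON pool.username = sent.username
 AND sent.campid = 'YA1LGfh9'
WHERE sent.username IS NULL
  AND pool.gender = 'f';
```

If EXPLAIN still shows NULL under possible_keys for sent, it may be worth comparing the type, character set, and collation of the two username columns with SHOW CREATE TABLE, since a mismatch there can prevent index use on the join.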
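OK. I have set everything up according to your specifications (I think), but I still have performance problems. In fact, the query now takes longer than my original query did. Here is what I did: http://pastebin.com/BhyPPVqa The query took almost 13 minutes to complete. Maybe I am doing something wrong? – xendi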