0
我有一个关于要执行的查询的问题,但我不知道什么是最好的性能。我需要将所有单词排除在与表单词过滤器关联的单词之外。查询性能 - '左连接为空'vs'不存在选择'
查询的输出是正确的,但也许有更好的解决方案。我几乎没有关于查询计划的知识,现在我正试图理解它。
SELECT CONCAT(SPACE(1), UCASE(stocknews.word.word), SPACE(1)) AS word, stocknews.word.language
FROM stocknews.word
WHERE NOT EXISTS (SELECT word_id FROM stocknews.wordfilter WHERE stocknews.word.id = word_id)
AND user_id = 1
+----+--------------+------------+-------+---------------+---------+---------+-------+------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | extra |
+----+--------------+------------+-------+---------------+---------+---------+-------+------+-------------+
| 1 | PRIMARY | word | ref | user_id | user_id | 4 | const | 843 | Using where |
| 2 | MATERIALIZED | wordfilter | index | PRIMARY | PRIMARY | 756 | | 16 | Using index |
+----+--------------+------------+-------+---------------+---------+---------+-------+------+-------------+
反对
SELECT CONCAT(SPACE(1), UCASE(stocknews.word.word), SPACE(1)) AS word, stocknews.word.language
FROM stocknews.word
LEFT JOIN stocknews.wordfilter ON stocknews.word.id = stocknews.wordfilter.word_id
WHERE stocknews.wordfilter.word_id IS NULL AND user_id = 1
+----+-------------+------------+------+---------------+---------+---------+---------+------+--------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | extra |
+----+-------------+------------+------+---------------+---------+---------+---------+------+--------------------------------------+
| 1 | SIMPLE | word | ref | user_id | user_id | 4 | const | 843 | |
| 1 | SIMPLE | wordfilter | ref | PRIMARY | PRIMARY | 4 | word.id | 1 | Using where; Using index; Not exists |
+----+-------------+------------+------+---------------+---------+---------+---------+------+--------------------------------------+
欢迎任何帮助!解释会很好。
编辑:
对于查询1:
+----------------------------+-------+
| Variable_name | Value |
+----------------------------+-------+
| Handler_commit | 1 |
| Handler_delete | 0 |
| Handler_discover | 0 |
| Handler_external_lock | 0 |
| Handler_icp_attempts | 0 |
| Handler_icp_match | 0 |
| Handler_mrr_init | 0 |
| Handler_mrr_key_refills | 0 |
| Handler_mrr_rowid_refills | 0 |
| Handler_prepare | 0 |
| Handler_read_first | 1 |
| Handler_read_key | 1044 |
| Handler_read_last | 0 |
| Handler_read_next | 859 |
| Handler_read_prev | 0 |
| Handler_read_rnd | 0 |
| Handler_read_rnd_deleted | 0 |
| Handler_read_rnd_next | 0 |
| Handler_rollback | 0 |
| Handler_savepoint | 0 |
| Handler_savepoint_rollback | 0 |
| Handler_tmp_update | 0 |
| Handler_tmp_write | 215 |
| Handler_update | 0 |
| Handler_write | 0 |
+----------------------------+-------+
25 rows in set (0.00 sec)
对于查询2:
+----------------------------+-------+
| Variable_name | Value |
+----------------------------+-------+
| Handler_commit | 1 |
| Handler_delete | 0 |
| Handler_discover | 0 |
| Handler_external_lock | 0 |
| Handler_icp_attempts | 0 |
| Handler_icp_match | 0 |
| Handler_mrr_init | 0 |
| Handler_mrr_key_refills | 0 |
| Handler_mrr_rowid_refills | 0 |
| Handler_prepare | 0 |
| Handler_read_first | 0 |
| Handler_read_key | 844 |
| Handler_read_last | 0 |
| Handler_read_next | 843 |
| Handler_read_prev | 0 |
| Handler_read_rnd | 0 |
| Handler_read_rnd_deleted | 0 |
| Handler_read_rnd_next | 0 |
| Handler_rollback | 0 |
| Handler_savepoint | 0 |
| Handler_savepoint_rollback | 0 |
| Handler_tmp_update | 0 |
| Handler_tmp_write | 0 |
| Handler_update | 0 |
| Handler_write | 0 |
+----------------------------+-------+
您是否尝试过并且比较了性能?一般来说(至少在SQL Server中,不确定MariaDB),如果你的表正确索引,'EXISTS'和'NOT EXISTS'执行得更快,因为它们是短路操作。一旦找到记录,就会根据您的查询将其排除或包含在内。 'LEFT JOIN'包括所有的记录,并且以'IS NULL'标准结尾进行过滤。 –
我在一个小数据集上尝试了这两种方法,两者都有相同的结果。感谢您提供的信息,我几乎可以确定我的索引是否正确设置。我想我需要制作一个更大的数据集来看看有什么不同。我的主要问题是听到查询使其更快或更慢。所以我猜'NOT EXISTS'方法更快。我将用更大的数据集对其进行测试,并报告实际查询的速度。谢谢 –
当_word_中有13.000条记录和_wordfilter_中有5000条记录时,'NOT EXISTS'方法似乎更快了@ –