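How can I optimize this PostgreSQL query?
Below is a Postgres query that takes far longer than I would expect. The field_instances table is indexed on both form_instance_id and field_id, and the form_instances table is indexed on workflow_state, so I expected this to be a fast query, but it takes forever. Can anyone help me interpret the query plan and tell me which indexes to add to speed it up? Thanks.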
explain analyze
select form_id,form_instance_id,answer,field_id
from form_instances,field_instances
where workflow_state = 'DRqueued'
and form_instance_id = form_instances.id
and field_id = 'Book_EstimatedDueDate';
QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Hash Join  (cost=8733.85..95692.90 rows=9277 width=29) (actual time=2550.000..15430.000 rows=11431 loops=1)
   Hash Cond: (field_instances.form_instance_id = form_instances.id)
   ->  Bitmap Heap Scan on field_instances  (cost=2681.11..89071.72 rows=47567 width=25) (actual time=850.000..13690.000 rows=51726 loops=1)
         Recheck Cond: ((field_id)::text = 'Book_EstimatedDueDate'::text)
         ->  Bitmap Index Scan on index_field_instances_on_field_id  (cost=0.00..2669.22 rows=47567 width=0) (actual time=830.000..830.000 rows=51729 loops=1)
               Index Cond: ((field_id)::text = 'Book_EstimatedDueDate'::text)
   ->  Hash  (cost=5911.34..5911.34 rows=11312 width=8) (actual time=1590.000..1590.000 rows=11431 loops=1)
         ->  Bitmap Heap Scan on form_instances  (cost=511.94..5911.34 rows=11312 width=8) (actual time=720.000..1570.000 rows=11431 loops=1)
               Recheck Cond: ((workflow_state)::text = 'DRqueued'::text)
               ->  Bitmap Index Scan on index_form_instances_on_workflow_state  (cost=0.00..509.11 rows=11312 width=0) (actual time=650.000..650.000 rows=11509 loops=1)
                     Index Cond: ((workflow_state)::text = 'DRqueued'::text)
 Total runtime: 15430.000 ms
(12 rows)
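You could try something like 'SET enable_hashjoin = 0;' and see whether that gives you a faster plan. If it does, we can go on to investigate why that plan was not chosen in the first place. – sayap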
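A minimal sketch of the experiment suggested above, assuming it is run in a single psql session against the same database. The setting is changed for the session only and is meant as a diagnostic step, not as a permanent fix; the query is the one from the question, unchanged.

```sql
-- Discourage the planner from choosing hash joins, for this session only.
SET enable_hashjoin = off;

-- Re-run the original query to see which plan is chosen instead
-- and whether it is actually faster.
EXPLAIN ANALYZE
SELECT form_id, form_instance_id, answer, field_id
FROM form_instances, field_instances
WHERE workflow_state = 'DRqueued'
  AND form_instance_id = form_instances.id
  AND field_id = 'Book_EstimatedDueDate';

-- Restore the default planner behaviour afterwards.
RESET enable_hashjoin;
```

Comparing the "actual time" figures of the two plans shows whether the alternative join strategy helps or whether the time is spent elsewhere, for example in the heap scans.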
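A few things. What version of pg is this, and have you tried bumping up work_mem (to, say, 16MB or so)? Oh, and could we get the table schemas, or fully qualified column names? It is a bit confusing when I can't tell which columns come from which table. Also, have you tried using explicit join syntax? (i.e. from a join b (a.x = ) –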
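The checks asked for above could look roughly like this. This is a sketch under assumptions: the question does not say which table owns form_id and answer, so the qualification below (form_id on form_instances, answer on field_instances) is a guess based on the row widths in the plan, and the aliases fo and fi are introduced here purely for illustration.

```sql
-- Report the PostgreSQL server version.
SELECT version();

-- Raise the memory available to each sort/hash operation, for this session only.
SET work_mem = '16MB';

-- Same query with explicit JOIN syntax and qualified column names.
-- ASSUMPTION: form_id lives in form_instances and answer in field_instances;
-- adjust the qualifiers if the actual schema differs.
EXPLAIN ANALYZE
SELECT fo.form_id, fi.form_instance_id, fi.answer, fi.field_id
FROM form_instances AS fo
JOIN field_instances AS fi
  ON fi.form_instance_id = fo.id
WHERE fo.workflow_state = 'DRqueued'
  AND fi.field_id = 'Book_EstimatedDueDate';

-- Return work_mem to its configured default.
RESET work_mem;
```

For a plain inner join like this one, the explicit syntax mainly improves readability: it makes the join condition and the ownership of each column obvious to anyone reading the plan.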