2014-09-06 52 views
0

Why does adding a window function make this query so slow?

Query A executes in microseconds:

SELECT t1.id 
FROM (SELECT t0.id AS id FROM t0) AS t1 
WHERE NOT (EXISTS (SELECT 1 
     FROM t2 
     WHERE t2.ph_id = t1.id AND t2.me_id = 1 AND t2.rt_id = 4)) 
LIMIT 20 OFFSET 0 

But query B takes about 25 seconds:

SELECT t1.id, count(*) OVER() AS count 
FROM (SELECT t0.id AS id FROM t0) AS t1 
WHERE NOT (EXISTS (SELECT 1 
     FROM t2 
     WHERE t2.ph_id = t1.id AND t2.me_id = 1 AND t2.rt_id = 4)) 
LIMIT 20 OFFSET 0 

(The only difference is one extra item in the SELECT clause: a window aggregate.)

The EXPLAIN output is as follows. For A:

Limit (cost=0.00..1.20 rows=20 width=4) 
    -> Nested Loop Anti Join (cost=0.00..3449.22 rows=57287 width=4) 
     Join Filter: (t2.ph_id = t0.id) 
     -> Seq Scan on t0 (cost=0.00..1323.88 rows=57288 width=4) 
     -> Materialize (cost=0.00..1266.02 rows=1 width=4) 
       -> Seq Scan on t2 (cost=0.00..1266.01 rows=1 width=4) 
        Filter: ((me_id = 1) AND (rt_id = 4)) 

And for B:

Limit (cost=0.00..1.45 rows=20 width=4) 
    -> WindowAgg (cost=0.00..4165.31 rows=57287 width=4) 
     -> Nested Loop Anti Join (cost=0.00..3449.22 rows=57287 width=4) 
       Join Filter: (t2.ph_id = t0.id) 
       -> Seq Scan on t0 (cost=0.00..1323.88 rows=57288 width=4) 
       -> Materialize (cost=0.00..1266.02 rows=1 width=4) 
        -> Seq Scan on t2 (cost=0.00..1266.01 rows=1 width=4) 
          Filter: ((me_id = 1) AND (rt_id = 4)) 

I added the window aggregate to get the total number of rows before the LIMIT is applied, in order to build a pagination UI.

Answers

3

The original query can be written as:

SELECT t0.id 
FROM t0 
WHERE NOT EXISTS (SELECT 1 
        FROM t2 
        WHERE t2.ph_id = t0.id AND t2.me_id = 1 AND t2.rt_id = 4 
       ) 
LIMIT 20 OFFSET 0; 

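You have no ORDER BY, so the query can start returning rows as soon as it finds them, and it can stop after the first 20. When you add the window function: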

SELECT t0.id, count(*) OVER () 

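Now it is counting the rows in the result set, so it has to generate the entire result set. Instead of fetching just the first 20 rows, the query must produce all of them. That takes more time.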
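To see this in the executor rather than in the planner's estimates, both queries can be run under EXPLAIN ANALYZE, which executes the statement and reports actual row counts and timings for each plan node. This is only a sketch of what to look for: the question shows plain EXPLAIN output, so no actual numbers are available here, and the simplified form of the queries (selecting from t0 directly) is an assumption carried over from the rewrite above.

```sql
-- Query A: the Limit node stops pulling rows once it has 20,
-- so the anti join beneath it should report a small actual row count.
EXPLAIN ANALYZE
SELECT t0.id
FROM t0
WHERE NOT EXISTS (SELECT 1
                  FROM t2
                  WHERE t2.ph_id = t0.id AND t2.me_id = 1 AND t2.rt_id = 4)
LIMIT 20 OFFSET 0;

-- Query B: WindowAgg must read its whole input before it can emit the
-- first row, so the anti join beneath it should report the full result set.
EXPLAIN ANALYZE
SELECT t0.id, count(*) OVER () AS count
FROM t0
WHERE NOT EXISTS (SELECT 1
                  FROM t2
                  WHERE t2.ph_id = t0.id AND t2.me_id = 1 AND t2.rt_id = 4)
LIMIT 20 OFFSET 0;
```

Comparing the "actual ... rows" figures on the Nested Loop Anti Join node in the two outputs shows how much extra work the window aggregate forces.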

2

You can check how long the count(*) takes on its own, and what its execution plan looks like:

SELECT count(*) 
FROM (SELECT t0.id AS id FROM t0) AS t1 
WHERE NOT (EXISTS (SELECT 1 
     FROM t2 
     WHERE t2.ph_id = t1.id AND t2.me_id = 1 AND t2.rt_id = 4)) 

That may give you an idea of why it takes longer.

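Basically, the first query only has to read rows from t0 until it has found the first 20 that match the condition, whereas the second query has to generate the complete set of matching rows in order to count them.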

0

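Thanks for the other answers. They are correct that the count has to do more work, but I found the solution from another source: the statistics were not up to date.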

After running the command...

ANALYZE; 

... PostgreSQL was able to choose a more suitable query plan, and now both queries run very fast.
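A bare ANALYZE refreshes statistics for every table in the current database. As a minimal sketch, assuming only t0 and t2 matter for this query, the refresh can be limited to those tables, and the standard pg_stat_user_tables view shows when statistics were last gathered. Both plans in the question estimate rows=1 for the filtered scan on t2; if that estimate was far too low because of the stale statistics, it would explain why the nested loop looked cheap to the planner, though the question does not confirm this.

```sql
-- Refresh planner statistics for only the tables used in the query.
ANALYZE t0;
ANALYZE t2;

-- Check when statistics were last gathered, manually or by autovacuum.
SELECT relname, last_analyze, last_autoanalyze, n_live_tup
FROM pg_stat_user_tables
WHERE relname IN ('t0', 't2');
```

If both last_analyze and last_autoanalyze are NULL or very old for a table that changes often, stale statistics are a plausible cause of a poor plan.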
