Why is PostgreSQL not using the index properly?

The schema:

create table records(
    id   varchar, 
    updated_at bigint 
); 
create index index1 on records (updated_at, id);

The query. It iterates over the most recently updated records: fetch 10 records, remember the last one, then fetch the next 10, and so on.

select * from records 
where updated_at > '1' or (updated_at = '1' and id > 'some-id') 
order by updated_at, id 
limit 10;

It uses the index, but not wisely: it also applies a filter and processes a huge number of records. See `Rows Removed by Filter: 31575` in the query plan below.

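Strangely, if you remove the `or` and keep only the left or only the right condition, the index works well for either one. But PostgreSQL seems unable to figure out how to apply the index properly when both conditions are combined with `or`.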

Limit (cost=0.42..19.03 rows=20 width=1336) (actual time=542.475..542.501 rows=20 loops=1) 
    -> Index Scan using index1 on records (cost=0.42..426791.29 rows=458760 width=1336) (actual time=542.473..542.494 rows=20 loops=1) 
     Filter: ((updated_at > '1'::bigint) OR ((updated_at = '1'::bigint) AND ((id)::text > 'some-id'::text))) 
     Rows Removed by Filter: 31575 
Planning time: 0.180 ms 
Execution time: 542.532 ms 
(6 rows)

The PostgreSQL version is 9.6.
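
For reference, a plan like the one above is the kind of output `EXPLAIN ANALYZE` prints. A minimal sketch, assuming the query is run against the `records` table from the schema with the same placeholder values:

```sql
-- EXPLAIN ANALYZE executes the query and reports actual timings and row counts,
-- including "Rows Removed by Filter" for rows that were fetched but then
-- rejected by the filter condition.
explain analyze
select * from records
where updated_at > 1 or (updated_at = 1 and id > 'some-id')
order by updated_at, id
limit 10;
```

A condition listed under `Filter:` rather than `Index Cond:` is checked row by row after the index entry has been read, which is why so many rows are discarded here.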


2017-09-24 Alexey Petrushin

`... where updated_at > '1' ...` You should not quote integer constants. – wildplasser

@wildplasser I tried it without the quotes; same result. –

`width=1336` That is a *very* wide table. – wildplasser

I would try this as two separate queries, with their results combined like this:

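select * 
from 
    (
    -- rows with a strictly later timestamp
    (select * 
     from   records 
     where  updated_at > 1 
     order by updated_at, id 
     limit 10) 
    union all 
    -- rows with the same timestamp but a later id
    -- (each branch is parenthesized so it can carry its own order by / limit)
    (select * 
     from   records 
     where  updated_at = 1 
       and  id > 'some-id' 
     order by updated_at, id 
     limit 10) 
) t 
order by updated_at, id 
limit 10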

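My guess is that each of the two queries will be optimized very well on its own, and that running both will be more efficient than the current query.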

If possible, I would also declare these columns NOT NULL.
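
Another way to write the same condition is a row-wise comparison, which PostgreSQL supports. This is only a sketch and not part of the suggestion above: it assumes the `records` table and `index1` from the question, non-NULL values in both columns, and the placeholder values `1` and `'some-id'`. Whether the planner turns it into a single index condition should be checked with `explain analyze` on the real data.

```sql
-- Row-wise comparison: (a, b) > (x, y) is true when a > x,
-- or when a = x and b > y. For non-NULL values this is the same
-- predicate as the original OR condition.
select *
from records
where (updated_at, id) > (1, 'some-id')
order by updated_at, id
limit 10;
```

The column order in the row constructor matches the column order of `index1` and of the `order by`, which is what allows the comparison to be treated as one range over the index.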


2017-09-24 11:03:34

Yes, I thought about that too. But I assumed PostgreSQL was smart enough, and that maybe there was a mistake in my code... –

Yes, it solved the problem, thanks. Strange... I expected better from PostgreSQL... –

PostgreSQL has specific rules for how a multicolumn index can be used. From the documentation:

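For example, given an index on (a, b, c) and a query condition WHERE a = 5 AND b >= 42 AND c < 77, the index would have to be scanned from the first entry with a = 5 and b = 42 up through the last entry with a = 5. Index entries with c >= 77 would be skipped, but they would still have to be scanned through. This index could in principle be used for queries that have constraints on b and/or c with no constraint on a, but the entire index would have to be scanned, so in most cases the planner would prefer a sequential table scan over using the index.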

https://www.postgresql.org/docs/9.6/static/indexes-multicolumn.html
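
The passage above can be tried out with a small sketch. The table `t` and the index `t_a_b_c` are hypothetical names introduced only for illustration; the plans actually chosen depend on the data and statistics, so no output is shown.

```sql
-- Hypothetical table and index, named only for illustration.
create table t (a int, b int, c int);
create index t_a_b_c on t (a, b, c);

-- a = 5 and b >= 42 bound the portion of the index that is scanned;
-- c < 77 is checked against each entry in that range but does not narrow it.
explain
select * from t
where a = 5 and b >= 42 and c < 77;

-- No constraint on the leading column a: the whole index would have to be
-- scanned, so the planner will often choose a sequential scan instead.
explain
select * from t
where b >= 42 and c < 77;
```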


2017-09-24 11:04:18 bilelovitch
