Efficient PostgreSQL query on a 15M-row table holding users' inbox data, with a filter on a boolean
Here is the slow query in short:
SELECT *
FROM dialogs
WHERE user_id = 1234
AND deleted_at IS NULL
LIMIT 21
Full query (irrelevant fields removed):
SELECT "dialogs"."id", "dialogs"."subject", "dialogs"."product_id", "dialogs"."user_id", "dialogs"."participant_id", "dialogs"."thread_id", "dialogs"."last_message_id", "dialogs"."last_message_at", "dialogs"."read_at", "dialogs"."deleted_at", "products"."id", ... , T4."id", ... , "messages"."id", ...,
FROM "dialogs"
LEFT OUTER JOIN "products" ON ("dialogs"."product_id" = "products"."id")
INNER JOIN "auth_user" T4 ON ("dialogs"."participant_id" = T4."id")
LEFT OUTER JOIN "messages" ON ("dialogs"."last_message_id" = "messages"."id")
WHERE ("dialogs"."deleted_at" IS NULL AND "dialogs"."user_id" = 9069)
ORDER BY "dialogs"."last_message_id" DESC
LIMIT 21;
EXPLAIN ANALYZE output:
Limit  (cost=1.85..28061.24 rows=21 width=1693) (actual time=4.700..93087.871 rows=17 loops=1)
  ->  Nested Loop Left Join  (cost=1.85..9707215.30 rows=7265 width=1693) (actual time=4.699..93087.861 rows=17 loops=1)
        ->  Nested Loop  (cost=1.41..9647421.07 rows=7265 width=1457) (actual time=4.689..93062.481 rows=17 loops=1)
              ->  Nested Loop Left Join  (cost=0.99..9611285.66 rows=7265 width=1115) (actual time=4.676..93062.292 rows=17 loops=1)
                    ->  Index Scan Backward using dialogs_last_message_id on dialogs  (cost=0.56..9554417.92 rows=7265 width=102) (actual time=4.629..93062.050 rows=17 loops=1)
                          Filter: ((deleted_at IS NULL) AND (user_id = 9069))
                          Rows Removed by Filter: 6852907
                    ->  Index Scan using products_pkey on products  (cost=0.43..7.82 rows=1 width=1013) (actual time=0.012..0.012 rows=1 loops=17)
                          Index Cond: (dialogs.product_id = id)
              ->  Index Scan using auth_user_pkey on auth_user t4  (cost=0.42..4.96 rows=1 width=342) (actual time=0.009..0.010 rows=1 loops=17)
                    Index Cond: (id = dialogs.participant_id)
        ->  Index Scan using messages_pkey on messages  (cost=0.44..8.22 rows=1 width=236) (actual time=1.491..1.492 rows=1 loops=17)
              Index Cond: (dialogs.last_message_id = id)
Total runtime: 93091.494 ms
(14 rows)
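- OFFSET is not used.
- There is an index on the user_id field.
- The index on deleted_at is not used because the condition is not selective (90% of the values are actually NULL). A partial index (... WHERE deleted_at IS NULL) does not help either.
- The query gets especially slow when part of the result set was created a long time ago. It then has to scan and discard millions of rows in between (6,852,907 rows removed by the filter in the plan above).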
List of indexes:
Indexes:
"dialogs_pkey" PRIMARY KEY, btree (id)
"dialogs_deleted_at_d57b320e_uniq" btree (deleted_at) WHERE deleted_at IS NULL
"dialogs_last_message_id" btree (last_message_id)
"dialogs_participant_id" btree (participant_id)
"dialogs_product_id" btree (product_id)
"dialogs_thread_id" btree (thread_id)
"dialogs_user_id" btree (user_id)
Currently I am considering querying only recent data (i.e. ... WHERE last_message_at > <date 3-6 months ago>) with an appropriate index (BRIN).
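A rough sketch of that idea, under some assumptions: the index name is hypothetical, the 6-month window is an arbitrary pick from the 3-6 month range, and BRIN requires PostgreSQL 9.5 or later. A BRIN index only helps when last_message_at correlates with the physical order of rows in the table, which may not hold if dialogs are updated often.

```sql
-- Hypothetical index name; BRIN is available from PostgreSQL 9.5 on.
CREATE INDEX dialogs_last_message_at_brin
    ON dialogs USING brin (last_message_at);

-- The short query, restricted to recent rows.
-- The 6-month window is an arbitrary choice for illustration.
SELECT *
FROM dialogs
WHERE user_id = 1234
  AND deleted_at IS NULL
  AND last_message_at > now() - interval '6 months'
ORDER BY last_message_id DESC
LIMIT 21;
```

Note that this changes the result: dialogs whose last message is older than the window would no longer be returned.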
What is the best practice for speeding up queries like this?
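If you run EXPLAIN on a query that uses only 'WHERE deleted_at IS NULL', do you see the speed you expect? If so, I would suggest adding a single index that covers both the 'user_id' and 'deleted_at' columns. This is often necessary because two separate indexes cannot be combined the way you might imagine, whereas one index over multiple columns gives faster query times. –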
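A minimal sketch of the multi-column index suggested in this comment; the index name is hypothetical, and whether the planner actually prefers it over dialogs_last_message_id has to be checked with EXPLAIN ANALYZE on the real data.

```sql
-- Hypothetical index name; one B-tree covering both filter columns.
CREATE INDEX dialogs_user_id_deleted_at
    ON dialogs (user_id, deleted_at);

-- Check whether the planner picks it up for the short query.
EXPLAIN ANALYZE
SELECT *
FROM dialogs
WHERE user_id = 1234
  AND deleted_at IS NULL
LIMIT 21;
```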
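You say the index on deleted_at is not used. But your EXPLAIN shows an index scan, not a seq scan: it is a backward index scan on 'dialogs_last_message_id'. What is going on? Paste the full query plan. –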
Please post your index definitions. What do you mean by *a partial index does not help either*? An index on 'user_id' with 'WHERE deleted_at IS NULL' should help. – pozs
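The partial index described in this comment could look roughly like the sketch below. The index name is hypothetical, and adding last_message_id as a second column is an assumption that goes beyond the comment: it is meant to let the index return rows already sorted for the ORDER BY in the full query, so the planner has less reason to walk dialogs_last_message_id backward and filter.

```sql
-- Hypothetical index name. Only non-deleted rows are indexed.
-- The second column (an assumption, not part of the comment) matches
-- ORDER BY "dialogs"."last_message_id" DESC in the full query.
CREATE INDEX dialogs_user_id_live
    ON dialogs (user_id, last_message_id DESC)
    WHERE deleted_at IS NULL;

-- Verify against the slow case from the question.
EXPLAIN ANALYZE
SELECT *
FROM dialogs
WHERE user_id = 9069
  AND deleted_at IS NULL
ORDER BY last_message_id DESC
LIMIT 21;
```

The query must repeat the deleted_at IS NULL condition for the planner to consider the partial index.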