如果执行以下数据库(postgres)查询,第二个调用要快得多。提高第一个查询的性能
我想第一个查询是慢的,因为操作系统(Linux)需要从磁盘获取数据。第二个查询受益于文件系统级别和postgres中的缓存。
是否有一种方法来优化数据库以快速获取第一个调用的结果?
首先呼叫(慢)
[email protected]:~$ psql
foo3_bar_p=# explain analyze SELECT "foo3_beleg"."id", ... FROM "foo3_beleg" WHERE
foo3_bar_p-# (("foo3_beleg"."id" IN (SELECT beleg_id FROM foo3_text where
foo3_bar_p(# content @@ 'footown'::tsquery)) AND "foo3_beleg"."belegart_id" IN
foo3_bar_p(# ('...', ...));
QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Nested Loop (cost=75314.58..121963.20 rows=152 width=135) (actual time=27253.451..88462.165 rows=11 loops=1)
-> HashAggregate (cost=75314.58..75366.87 rows=5229 width=4) (actual time=16087.345..16113.988 rows=17671 loops=1)
-> Bitmap Heap Scan on foo3_text (cost=273.72..75254.67 rows=23964 width=4) (actual time=327.653..16026.787 rows=27405 loops=1)
Recheck Cond: (content @@ '''footown'''::tsquery)
-> Bitmap Index Scan on foo3_text_content_idx (cost=0.00..267.73 rows=23964 width=0) (actual time=281.909..281.909 rows=27405 loops=1)
Index Cond: (content @@ '''footown'''::tsquery)
-> Index Scan using foo3_beleg_pkey on foo3_beleg (cost=0.00..8.90 rows=1 width=135) (actual time=4.092..4.092 rows=0 loops=17671)
Index Cond: (id = foo3_text.beleg_id)
Filter: ((belegart_id)::text = ANY ('{...
Rows Removed by Filter: 1
Total runtime: 88462.809 ms
(11 rows)
第二呼叫(快)的foo3_text表的
Nested Loop (cost=75314.58..121963.20 rows=152 width=135) (actual time=127.569..348.705 rows=11 loops=1)
-> HashAggregate (cost=75314.58..75366.87 rows=5229 width=4) (actual time=114.390..133.131 rows=17671 loops=1)
-> Bitmap Heap Scan on foo3_text (cost=273.72..75254.67 rows=23964 width=4) (actual time=11.961..97.943 rows=27405 loops=1)
Recheck Cond: (content @@ '''footown'''::tsquery)
-> Bitmap Index Scan on foo3_text_content_idx (cost=0.00..267.73 rows=23964 width=0) (actual time=9.226..9.226 rows=27405 loops=1)
Index Cond: (content @@ '''footown'''::tsquery)
-> Index Scan using foo3_beleg_pkey on foo3_beleg (cost=0.00..8.90 rows=1 width=135) (actual time=0.012..0.012 rows=0 loops=17671)
Index Cond: (id = foo3_text.beleg_id)
Filter: ((belegart_id)::text = ANY ('...
Rows Removed by Filter: 1
Total runtime: 348.833 ms
(11 rows)
表布局(28M行)
foo3_egs_p=# \d foo3_text
Table "public.foo3_text"
Column | Type | Modifiers
----------+-----------------------+------------------------------------------------------------
id | integer | not null default nextval('foo3_text_id_seq'::regclass)
beleg_id | integer | not null
index_id | character varying(32) | not null
value | text | not null
content | tsvector |
Indexes:
"foo3_text_pkey" PRIMARY KEY, btree (id)
"foo3_text_index_id_2685e3637668d5e7_uniq" UNIQUE CONSTRAINT, btree (index_id, beleg_id)
"foo3_text_beleg_id" btree (beleg_id)
"foo3_text_content_idx" gin (content)
"foo3_text_index_id" btree (index_id)
"foo3_text_index_id_like" btree (index_id varchar_pattern_ops)
Foreign-key constraints:
"beleg_id_refs_id_6e6d40770e71292" FOREIGN KEY (beleg_id) REFERENCES foo3_beleg(id) DEFERRABLE INITIALLY DEFERRED
"index_id_refs_name_341600137465c2f9" FOREIGN KEY (index_id) REFERENCES foo3_index(name) DEFERRABLE INITIALLY DEFERRED
硬件变化(SSD我而不是传统磁盘)或RAM磁盘是可能的。但也许现在的硬件也可以做更快的结果。
版本:9.1.2的PostgreSQL在x86_64-未知Linux的GNU
请发表评论,如果你需要更多的细节。
如果WHERE子句是永远不变的,有关定期发出请求是什么?保持数据热点是我知道的唯一避免第一次尝试的灵丹妙药。根据优化,你是否检查AND的选择性“foo3_beleg”。“belegart_id”。将它移动到第一个SELECT会变得敏感吗? – SCO 2014-11-25 16:17:56
@SCO否,WHERE子句不同。搜索词(这里是'footown')不同。 – guettli 2014-11-25 20:16:07
的Postgres 9.4将有'pg_prewarm'扩展,它可以填补需求缓冲区高速缓存:http://www.postgresql.org/docs/9.4/static/pgprewarm.html – 2014-11-27 15:46:47