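PostgreSQL query plan differences
I am trying to debug a query that runs slowly in production but quickly on my development machine. My dev box has a snapshot of the prod database that is only a few days old, so the contents of the two databases are roughly the same.
The query is: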
select count(*) from big_table where search_column in ('something')
Notes:
- big_table is a "snapshot materialized view" with about 35M rows, refreshed daily
- search_column has a B-tree index
- prod is 9.1 on Ubuntu
- dev is 9.0 on OS X
Query plans
The explain analyze results:
PROD:
QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Aggregate (cost=1119843.20..1119843.21 rows=1 width=0) (actual time=467388.276..467388.278 rows=1 loops=1)
-> Bitmap Heap Scan on big_table (cost=10432.55..1118804.45 rows=415497 width=0) (actual time=116891.126..466949.331 rows=210053 loops=1)
Recheck Cond: ((search_column)::text = 'something'::text)
-> Bitmap Index Scan on big_table_search_column_index (cost=0.00..10328.68 rows=415497 width=0) (actual time=8467.901..8467.901 rows=337164 loops=1)
Index Cond: ((search_column)::text = 'something'::text)
Total runtime: 467389.534 ms
(6 rows)
dev:
QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Aggregate (cost=524011.38..524011.39 rows=1 width=0) (actual time=209.852..209.852 rows=1 loops=1)
-> Bitmap Heap Scan on big_table (cost=5131.43..523531.22 rows=192064 width=0) (actual time=33.792..194.730 rows=209551 loops=1)
Recheck Cond: ((search_column)::text = 'something'::text)
-> Bitmap Index Scan on big_table_search_column_index (cost=0.00..5083.42 rows=192064 width=0) (actual time=27.568..27.568 rows=209551 loops=1)
Index Cond: ((search_column)::text = 'something'::text)
Total runtime: 209.938 ms
(6 rows)
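The query actually returns 210053 rows on prod and 209551 rows on dev.
The two plans have the same structure, and the table has roughly the same number of rows in each database, so what could explain the difference in cost shown above?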
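One way to narrow this down, offered as a sketch rather than a diagnosis: the planner's cost estimate for a bitmap heap scan depends largely on the page and row counts it has for the table, so comparing those on both servers may show where the roughly 2x cost gap comes from. `EXPLAIN (ANALYZE, BUFFERS)`, available from 9.0 onward, also reports how many pages were found in shared buffers and how many had to be read in, which may help with the gap in actual time.

```sql
-- Planner statistics for the table and its index; run on both prod and dev.
SELECT relname, relpages, reltuples
FROM pg_class
WHERE relname IN ('big_table', 'big_table_search_column_index');

-- The same count query, with buffer usage added to the plan output (9.0+).
EXPLAIN (ANALYZE, BUFFERS)
SELECT count(*) FROM big_table WHERE search_column IN ('something');
```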
Bloat
At @bma's suggestion, here are the results of the "bloat" query on prod and dev for the relevant table/index:
prod:
current_database | schemaname | tablename | tbloat | wastedbytes | iname | ibloat | wastedibytes
------------------+------------+---------------------------------+--------+-------------+---------------------------------------------------------------+--------+--------------
my_db | public | big_table | 1.6 | 7965433856 | big_table_search_column_index | 0.1 | 0
dev:
current_database | schemaname | tablename | tbloat | wastedbytes | iname | ibloat | wastedibytes
------------------+------------+---------------------------------+--------+-------------+---------------------------------------------------------------+--------+--------------
my_db | public | big_table | 0.8 | 0 | big_table_search_column_index | 0.1 | 0
Voila, there is a difference here.
I ran vacuum analyze big_table; but it did not seem to make any significant difference to the running time of the count query.
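A plain VACUUM marks dead space as reusable but generally does not hand it back to the operating system, so the table can stay just as large on disk afterwards. A quick check, as a sketch, is to compare the on-disk sizes on both servers:

```sql
-- On-disk size of the table, its search index, and the table with all indexes.
-- Run on both prod and dev and compare.
SELECT pg_size_pretty(pg_relation_size('big_table')) AS table_size,
       pg_size_pretty(pg_relation_size('big_table_search_column_index')) AS index_size,
       pg_size_pretty(pg_total_relation_size('big_table')) AS total_size;
```

If prod's table turns out to be much larger than dev's for about the same number of rows, VACUUM FULL or CLUSTER would rewrite it compactly. Both take an exclusive lock on the table while they run, so that is something to weigh rather than a recommendation.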
Config
Results of the following query, as suggested by bma:
SELECT name, current_setting(name), source FROM pg_settings WHERE source NOT IN ('default', 'override');
PROD:
name | current_setting | source
----------------------------+----------------------------------+----------------------
application_name | psql | client
DateStyle | ISO, MDY | configuration file
default_text_search_config | pg_catalog.english | configuration file
effective_cache_size | 6GB | configuration file
external_pid_file | /var/run/postgresql/9.1-main.pid | configuration file
listen_addresses | * | configuration file
log_line_prefix | %t | configuration file
log_timezone | localtime | environment variable
max_connections | 100 | configuration file
max_stack_depth | 2MB | environment variable
port | 5432 | configuration file
shared_buffers | 2GB | configuration file
ssl | on | configuration file
TimeZone | localtime | environment variable
unix_socket_directory | /var/run/postgresql | configuration file
(15 rows)
dev:
name | current_setting | source
----------------------------+-------------------------+----------------------
application_name | psql | client
DateStyle | ISO, MDY | configuration file
default_text_search_config | pg_catalog.english | configuration file
effective_cache_size | 4GB | configuration file
lc_messages | en_US | configuration file
lc_monetary | en_US | configuration file
lc_numeric | en_US | configuration file
lc_time | en_US | configuration file
listen_addresses | * | configuration file
log_destination | syslog | configuration file
log_directory | ../var | configuration file
log_filename | postgresql-%Y-%m-%d.log | configuration file
log_line_prefix | %t | configuration file
log_statement | all | configuration file
log_timezone | Australia/Hobart | command line
logging_collector | on | configuration file
maintenance_work_mem | 512MB | configuration file
max_connections | 50 | configuration file
max_stack_depth | 2MB | environment variable
shared_buffers | 2GB | configuration file
ssl | off | configuration file
synchronous_commit | off | configuration file
TimeZone | Australia/Hobart | command line
timezone_abbreviations | Default | command line
work_mem | 100MB | configuration file
(25 rows)
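Did you run "ANALYZE" on the production database before testing? Also, the dev version is probably in optimal shape: its data is all contiguous on disk, whereas production (assuming normal UPDATE/DELETE usage) most likely is not. Also: check for bloat: http://wiki.postgresql.org/wiki/Show_database_bloat, and show whether any configuration settings differ between the servers: SELECT name, current_setting(name), source FROM pg_settings WHERE source NOT IN ('default', 'override'); – bma
Re ANALYZE: there is an autovacuum process running for the database - how can I check whether ANALYZE has been run? –
Re UPDATE/DELETE - see the added note about the table being a materialized view - I would think that means the results would be very similar in terms of contiguity in both environments? –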
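On the question in the comments of how to check whether ANALYZE has run: the statistics collector records the last manual and automatic vacuum and analyze times per table. A minimal sketch, assuming the table lives in a user schema and the statistics collector is enabled:

```sql
-- Timestamps are NULL if the corresponding operation has not been recorded.
SELECT relname, last_vacuum, last_autovacuum, last_analyze, last_autoanalyze
FROM pg_stat_user_tables
WHERE relname = 'big_table';
```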