Optimizing the execution time of a PSQL query

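This is the first time I have run into a problem with a long-running query. The problem is actually a big one, because the query takes over 20 seconds to execute, which is very noticeable for the end user.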

I have a fairly large database of topics (~8K). Each topic has a parameter, which is dictionary-based: for the 8K topics I have 113 different parameters.

I would like to show a report on the number of repetitions of those topics.

topic table:
     column     |  type   |              default
----------------+---------+-----------------------------------
 id             | integer | nextval('topic_id_seq'::regclass)
 topicengine_id | integer |
 description    | text    |
 topicparam_id  | integer |
 date           | date    |

topicparam table:
 column |  type   |                default
--------+---------+----------------------------------------
 id     | integer | nextval('topicparam_id_seq'::regclass)
 name   | text    |

And my query:

select distinct tp.id as tpid, tp.name as desc, (select count(*) from topic where topic.topicparam_id = tp.id) as count, t.date 
from topicparam tp, topic t where t.topicparam_id =tp.id 

Total runtime: 22372.699 ms

Fragment of the result:

 tpid | topicname | count |    date
------+-----------+-------+------------
 3823 | Topic1    |     6 | 2014-03-01
 3756 | Topic2    |    14 | 2014-03-01
 3803 | Topic3    |    28 | 2014-04-01
 3780 | Topic4    |  1373 | 2014-02-01

Is there any way to optimize the execution time of this query?

2014-04-08 Mithrand1r

Please post the output of `explain analyze` (or upload it to http://explain.depesz.com). Also, which indexes are defined on the tables? Which exact Postgres version are you using?
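
A sketch of how the information requested in this comment could be gathered in psql, using the query from the question unchanged. Note that EXPLAIN ANALYZE actually runs the statement, so it takes as long as the query itself; the BUFFERS option assumes PostgreSQL 9.0 or later.

```sql
-- Execution plan with actual timings (the query from the question, unchanged).
EXPLAIN (ANALYZE, BUFFERS)
SELECT DISTINCT tp.id AS tpid,
       tp.name AS desc,
       (SELECT count(*) FROM topic WHERE topic.topicparam_id = tp.id) AS count,
       t.date
FROM topicparam tp, topic t
WHERE t.topicparam_id = tp.id;

-- Indexes currently defined on the two tables.
SELECT tablename, indexname, indexdef
FROM pg_indexes
WHERE tablename IN ('topic', 'topicparam');

-- Exact server version.
SELECT version();
```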

Please read http://stackoverflow.com/tags/postgresql-performance/info and then edit your question accordingly.

A simple GROUP BY should do the same thing (if I understand the query correctly):

select tp.id as tpid, 
     max(tp.name) as desc, 
     count(*) as count, 
     max(t.date) as date 
from topicparam tp 
    join topic t on t.topicparam_id = tp.id 
group by tp.id;

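BTW: date is a horrible name for a column. For one, because it is also a reserved word, but more importantly because it doesn't document what the column contains. A "start date", an "end date", a "due date", a "recording date", a "publish date", ...?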

2014-04-08 06:11:22

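max() on tp.name doesn't make any sense. If there are different dates, though, then going by the original query max() or min() on the date could be interesting, to get the first or the last topic date. – Ryx5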

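@Ryx5: the original query uses `distinct`, which _seems_ to indicate that the OP only wants *some* unique combination. It does look like an attempt to get the result of a group by, but since the original question lacks a lot of necessary information, I have to guess. It could just as well be a "group by" on all columns, as you did in your answer.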

To me, DISTINCT + SUBQUERY is what is killing your performance. You should use GROUP BY both to "distinct" your data and to "count".

SELECT 
    tp.id as tpid 
    , tp.name as description 
    , count(*) as numberOfTopics 
    , t.date 
FROM 
    topicparam tp 
    INNER JOIN topic t 
     ON t.topicparam_id = tp.id 
GROUP BY 
    tp.id 
    , tp.name 
    , t.date

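Given the large amount of data, you have to pay attention to indexes:

In this case, use indexes on topicparam.id and topic.id.

Drop indexes on columns that are never used in join clauses.

Try not to use SQL reserved words such as "date, desc, count" as aliases or table fields.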
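
As a rough sketch of the indexing advice: the join in the query above is on topic.topicparam_id, so that is the column where an index would most plausibly help. The index name below is made up for illustration, and the sketch assumes no such index exists yet. topicparam.id already has an index if it is declared as the primary key, which the question does not show.

```sql
-- Hypothetical index name; assumes topic.topicparam_id is not indexed yet.
CREATE INDEX topic_topicparam_id_idx ON topic (topicparam_id);

-- Refresh planner statistics so the new index is costed with current data.
ANALYZE topic;
ANALYZE topicparam;
```

Whether the planner actually uses the index on tables of this size is best checked with EXPLAIN ANALYZE before and after creating it.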

2014-04-08 06:13:16 Ryx5

You can try this query:

SELECT tp.id AS tpid, 
     tp.name AS DESC, 
     topic.cnt AS count, 
     t.date 
FROM topicparam tp 
JOIN topic t 
    ON t.topicparam_id =tp.id 
JOIN (SELECT topicparam_id, 
      count(*) cnt 
     FROM topic 
     GROUP BY topicparam_id) topic 
    ON topic.topicparam_id = tp.id 
GROUP BY tp.id, 
     tp.name, 
     t.date, 
     topic.cnt

2014-04-08 06:15:34 Justin
