2015-11-13 35 views
1

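PostgreSQL SQL function is 10 times slower than the hardcoded query

If I hardcode the body of the following function into a query, it runs 10 times faster... Any ideas on how to make the function run fast?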

I thought one advantage of writing SQL functions was that the query planner can fully optimize them, in contrast to PL language functions.

By the way, I am using PostgreSQL 9.4.


UPDATE

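I have now realized that the difference in execution speed does not come from putting the query into a function, but from how I call the function.

select * from spatial.aggregate_raster_stats_by_geom(155); >> 1.5 seconds

select (spatial.aggregate_raster_stats_by_geom(155)).*; >> 15 seconds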


CREATE OR REPLACE FUNCTION spatial.aggregate_raster_stats_by_geom(
    IN arg_rid INTEGER 
) 
-- This function is called by the trigger that is fired whenever an entry is created in the raster catalog. 
RETURNS TABLE(band INTEGER,gid INTEGER, rid INTEGER, product_id INTEGER,ref_datetime TIMESTAMP ,scale INTEGER, count BIGINT, sum FLOAT, mean FLOAT, stddev FLOAT, min FLOAT, max FLOAT) AS 
$$ 
SELECT 
    band, 
    gid, 
    arg_rid as rid, 
    product_id, 
    ref_datetime, 
    scale, 
    (ST_SummaryStats(clip,band,TRUE)).* -- compute summary statistics (min and max, etc are also in there). TRUE indicates that nodata should be ignored. 
    FROM 
     (SELECT 
     gid, 
     ST_Union(ST_Clip(rast, geom)) as clip -- assemble the raster tiles and clip them with the assembled polygons 
     FROM 
     spatial.raster_tiles AS r 
     JOIN 
     spatial.geom_catalog AS polygons 
     ON ST_Intersects(rast,polygons.geom) -- only select raster tiles that touch the polygons. Spatial indexes should make this fast 
     JOIN 
     spatial.geom_metadata AS geometa 
     ON geometa.product_id = polygons.product_id 
     WHERE 
     geometa.aggregate_raster_auto = TRUE 
     AND r.rid=$1 

     GROUP by gid 
    ) as foo 
    cross join (
    -- Join bands to the selection 
    -- this join has to be introduced AFTER the clipping. If done before, then clipping will be performed for each band. 
     SELECT 
      generate_series(md.band_data_start,band_count) as band, 
      data_scale_factor as scale, 
      md.product_id, 
      rid, 
      ref_datetime 
     FROM spatial.raster_metadata md 
     JOIN spatial.raster_catalog rst ON rst.product_id = md.product_id 
     WHERE rst.rid = $1) AS bar2 
$$ 
    LANGUAGE sql IMMUTABLE; 
+1

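@wildplasser 9.2 and newer should automatically specialize the plan when the parameters are bound, so that problem has mostly gone away. Unless there is some reason it isn't being done here.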

+1

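@matthew Please show the `explain (buffers, analyze)` output for the standalone query. Then try running it via SQL-level `PREPARE` and `EXPLAIN ANALYZE EXECUTE`. Is it slow then? If so, post both plans. If it is still fast, enable the `auto_explain` module with nested-statement logging and analyze mode turned on, capture the plan of the query embedded in the function, and post that.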
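The diagnostic steps in this comment could look roughly like the sketch below. The prepared statement is only a short stand-in for the real function body (the table `spatial.raster_tiles` and the column `rid` come from the question, the rest is an assumption), and `LOAD 'auto_explain'` typically requires superuser rights.

```sql
-- Stand-in for the function body: replace the SELECT with the real query,
-- keeping $1 as the parameter.
PREPARE body_query(integer) AS
    SELECT count(*) FROM spatial.raster_tiles WHERE rid = $1;

-- Plan and timing of the parameterized query.
EXPLAIN (ANALYZE, BUFFERS) EXECUTE body_query(155);

DEALLOCATE body_query;

-- If the prepared statement is still fast, log the plans of statements
-- that run inside functions.
LOAD 'auto_explain';
SET auto_explain.log_min_duration = 0;         -- log every statement
SET auto_explain.log_analyze = true;           -- include actual timings
SET auto_explain.log_nested_statements = true; -- include statements inside functions

SELECT * FROM spatial.aggregate_raster_stats_by_geom(155);
```

With these settings the plans are written to the server log rather than returned to the client.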

+0

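@CraigRinger Thanks for the tips. I ran the explain, but suddenly the function was just as fast as the standalone query. Then I realized that I had been calling the function in a different way. See the updated question for more details – Matthew

Answers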

0

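Ah. You left out the key detail.

In PostgreSQL (at least in 9.5 and older), writing:

SELECT (f()).*; 

...runs f once for each result column!

It is basically a macro expansion: the `.*` is expanded into one copy of the function call per output column.

To work around this, wrap the call in another layer of subquery.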
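As a rough sketch of the workaround, using the function from the question: the first form is the slow one, the second is the plain table-function call, and the third wraps the call in a subquery so that `.*` expands a column that has already been computed. The aliases `x` and `sub` are made up for illustration, and `OFFSET 0` is added here on the assumption that it keeps the planner from flattening the subquery back into the outer query.

```sql
-- Slow on affected versions: the call is repeated for every output column.
SELECT (spatial.aggregate_raster_stats_by_geom(155)).*;

-- Simplest alternative: call the function in FROM, so it runs once.
SELECT * FROM spatial.aggregate_raster_stats_by_geom(155);

-- Subquery form: evaluate the call once as a composite column,
-- then expand that column in the outer query.
SELECT (x).*
FROM (
    SELECT spatial.aggregate_raster_stats_by_geom(155) AS x
    OFFSET 0   -- optimization fence (assumption, see above)
) AS sub;
```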

+0

The function returns a table, so it should be called using `select * from function()`

+0

Wow. That is really good to know. It also means that my `(ST_SummaryStats(clip,band,TRUE)).*` is not as efficient as I had hoped. – Matthew
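One possible way to avoid the repeated `ST_SummaryStats` calls inside the function body is to move the call into the FROM clause with `LATERAL` (available from PostgreSQL 9.3). This is only a sketch: `clipped(gid, clip)` and `bands(band)` are hypothetical stand-ins for the `foo` and `bar2` subqueries in the question.

```sql
-- clipped and bands are hypothetical stand-ins for the question's
-- "foo" and "bar2" subqueries.
SELECT
    c.gid,
    b.band,
    stats.*   -- expands the columns of a row that was computed once
FROM clipped AS c
CROSS JOIN bands AS b
CROSS JOIN LATERAL ST_SummaryStats(c.clip, b.band, TRUE) AS stats;
```

Because the function is called in FROM, `stats.*` refers to the resulting row instead of being expanded into one call per column.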

+0

Related link: [http://big-elephants.com/2013-07/table-returning-functions/](http://big-elephants.com/2013-07/table-returning-functions/) – Matthew