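SQL averages and NULLs

2015-05-14

I have an application that logs sensor data, and I want to be able to take the average across multiple sensors: it could be one, two, three, or many...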

Edit: these are temperature sensors, so 0 is a value a sensor may legitimately store in the database.

My starting point was this SQL query:

SELECT grid.t5||'.000000' as ts, 
avg(t.sensorvalue) sensorvalue1 
, avg(w.sensorvalue)AS sensorvalue2 
FROM 
(SELECT generate_series(min(date_trunc('hour', ts))       
,max(ts), interval '5 min') AS t5 FROM device_history_20865735 where  
ts between '2015/05/13 09:00' and '2015/05/14 09:00' ) grid 

LEFT JOIN device_history_20865735 t ON t.ts >= grid.t5 AND t.ts < grid.t5 + interval '5 min' 
LEFT JOIN device_history_493417852 w ON w.ts >= grid.t5 AND w.ts < grid.t5 + interval '5 min' 
--WHERE t.sensorvalue notnull 
GROUP BY grid.t5 ORDER BY grid.t5 

I take 5-minute averages because that works better for my application.

As expected, the results contain NULL values for either sensorvalue1 or sensorvalue2:

ts;sensorvalue1;sensorvalue2 
"2015-05-13 09:00:00.000000";19.9300003051758; 
"2015-05-13 09:05:00.000000";20; 
"2015-05-13 09:10:00.000000";; 
"2015-05-13 09:15:00.000000";20.0599994659424; 
"2015-05-13 09:20:00.000000";; 
"2015-05-13 09:25:00.000000";20.1200008392334; 

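My goal is to compute, for each 5-minute interval, the average over all available sensors, so the NULLs are the problem. I thought of using a CASE statement so that, if one value is NULL, I take the other sensor's value instead...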

SELECT grid.t5||'.000000' as ts, 
CASE 
     WHEN avg(t.sensorvalue) ISNULL THEN avg(w.sensorvalue) 
     ELSE avg(t.sensorvalue) 
END AS sensorvalue 
, 
CASE 
     WHEN avg(w.sensorvalue) ISNULL THEN avg(t.sensorvalue) 
     ELSE avg(w.sensorvalue) 
END AS sensorvalue2 
FROM 
(SELECT generate_series(min(date_trunc('hour', ts)),max(ts), interval '5 min') AS t5 
FROM device_history_20865735 where  
ts between '2015/05/13 09:00' and '2015/05/14 09:00' ) grid 

LEFT JOIN device_history_20865735 t ON t.ts >= grid.t5 AND t.ts < grid.t5 + interval '5 min' 
LEFT JOIN device_history_493417852 w ON w.ts >= grid.t5 AND w.ts < grid.t5 + interval '5 min' 
GROUP BY grid.t5 ORDER BY grid.t5 
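
Each of the two CASE expressions above falls back to the other sensor when its own average is NULL, which is the behaviour COALESCE provides. Below is a minimal sketch of that equivalence, using Python's built-in sqlite3 module as a stand-in for Postgres. The table `bucket_avg` and its columns `s1` and `s2` are made up for illustration and hold already-averaged values per bucket.

```python
import sqlite3


def compare_case_and_coalesce(rows):
    """Return (ts, via_case, via_coalesce) for each bucket row."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical table: one row per 5-minute bucket, one column per sensor.
    conn.execute("CREATE TABLE bucket_avg (ts TEXT, s1 REAL, s2 REAL)")
    conn.executemany("INSERT INTO bucket_avg VALUES (?, ?, ?)", rows)
    result = conn.execute(
        """
        SELECT ts,
               CASE WHEN s1 IS NULL THEN s2 ELSE s1 END AS via_case,
               COALESCE(s1, s2)                         AS via_coalesce
        FROM bucket_avg
        ORDER BY ts
        """
    ).fetchall()
    conn.close()
    return result


sample = [("09:00", 19.93, None), ("09:05", None, 21.5), ("09:10", None, None)]
for ts, via_case, via_coalesce in compare_case_and_coalesce(sample):
    print(ts, via_case, via_coalesce)
```

In the Postgres query the same idea would read COALESCE(avg(t.sensorvalue), avg(w.sensorvalue)). Note that this only picks one sensor's value; it does not average the two when both are present.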

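But then, to compute the average, I have to run another SELECT on top of this and divide by the number of columns (i.e. sensors). With just two sensors that would be OK, but it breaks down with many sensors, because any of them may be NULL in a given row...

The SQL is generated by the application (written in Python) and runs on Postgres 9.4. Is there a simpler way to achieve what I need? I feel I am going down a rather complicated route...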

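Edit #2: With your input I have produced this SQL, which again seems rather complicated, but I am open to your ideas and review on whether it is reliable and maintainable: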

SELECT ts, sensortotal, sensorcount, 
CASE 
    WHEN sensorcount = 0 THEN -1000 
    ELSE sensortotal/sensorcount 
END AS sensorAvg 

FROM (
    WITH grid as (
      SELECT t5 
      FROM (SELECT generate_series(min(date_trunc('hour', ts)), max(ts), interval '5 min') as t5 
       FROM device_history_20865735 
       ) d 
      WHERE t5 between '2015-05-13 09:00' and '2015-05-14 09:00' 
     ) 
    SELECT d1.t5 || '.000000' as ts 
      , Coalesce(avg(d1.sensorvalue), 0) + Coalesce(avg(d2.sensorvalue),0) as sensorTotal 
      , (CASE 
        WHEN avg(d1.sensorvalue) ISNULL THEN 0 
        ELSE 1 
      END + CASE 
      WHEN avg(d2.sensorvalue) ISNULL THEN 0 
      ELSE 1 
      END) as sensorCount 

    FROM (SELECT grid.t5, avg(t.sensorvalue) as sensorvalue 
      FROM grid LEFT JOIN 
       device_history_20865735 t 
       ON t.ts >= grid.t5 AND t.ts <grid.t5 + interval '5 min' 
      GROUP BY grid.t5 
     ) d1 LEFT JOIN 
     (SELECT grid.t5, avg(t.sensorvalue) as sensorvalue 
      FROM grid LEFT JOIN 
       device_history_493417852 t 
       ON t.ts >= grid.t5 AND t.ts <grid.t5 + interval '5 min' 
     GROUP BY grid.t5 
     ) d2 on d1.t5 = d2.t5 
    GROUP BY d1.t5 
    ORDER BY d1.t5 
) tmp; 

Thanks!

I'm not sure how you want to calculate the average, but you could do (coalesce(sum(t.sensorvalue),0) + coalesce(sum(w.sensorvalue),0)) / (count(t.sensorvalue) + count(w.sensorvalue)). This can easily be extended to any number of sensors. – dnoeth

Thanks @dnoeth! I need to calculate it for each row of the grid, i.e. every 5 minutes, not over the whole column... – Kostas

Answers

0

To get accurate averages, you need to aggregate each sensor separately before joining:

WITH grid as (
     SELECT t5 
     FROM (SELECT generate_series(min(date_trunc('hour', ts)), max(ts), interval '5 min') as t5 
      FROM device_history_20865735 
      ) d 
     WHERE t5 between '2015-05-13 09:00' and '2015-05-14 09:00' 
    ) 
SELECT d1.t5 || '.000000' as ts, 
     avg(d1.sensorvalue) as sensorvalue1 
     , avg(d2.sensorvalue) as sensorvalue2 
FROM (SELECT grid.t5, avg(t.sensorvalue) as sensorvalue 
     FROM grid LEFT JOIN 
      device_history_20865735 t 
      ON t.ts >= grid.t5 AND t.ts <grid.t5 + interval '5 min' 
     GROUP BY grid.t5 
    ) d1 LEFT JOIN 
    (SELECT grid.t5, avg(t.sensorvalue) as sensorvalue 
     FROM grid LEFT JOIN 
      device_history_493417852 t 
      ON t.ts >= grid.t5 AND t.ts <grid.t5 + interval '5 min' 
    GROUP BY grid.t5 
    ) d2 on d1.t5 = d2.t5 
GROUP BY d1.t5 
ORDER BY d1.t5; 
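
Since the question says the SQL is generated from a Python application, the aggregate-per-sensor-first idea can be sketched for any number of sensor tables. This is not the query from the answer: it swaps the chain of joins for a UNION ALL over the tables, averages per sensor and bucket, then averages those per-sensor values (avg() skips NULLs, so a sensor with no readings in a bucket does not count). The function name `build_average_query`, the output columns `sensoravg` and `sensorcount`, and the psycopg2-style `%(name)s` placeholders are assumptions for illustration. The grid is also simplified to run directly from `start` to `end`.

```python
def build_average_query(tables, start, end, step="5 min"):
    """Return (sql, params) averaging all sensor tables per time bucket."""
    if not tables:
        raise ValueError("need at least one sensor table")
    # Table names cannot be bound as parameters, so only accept plain
    # identifiers such as device_history_20865735.
    for name in tables:
        if not name.replace("_", "").isalnum():
            raise ValueError(f"unexpected table name: {name!r}")

    # One SELECT per sensor table, tagged with a sensor index.
    readings = "\n    UNION ALL\n".join(
        f"    SELECT {i} AS sensor, ts, sensorvalue FROM {name}"
        for i, name in enumerate(tables)
    )
    sql = f"""
WITH grid AS (
    SELECT generate_series(%(start)s::timestamp, %(end)s::timestamp,
                           %(step)s::interval) AS t5
), readings AS (
{readings}
), per_sensor AS (
    -- First level: one average per sensor and bucket.
    SELECT grid.t5, r.sensor, avg(r.sensorvalue) AS sensorvalue
    FROM grid
    LEFT JOIN readings r
           ON r.ts >= grid.t5 AND r.ts < grid.t5 + %(step)s::interval
    GROUP BY grid.t5, r.sensor
)
-- Second level: average the per-sensor averages; NULLs are ignored.
SELECT t5 AS ts,
       avg(sensorvalue)   AS sensoravg,
       count(sensorvalue) AS sensorcount
FROM per_sensor
GROUP BY t5
ORDER BY t5
"""
    return sql, {"start": start, "end": end, "step": step}


sql, params = build_average_query(
    ["device_history_20865735", "device_history_493417852"],
    "2015-05-13 09:00",
    "2015-05-14 09:00",
)
print(sql)
```

With a driver that uses this placeholder style, the pair would be passed as cursor.execute(sql, params). A bucket with no readings from any sensor should come back with a NULL sensoravg and a sensorcount of 0, rather than a sentinel value.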

Thanks @Gordon! Though I get a syntax error... ERROR: syntax error at or near "GROUP" LINE 21: GROUP BY d1.t5 ^ – Kostas

There is no relationship between d1 and d2 in the LEFT JOIN; the ON condition is missing. –

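I managed to get it running, and the results are exactly the same as with the SQL above... As for the averaging problem, though, is there a more "elegant" solution? :) – Kostas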

0

It sounds like you want something like this:

(coalesce(value1,0) + coalesce(value2,0) + coalesce(value3,0)) /
((value1 IS NOT NULL)::int + (value2 IS NOT NULL)::int + (value3 IS NOT NULL)::int)
AS average

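Basically, just do the math you want for each row. The only "tricky" part is how to "count" the non-NULL values. I used a cast, but there are other options, such as: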

CASE WHEN value1 IS NULL THEN 0 ELSE 1 END
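
To make the per-row arithmetic concrete, here is a minimal sketch that runs the CASE variant against a few sample rows. It uses Python's built-in sqlite3 module as a stand-in for Postgres; the table `readings` and the helper `row_averages` are hypothetical. The NULLIF around the divisor is an addition, not part of the answer: it turns a row where every sensor is NULL into a NULL average instead of a division by zero.

```python
import sqlite3


def row_averages(rows):
    """Average the non-NULL values of each (value1, value2, value3) row."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE readings (value1 REAL, value2 REAL, value3 REAL)")
    conn.executemany("INSERT INTO readings VALUES (?, ?, ?)", rows)
    result = conn.execute(
        """
        SELECT (COALESCE(value1, 0) + COALESCE(value2, 0) + COALESCE(value3, 0))
               / NULLIF(
                     CASE WHEN value1 IS NULL THEN 0 ELSE 1 END
                   + CASE WHEN value2 IS NULL THEN 0 ELSE 1 END
                   + CASE WHEN value3 IS NULL THEN 0 ELSE 1 END,
                   0) AS average
        FROM readings
        ORDER BY rowid
        """
    ).fetchall()
    conn.close()
    return [row[0] for row in result]


samples = [
    (20.0, None, 22.0),   # two sensors reporting
    (0.0, None, None),    # a real reading of 0 degrees still counts
    (None, None, None),   # no sensor reported in this bucket
    (10.0, 20.0, 30.0),   # all three reporting
]
print(row_averages(samples))
```

A stored temperature of 0 is counted as a reading, because only NULL is excluded from the divisor, which matters here since 0 is a legitimate sensor value.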