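SQL averages and zeros
I have an application that logs data from sensors, and I want to be able to take the average of the values from multiple sensors. There could be one, two, three, or a large number of them...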
Edit: these are temperature sensors, so 0 is a value that a sensor may legitimately store in the database.
My starting point was this SQL query:
SELECT grid.t5||'.000000' as ts,
avg(t.sensorvalue) sensorvalue1
, avg(w.sensorvalue)AS sensorvalue2
FROM
(SELECT generate_series(min(date_trunc('hour', ts))
,max(ts), interval '5 min') AS t5 FROM device_history_20865735 where
ts between '2015/05/13 09:00' and '2015/05/14 09:00' ) grid
LEFT JOIN device_history_20865735 t ON t.ts >= grid.t5 AND t.ts < grid.t5 + interval '5 min'
LEFT JOIN device_history_493417852 w ON w.ts >= grid.t5 AND w.ts < grid.t5 + interval '5 min'
--WHERE t.sensorvalue notnull
GROUP BY grid.t5 ORDER BY grid.t5
I take 5-minute averages because that suits my application better.
As expected, the result contains NULL values for sensorvalue1, sensorvalue2, or both:
ts;sensorvalue1;sensorvalue2
"2015-05-13 09:00:00.000000";19.9300003051758;
"2015-05-13 09:05:00.000000";20;
"2015-05-13 09:10:00.000000";;
"2015-05-13 09:15:00.000000";20.0599994659424;
"2015-05-13 09:20:00.000000";;
"2015-05-13 09:25:00.000000";20.1200008392334;
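My goal is to compute, for each 5-minute interval, the average over all sensors that have data, so the NULLs are a problem. I thought of using a CASE statement so that, when one sensor's value is NULL, I take the other sensor's value instead...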
SELECT grid.t5||'.000000' as ts,
CASE
WHEN avg(t.sensorvalue) ISNULL THEN avg(w.sensorvalue)
ELSE avg(t.sensorvalue)
END AS sensorvalue
,
CASE
WHEN avg(w.sensorvalue) ISNULL THEN avg(t.sensorvalue)
ELSE avg(w.sensorvalue)
END AS sensorvalue2
FROM
(SELECT generate_series(min(date_trunc('hour', ts)),max(ts), interval '5 min') AS t5
FROM device_history_20865735 where
ts between '2015/05/13 09:00' and '2015/05/14 09:00' ) grid
LEFT JOIN device_history_20865735 t ON t.ts >= grid.t5 AND t.ts < grid.t5 + interval '5 min'
LEFT JOIN device_history_493417852 w ON w.ts >= grid.t5 AND w.ts < grid.t5 + interval '5 min'
GROUP BY grid.t5 ORDER BY grid.t5
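But then, to compute the average, I have to wrap this in another SELECT and divide by the number of columns (i.e. sensors). With just two sensors that is OK, but what if there are many sensors, any of which may have a NULL in any row...
The SQL is generated by an application (written in Python) and runs on Postgres 9.4. Is there a simpler way to achieve what I need? I feel I am taking a rather complicated route...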
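For reference, the per-interval arithmetic being asked for can be sketched in Python, the application's language. This is only a minimal illustration under assumptions: `average_available` is a hypothetical helper, and each grid row is assumed to arrive as a list of per-sensor averages in which a sensor with no data is `None` (the way a SQL NULL typically reaches Python through a database driver). A reading of 0 is kept as a real value, since 0 is a valid temperature.

```python
from typing import Optional, Sequence


def average_available(values: Sequence[Optional[float]]) -> Optional[float]:
    """Mean of the non-None entries; None when no sensor reported."""
    # Only None is dropped; 0.0 is a legitimate temperature and is kept.
    present = [v for v in values if v is not None]
    if not present:
        return None
    return sum(present) / len(present)


# One list per 5-minute grid row: [sensor 1 average, sensor 2 average, ...]
rows = [
    [19.93, None],   # only sensor 1 reported
    [None, None],    # nobody reported
    [0.0, 20.0],     # both reported, one of them a real 0
]
print([average_available(r) for r in rows])
```

Postgres's `avg` aggregate skips NULL inputs in the same way, so the same rule would hold if the per-sensor values were averaged as rows instead of being spread across columns.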
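Edit #2: With your input I have produced the SQL below. It again seems rather complicated, but I am open to your ideas and to a review of whether it is reliable and maintainable: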
SELECT ts, sensortotal, sensorcount,
CASE
WHEN sensorcount = 0 THEN -1000
ELSE sensortotal/sensorcount
END AS sensorAvg
FROM (
WITH grid as (
SELECT t5
FROM (SELECT generate_series(min(date_trunc('hour', ts)), max(ts), interval '5 min') as t5
FROM device_history_20865735
) d
WHERE t5 between '2015-05-13 09:00' and '2015-05-14 09:00'
)
SELECT d1.t5 || '.000000' as ts
, Coalesce(avg(d1.sensorvalue), 0) + Coalesce(avg(d2.sensorvalue),0) as sensorTotal
, (CASE
WHEN avg(d1.sensorvalue) ISNULL THEN 0
ELSE 1
END + CASE
WHEN avg(d2.sensorvalue) ISNULL THEN 0
ELSE 1
END) as sensorCount
FROM (SELECT grid.t5, avg(t.sensorvalue) as sensorvalue
FROM grid LEFT JOIN
device_history_20865735 t
ON t.ts >= grid.t5 AND t.ts <grid.t5 + interval '5 min'
GROUP BY grid.t5
) d1 LEFT JOIN
(SELECT grid.t5, avg(t.sensorvalue) as sensorvalue
FROM grid LEFT JOIN
device_history_493417852 t
ON t.ts >= grid.t5 AND t.ts <grid.t5 + interval '5 min'
GROUP BY grid.t5
) d2 on d1.t5 = d2.t5
GROUP BY d1.t5
ORDER BY d1.t5
) tmp;
Thanks!
I'm not sure how you want to calculate the average, but you could do `(coalesce(sum(t.sensorvalue),0) + coalesce(sum(w.sensorvalue),0)) / (count(t.sensorvalue) + count(w.sensorvalue))`. This can easily be extended to any number of sensors. – dnoeth
Thanks @dnoeth! I need to calculate it for each row of the grid, i.e. for every 5-minute interval, not for the whole column... – Kostas
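A rough Python sketch of the sum-and-count idea from the comment, applied to a single grid row rather than to the whole column. The function name `pooled_average` and the input shape (for one 5-minute interval, one list of raw readings per sensor) are assumptions made for illustration. Pooling sums and counts weights each sensor by how many readings it has in the interval, so the result can differ from averaging the per-sensor averages.

```python
from typing import List, Optional


def pooled_average(readings_per_sensor: List[List[float]]) -> Optional[float]:
    """Sum of all readings divided by the number of readings, for one grid row."""
    total = sum(sum(r) for r in readings_per_sensor)
    count = sum(len(r) for r in readings_per_sensor)
    # Guard against division by zero when no sensor reported in this interval.
    return total / count if count else None


# Sensor 1 has two readings in this interval, sensor 2 has one.
row = [[20.0, 22.0], [18.0]]
print(pooled_average(row))        # pooled over all three readings
print(pooled_average([[], []]))   # no readings at all in the interval
```

In the first row the pooled value is (20 + 22 + 18) / 3, whereas the average of the two per-sensor averages would be (21 + 18) / 2, so it is worth deciding which of the two definitions the application actually needs.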