下面是一个利用设施将聚合函数用作窗口函数的方法。聚合函数会保留数组中最后15分钟的观察值以及当前的运行总数。状态转换功能将元素从阵列中移出,落在15分钟的窗口后面,并推动最近的观察。最终的功能只是计算阵列中的平均温度。
现在,至于这是否是一种好处......取决于。它侧重于postgresql的plgpsql执行部分,而不是数据库访问部分,而我自己的经验是plpgsql不快。如果您可以轻松查找表格以查找每个观察结果的前15分钟行,则自行加入(如在@danihp答案中)将会很好。然而,这种方法可以处理来自更复杂的来源的观察结果,其中这些查找是不实际的。像以往一样,试用并在自己的系统上进行比较。
-- based on using this table definition
create table observation(id int primary key, timestamps timestamp not null unique,
temperature numeric(5,2) not null);
-- note that I'm reusing the table structure as a type for the state here
create type rollavg_state as (memory observation[], total numeric(5,2));
create function rollavg_func(state rollavg_state, next_in observation) returns rollavg_state immutable language plpgsql as $$
declare
cutoff timestamp;
i int;
updated_memory observation[];
begin
raise debug 'rollavg_func: state=%, next_in=%', state, next_in;
cutoff := next_in.timestamps - '15 minutes'::interval;
i := array_lower(state.memory, 1);
raise debug 'cutoff is %', cutoff;
while i <= array_upper(state.memory, 1) and state.memory[i].timestamps < cutoff loop
raise debug 'shifting %', state.memory[i].timestamps;
i := i + 1;
state.total := state.total - state.memory[i].temperature;
end loop;
state.memory := array_append(state.memory[i:array_upper(state.memory, 1)], next_in);
state.total := coalesce(state.total, 0) + next_in.temperature;
return state;
end
$$;
create function rollavg_output(state rollavg_state) returns float8 immutable language plpgsql as $$
begin
raise debug 'rollavg_output: state=% len=%', state, array_length(state.memory, 1);
if array_length(state.memory, 1) > 0 then
return state.total/array_length(state.memory, 1);
else
return null;
end if;
end
$$;
create aggregate rollavg(observation) (sfunc = rollavg_func, finalfunc = rollavg_output, stype = rollavg_state);
-- referring to just a table name means a tuple value of the row as a whole, whose type is the table type
-- the aggregate relies on inputs arriving in ascending timestamp order
select rollavg(observation) over (order by timestamps) from observation;
如果新的15分钟窗口启动,滚动平均值是否会“重新启动”?还是应该平均计算“最后”15分钟? –
@a_horse_with_no_name,实际上,数据集包含4周的历史数据,我需要移动平均结果作为新的数据集。 –
这不能回答我的问题。 –