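The following is a partially tested approach.
It uses date parameters so that the where clauses stay consistent. Other parameters control the size of the hourly bucket (I used 3 in my limited testing) and the number of weeks (I used 0 in my testing because I have a very small set of rows).
The first subquery generates the "ranges" which, when joined to the source rows, place those rows into each "rolling n-hour range". The ranges are defined by using date_format to output YYYYMMDDHH (these are strings), and the data is then forced into the same string format for the join, so this may cause performance problems on a large table (yes, it is not sargable, and I don't like it either).
This solution can be seen working here at SQL Fiddle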
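To make the range strings described above concrete, here is a small illustration of what the two date_format expressions produce for a single row. The sample timestamp, the @sample variable, and the 3-hour window are assumed values chosen only for this sketch.

```sql
-- Illustration only: @sample and the 3-hour window are assumed values.
SET @num_hrs := 3;
SET @sample := '2017-08-02 05:15:00';

SELECT
      DATE_FORMAT(DATE_ADD(@sample, INTERVAL (@num_hrs * -1) HOUR), '%Y%m%d%H') AS range_start
    , DATE_FORMAT(@sample, '%Y%m%d%H') AS range_end
;
-- range_start = '2017080202', range_end = '2017080205'
```

Any row whose own YYYYMMDDHH string falls between those two values (inclusive) joins to this range. Comparing the strings works because the fixed-width format sorts in chronological order.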
Schema setup:
CREATE TABLE `myTable` (
    `id` mediumint(8) unsigned NOT NULL auto_increment,
    `start_time` datetime,
    PRIMARY KEY (`id`)
) AUTO_INCREMENT=1;
INSERT INTO myTable
    (`start_time`)
VALUES
    ('2017-08-01 00:01:00'),
    ('2017-08-01 00:15:00'),
    ('2017-08-01 00:29:00'),
    ## more here, 3 rows per hour over a narrow date range
    ('2017-08-03 08:01:00'),
    ('2017-08-03 08:15:00'),
    ('2017-08-03 08:29:00')
;
Query:
set @start_time := '2017-08-02';
set @num_hrs := 4; -- controls length of rolling period e.g. 4 hours each
set @num_weeks := 4; -- controls the date range (number of weeks after @start_time)
set @end_time := date_add(@start_time, INTERVAL ((7 * @num_weeks)+1) DAY);
SELECT
      DOW
    , hour_of_day
    , COUNT(*) period_count
    , (COUNT(*) * 1.0)/@num_hrs rolling_av
FROM (
    ## build a set of ranges in YYYYMMDDHH format differing by the wanted number of hours
    SELECT
          id
        , DATE_FORMAT(date_add(start_time, INTERVAL (@num_hrs*-1) HOUR), '%Y%m%d%H') as range_start
        , DATE_FORMAT(start_time, '%Y%m%d%H') as range_end
    FROM myTable
    WHERE start_time >= @start_time and start_time < @end_time
    ) R
INNER JOIN (
    ## the source rows, tagged with day of week and hour of day
    SELECT
          start_time
        , DAYOFWEEK(start_time) as DOW
        , date_format(start_time, '%H') as hour_of_day
    FROM myTable
    WHERE start_time >= @start_time and start_time < @end_time
    ) T ON DATE_FORMAT(T.start_time, '%Y%m%d%H') >= R.range_start
       AND DATE_FORMAT(T.start_time, '%Y%m%d%H') <= R.range_end
GROUP BY
    DOW, hour_of_day
ORDER BY
    DOW, hour_of_day
;
Results (these figures correspond to @num_hrs = 3 and @num_weeks = 0, the values used in testing):
| DOW | hour_of_day | period_count | rolling_av |
|-----|-------------|--------------|------------|
| 4 | 00 | 36 | 12 |
| 4 | 01 | 36 | 12 |
| 4 | 02 | 36 | 12 |
| 4 | 03 | 36 | 12 |
| 4 | 04 | 36 | 12 |
| 4 | 05 | 36 | 12 |
| 4 | 06 | 36 | 12 |
| 4 | 07 | 36 | 12 |
| 4 | 08 | 36 | 12 |
| 4 | 09 | 36 | 12 |
| 4 | 10 | 36 | 12 |
| 4 | 11 | 36 | 12 |
| 4 | 12 | 36 | 12 |
| 4 | 13 | 36 | 12 |
| 4 | 14 | 36 | 12 |
| 4 | 15 | 36 | 12 |
| 4 | 16 | 36 | 12 |
| 4 | 17 | 36 | 12 |
| 4 | 18 | 36 | 12 |
| 4 | 19 | 36 | 12 |
| 4 | 20 | 36 | 12 |
| 4 | 21 | 27 | 9 |
| 4 | 22 | 18 | 6 |
| 4 | 23 | 9 | 3 |
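Because the join above compares formatted strings, it is not sargable. One possible alternative, offered as an untested sketch rather than a drop-in replacement, is to truncate each range row to the top of its hour and compare datetime values directly, leaving T.start_time bare in the join condition. It assumes an index on start_time (which the schema above does not define) would then be usable; the alias range_hour is introduced here for illustration only.

```sql
-- Untested sketch: same parameters as in testing, but the join compares
-- datetimes instead of YYYYMMDDHH strings. range_hour is a hypothetical alias.
SET @start_time := '2017-08-02';
SET @num_hrs := 3;
SET @num_weeks := 0;
SET @end_time := DATE_ADD(@start_time, INTERVAL ((7 * @num_weeks) + 1) DAY);

SELECT
      DAYOFWEEK(T.start_time) AS DOW
    , DATE_FORMAT(T.start_time, '%H') AS hour_of_day
    , COUNT(*) AS period_count
    , (COUNT(*) * 1.0) / @num_hrs AS rolling_av
FROM (
    -- truncate each row to the top of its hour, once per range row
    SELECT
          id
        , CAST(DATE_FORMAT(start_time, '%Y-%m-%d %H:00:00') AS DATETIME) AS range_hour
    FROM myTable
    WHERE start_time >= @start_time AND start_time < @end_time
    ) R
INNER JOIN myTable T
        ON T.start_time >= DATE_SUB(R.range_hour, INTERVAL @num_hrs HOUR)
       AND T.start_time <  DATE_ADD(R.range_hour, INTERVAL 1 HOUR)
WHERE T.start_time >= @start_time AND T.start_time < @end_time
GROUP BY
    DOW, hour_of_day
ORDER BY
    DOW, hour_of_day
;
```

The two join bounds are meant to mirror the string comparison: "hour string >= range_start" becomes "at or after the truncated hour minus n hours", and "hour string <= range_end" becomes "before the next full hour".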
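Note that BETWEEN is a poor fit for date/time ranges. Your where clause is equivalent to: start_time >= '2017-08-01 00:00:00' and start_time <= '2017-08-29 00:00:00'. I'm not sure whether you want the whole of the 29th; if you do, use start_time >= '2017-08-01 00:00:00' and start_time < '2017-08-30 00:00:00' instead (this gives you all of August 29 and nothing from August 30).
Sorry, the extra '00:00:00's are not needed. They are there just for explanation. Take a look at this crazy question on the same topic: https://stackoverflow.com/questions/16121023/calculating-a-moving-average-mysql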
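The difference between the two forms in the comment above can be sketched against the myTable schema from the answer. The August dates are the ones quoted in the comment; the column aliases are made up for this illustration.

```sql
-- BETWEEN is inclusive at both ends, so this keeps a row stamped exactly
-- 2017-08-29 00:00:00 but drops everything later on the 29th.
SELECT COUNT(*) AS between_count
FROM myTable
WHERE start_time BETWEEN '2017-08-01' AND '2017-08-29';

-- Half-open range: every row on August 29 is included, nothing from August 30.
SELECT COUNT(*) AS half_open_count
FROM myTable
WHERE start_time >= '2017-08-01' AND start_time < '2017-08-30';
```

The half-open form is the same pattern the query in the answer uses with @start_time and @end_time.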