2012-04-27 41 views
1

Summarizing overlapping events with MySQL

So, let's say we have data that looks like this:

drop table if exists views; 
create table views(id int primary key,start time,end time); 
insert into views values 
(1, '15:01', '15:04'), 
(2, '15:02', '15:09'), 
(3, '15:12', '15:15'), 
(4, '16:11', '16:23'), 
(5, '16:19', '16:25'), 
(6, '17:52', '17:59'), 
(7, '18:18', '18:22'), 
(8, '16:20', '16:22'), 
(9, '18:17', '18:23'); 

It is easy to visualize like this:

1  |-----| 
2  |-----| 
3     |--| 
4      |-----| 
5       |-----| 
6         |---| 
7          |---| 
8       |---| 
9          |-----| 

Now I want to plot the data so that it looks like this:
+---------------------------+ 
|    x   | 
| x  x xxx  xxx | 
| x xx xx x  xx x | 
+---------------------------+ 

Basically, break the timeline into segments of length X and count how many views touch each segment. Any ideas on how to create this view?

(If you must know, this is so I can build engagement data for video analytics.)

I don't want the output as ASCII; I want it to end up as the result of a SQL query. Something like:

Time Start, Time End, Num_Views 
00:00, 00:05, 10 
00:06, 00:10, 3 
00:11, 00:15, 2 
00:16, 00:20, 8 
+1

Can you update your question to include the expected output, just so I'm clear on what you're trying to do? – weenoid 2012-04-27 08:38:47

+1

How do you want to create the graph? Or are we talking about ASCII art? – fancyPants 2012-04-27 09:40:48

+0

@tombom: I don't think that is necessary. SQL will give you the data. It is the presentation layer's job to visualize it. – 2012-04-27 09:55:45

Answers

3

Using an auxiliary numbers table, you could do something like this:

select
    r.Time_Start,
    r.Time_End,
    sum(v.id is not null) as Num_Views
from (
    select
        cast(from_unixtime((m.minstart + n.n + 0) * 300) as time) as Time_Start,
        cast(from_unixtime((m.minstart + n.n + 1) * 300) as time) as Time_End
    from (
        select
            unix_timestamp(date_format(minstart, '1970-01-01 %T')) div 300 as minstart,
            unix_timestamp(date_format(maxend , '1970-01-01 %T')) div 300 as maxend
        from (
            select
                min(start) as minstart,
                max(end ) as maxend
            from views
        ) s
    ) m
        cross join numbers n
    where n.n between 0 and m.maxend - minstart
) r
    left join views v on v.start < r.Time_End and v.end > r.Time_Start
group by
    r.Time_Start,
    r.Time_End
;

For your specific example, this script produces the following output:

Time_Start Time_End Num_Views 
---------- -------- --------- 
15:00:00 15:05:00 2 
15:05:00 15:10:00 1 
15:10:00 15:15:00 1 
15:15:00 15:20:00 0 
15:20:00 15:25:00 0 
15:25:00 15:30:00 0 
15:30:00 15:35:00 0 
15:35:00 15:40:00 0 
15:40:00 15:45:00 0 
15:45:00 15:50:00 0 
15:50:00 15:55:00 0 
15:55:00 16:00:00 0 
16:00:00 16:05:00 0 
16:05:00 16:10:00 0 
16:10:00 16:15:00 1 
16:15:00 16:20:00 2 
16:20:00 16:25:00 3 
16:25:00 16:30:00 0 
16:30:00 16:35:00 0 
16:35:00 16:40:00 0 
16:40:00 16:45:00 0 
16:45:00 16:50:00 0 
16:50:00 16:55:00 0 
16:55:00 17:00:00 0 
17:00:00 17:05:00 0 
17:05:00 17:10:00 0 
17:10:00 17:15:00 0 
17:15:00 17:20:00 0 
17:20:00 17:25:00 0 
17:25:00 17:30:00 0 
17:30:00 17:35:00 0 
17:35:00 17:40:00 0 
17:40:00 17:45:00 0 
17:45:00 17:50:00 0 
17:50:00 17:55:00 1 
17:55:00 18:00:00 1 
18:00:00 18:05:00 0 
18:05:00 18:10:00 0 
18:10:00 18:15:00 0 
18:15:00 18:20:00 2 
18:20:00 18:25:00 2 
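
As a quick sanity check on the output above, the overlap condition from the join can be run against a single bucket. This is only a sketch: the bounds 16:20:00 and 16:25:00 are picked by hand for illustration, and the query assumes the sample `views` data from the question.

```sql
-- Which views touch the 16:20:00 to 16:25:00 bucket?
-- A view overlaps the bucket when it starts before the bucket ends
-- and ends after the bucket starts (same condition as the left join).
select id, start, end
from views
where start < cast('16:25:00' as time)   -- hand-picked bucket end
  and end   > cast('16:20:00' as time)   -- hand-picked bucket start
order by id;
-- With the sample data this returns ids 4, 5 and 8,
-- which matches Num_Views = 3 for that bucket.
```

Because both comparisons are strict, a view that ends exactly on a bucket boundary is not counted in the next bucket, which is why view 5 (ending at 16:25) does not appear in the 16:25:00 to 16:30:00 row.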

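A numbers table could be a temporary table, though I would suggest you create and initialise a permanent one, as it can be useful for many purposes. Here is one way of initialising a numbers table: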

create table numbers (n int); 
insert into numbers (n) select 0; 
insert into numbers (n) select cnt + n from numbers, (select count(*) as cnt from numbers) s; 
insert into numbers (n) select cnt + n from numbers, (select count(*) as cnt from numbers) s; 
insert into numbers (n) select cnt + n from numbers, (select count(*) as cnt from numbers) s; 
insert into numbers (n) select cnt + n from numbers, (select count(*) as cnt from numbers) s; 
insert into numbers (n) select cnt + n from numbers, (select count(*) as cnt from numbers) s; 
insert into numbers (n) select cnt + n from numbers, (select count(*) as cnt from numbers) s; 
insert into numbers (n) select cnt + n from numbers, (select count(*) as cnt from numbers) s; 
insert into numbers (n) select cnt + n from numbers, (select count(*) as cnt from numbers) s; 
/* repeat as necessary; every repeated line doubles the number of rows */ 
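
On MySQL 8.0 or later, a recursive common table expression could fill the table in a single statement instead of the repeated inserts. This is an alternative sketch, not part of the original answer; the CTE name `seq` and the upper bound 255 are arbitrary choices for illustration.

```sql
-- Assumes MySQL 8.0+ and that `numbers` already exists and is empty
-- (i.e. this replaces the insert statements above, it does not follow them).
insert into numbers (n)
with recursive seq (n) as (
    select 0                                  -- anchor row
    union all
    select n + 1 from seq where n < 255       -- hypothetical upper bound
)
select n from seq;
```

This yields the same 256 rows (0 to 255) as the doubling script. For much larger ranges the session variable `cte_max_recursion_depth` (1000 by default) would need to be raised.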

A "live" version of the script can be found on SQL Fiddle.

UPDATE (an attempt at describing the method used)

The above solution implements the following steps:

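  1. Find the earliest start time and the latest end time in the views table.

  2. Convert the two values to unix timestamps.

  3. Divide both timestamps by 300, which essentially gives us the indices of the corresponding 5-minute ranges (counted from the Epoch).

  4. With the help of the numbers table, generate a series of 5-minute ranges covering the entire span between the earliest start and the latest end.

  5. Match the list of ranges against the event times in the views table (using an outer join, because we want, if we do want, to take all the ranges into account, including the empty ones).

  6. Group the results by the range bounds and count the number of events in each group. (I've just noticed that in the above query sum(v.id is not null) could be replaced by the more concise and, in this case, more natural count(v.id).)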
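
The six steps can also be sketched with `time_to_sec` and `sec_to_time` in place of the unix timestamp conversion, together with the `count(v.id)` simplification from step 6. This is a variation for illustration, not the query from the answer; it assumes the same `views` and `numbers` tables and keeps the 300-second bucket size.

```sql
select
    r.Time_Start,
    r.Time_End,
    count(v.id) as Num_Views                  -- step 6: counts only matched views
from (
    select
        -- step 4: turn each bucket index back into a TIME value
        sec_to_time((m.minstart + n.n) * 300)     as Time_Start,
        sec_to_time((m.minstart + n.n + 1) * 300) as Time_End
    from (
        select
            -- steps 1 to 3: earliest start and latest end as 5-minute indices
            time_to_sec(min(start)) div 300 as minstart,
            time_to_sec(max(end))   div 300 as maxend
        from views
    ) m
        cross join numbers n
    where n.n between 0 and m.maxend - m.minstart
) r
    -- step 5: outer join so that empty buckets are kept
    left join views v on v.start < r.Time_End and v.end > r.Time_Start
group by
    r.Time_Start,
    r.Time_End
order by
    r.Time_Start;
```

For example, 15:01:00 is 54060 seconds after midnight, 54060 div 300 is 180, and sec_to_time(180 * 300) is 15:00:00, the start of the first bucket. Working in seconds since midnight avoids any dependence on the session time zone, at the cost of only handling events within a single day.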