2015-11-01 18 views
2

Window functions and time differences in BigQuery

I have a BigQuery table defined as:

 
+----+----------------------------+------------+
| id | time                       | event      |
+----+----------------------------+------------+
| 1  | 2015-10-01 16:31:48.000000 | signup     |
| 1  | 2015-10-01 16:41:48.000000 | 1_purchase |
| 1  | 2015-10-01 16:51:48.000000 | 2_purchase |
| 2  | 2015-10-01 16:31:48.000000 | signup     |
| 2  | 2015-10-01 16:41:48.000000 | 1_purchase |
| 3  | 2015-10-01 16:31:48.000000 | signup     |
+----+----------------------------+------------+

I want to compute the time difference within each id group (1, 2, 3), so that the result looks like this:

 
+----+----------------------------+------------+-----------------+
| id | time                       | event      | timedifference  |
+----+----------------------------+------------+-----------------+
| 1  | 2015-10-01 16:31:48.000000 | signup     | -               |
| 1  | 2015-10-01 16:41:48.000000 | 1_purchase | 00:10:00.000000 |
| 1  | 2015-10-01 16:51:48.000000 | 2_purchase | 00:20:00.000000 |
| 2  | 2015-10-01 16:31:48.000000 | signup     | -               |
| 2  | 2015-10-01 16:41:48.000000 | 1_purchase | 00:10:00.000000 |
| 3  | 2015-10-01 16:31:48.000000 | signup     | no_purchase     |
+----+----------------------------+------------+-----------------+

After some research, I think I need to use window functions, but I can't work out a solution. Any help is highly appreciated! Best, V

Answers

0
-- Legacy BigQuery SQL: lag() fetches the previous event time within each id.
select
    id, time, event,
    -- difference shown as a time of day only, so whole days are not visible
    time(sec_to_timestamp((timestamp_to_sec(timestamp(time)) -
        timestamp_to_sec(timestamp(prev_time))))) as timedifference,
    -- difference in minutes
    (timestamp_to_sec(timestamp(time)) -
        timestamp_to_sec(timestamp(prev_time)))/60 as timefifference_in_min,
    -- difference with a two-digit day prefix: "dd hh:mm:ss"
    right('0' + string(datediff(timestamp(time),timestamp(prev_time))),2) + ' ' +
        time(sec_to_timestamp((timestamp_to_sec(timestamp(time)) -
        timestamp_to_sec(timestamp(prev_time))))) as timedifference_as_dd_hh_mm_ss
from (
    select
        id, time, event,
        lag(time) over(partition by id order by time) as prev_time
    from (
        -- inline sample data from the question
        select f0_ as id, f1_ as time, f2_ as event from
            (select 1, '2015-10-01 16:31:48.000000', 'signup'),
            (select 1, '2015-10-01 16:41:48.000000', '1_purchase'),
            (select 1, '2015-10-01 16:51:48.000000', '2_purchase'),
            (select 2, '2015-10-01 16:31:48.000000', 'signup'),
            (select 2, '2015-10-01 16:41:48.000000', '1_purchase'),
            (select 3, '2015-10-01 16:31:48.000000', 'signup')
    )
)
order by id, time
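
To check the numbers outside BigQuery, here is a minimal Python sketch of what `lag(time) over(partition by id order by time)` computes: the gap between each row and the previous row of the same id. The names `ROWS`, `FMT` and `diff_from_previous` are made up for this illustration and are not part of the answer.

```python
from datetime import datetime
from itertools import groupby

# Sample rows from the question: (id, time, event).
ROWS = [
    (1, "2015-10-01 16:31:48.000000", "signup"),
    (1, "2015-10-01 16:41:48.000000", "1_purchase"),
    (1, "2015-10-01 16:51:48.000000", "2_purchase"),
    (2, "2015-10-01 16:31:48.000000", "signup"),
    (2, "2015-10-01 16:41:48.000000", "1_purchase"),
    (3, "2015-10-01 16:31:48.000000", "signup"),
]

FMT = "%Y-%m-%d %H:%M:%S.%f"


def diff_from_previous(rows):
    """Return (id, time, event, delta) tuples, where delta is the gap to the
    previous row of the same id, or None for the first row of each id
    (the case where LAG would return NULL)."""
    # Sort by id, then time: the equivalent of PARTITION BY id ORDER BY time.
    parsed = sorted((i, datetime.strptime(t, FMT), e) for i, t, e in rows)
    out = []
    for _, group in groupby(parsed, key=lambda r: r[0]):
        prev = None
        for i, t, e in group:
            out.append((i, t, e, None if prev is None else t - prev))
            prev = t
    return out


for row in diff_from_previous(ROWS):
    print(row[0], row[2], row[3])
```

Note that for id 1 this gives a 10-minute gap for `2_purchase` (measured from `1_purchase`), whereas the expected table in the question shows 00:20:00, which is the gap measured from `signup`.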

Thanks, I had never used the LAG function :) – xxxvinxxx


But the problem is that the difference I get is only in hours; it doesn't take into account differences of more than 1 day... – xxxvinxxx


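In your example, the format of timedifference made me think you did not expect differences greater than 24 hours. It looks like that is not the case, so I added "timefifference_in_min" to my answer. If you expect or need a specific format, please provide an example of it –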
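
To make the formatting question concrete, here is a minimal Python sketch of a "dd hh:mm:ss" rendering that keeps whole days instead of wrapping at 24 hours. The function name `format_dd_hh_mm_ss` is hypothetical, and the sketch assumes a non-negative difference; it illustrates the format only and is not BigQuery code.

```python
from datetime import timedelta


def format_dd_hh_mm_ss(delta):
    """Format a non-negative timedelta as 'DD HH:MM:SS', keeping whole days."""
    total = int(delta.total_seconds())      # drop fractional seconds
    days, rem = divmod(total, 86400)        # 86400 seconds per day
    hours, rem = divmod(rem, 3600)
    minutes, seconds = divmod(rem, 60)
    return f"{days:02d} {hours:02d}:{minutes:02d}:{seconds:02d}"


print(format_dd_hh_mm_ss(timedelta(minutes=10)))
print(format_dd_hh_mm_ss(timedelta(days=1, hours=2, minutes=3, seconds=4)))
```

A plain minutes (or seconds) column, as in the answer, avoids the formatting question entirely and is easier to aggregate later.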

1

Yes, you can use analytic window functions for this. Here is one way to do it, using the FIRST_VALUE analytic function:

-- Legacy BigQuery SQL: subtracting timestamps yields microseconds,
-- so dividing by 60000000 gives minutes since the first event of each id.
SELECT id, time, event, (time - firsttime)/60000000 FROM (
    SELECT id, time, event,
        FIRST_VALUE(time) OVER(PARTITION BY id ORDER BY time) AS firsttime FROM
        (SELECT 1 id, TIMESTAMP('2015-10-01 16:31:48.000000') time, 'signup' event),
        (SELECT 1 id, TIMESTAMP('2015-10-01 16:41:48.000000') time, '1_purchase' event),
        (SELECT 1 id, TIMESTAMP('2015-10-01 16:51:48.000000') time, '2_purchase' event),
        (SELECT 2 id, TIMESTAMP('2015-10-01 16:31:48.000000') time, 'signup' event),
        (SELECT 2 id, TIMESTAMP('2015-10-01 16:41:48.000000') time, '1_purchase' event),
        (SELECT 3 id, TIMESTAMP('2015-10-01 16:31:48.000000') time, 'signup' event)
)
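
As a cross-check, here is a minimal Python sketch of the same idea: for every row, measure the time elapsed since the earliest event of that id, which is what `FIRST_VALUE(time) OVER(PARTITION BY id ORDER BY time)` provides. The names `ROWS`, `FMT` and `minutes_since_first` are invented for this illustration.

```python
from datetime import datetime

# Sample rows from the question: (id, time, event).
ROWS = [
    (1, "2015-10-01 16:31:48.000000", "signup"),
    (1, "2015-10-01 16:41:48.000000", "1_purchase"),
    (1, "2015-10-01 16:51:48.000000", "2_purchase"),
    (2, "2015-10-01 16:31:48.000000", "signup"),
    (2, "2015-10-01 16:41:48.000000", "1_purchase"),
    (3, "2015-10-01 16:31:48.000000", "signup"),
]

FMT = "%Y-%m-%d %H:%M:%S.%f"


def minutes_since_first(rows):
    """Return (id, event, minutes) tuples, where minutes is the time elapsed
    since the earliest event of the same id."""
    # Sort by id, then time, so the first row seen for an id is its earliest.
    parsed = sorted((i, datetime.strptime(t, FMT), e) for i, t, e in rows)
    first = {}
    out = []
    for i, t, e in parsed:
        first.setdefault(i, t)  # remember the earliest time per id
        out.append((i, e, (t - first[i]).total_seconds() / 60))
    return out


for row in minutes_since_first(ROWS):
    print(row)
```

On the sample data this yields 0, 10 and 20 minutes for id 1, which agrees with the 00:20:00 shown for `2_purchase` in the question's expected table.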

Thanks! It works :) – xxxvinxxx