Add a column filled with the days between a start date and an end date
I have an input table with this structure:
acct_id pvt_data_id pvt_pref_ind start_dttm end_dttm load_dttm pr_load_time
4174878 26 Y 20101126144142 99991231235959 20170527000000 2017052700
4174878 26 Y 20101126144142 99991231235959 20170528000000 2017052800
4174878 26 Y 20101126144142 99991231235959 20170530000000 2017053000
3212472 26 X 20131016144142 99991231235959 20170531000000 2017053100
4174878 26 Y 20101126144142 99991231235959 20170601000000 2017060100
3212472 26 X 20091201142148 99991231235959 20170602000000 2017060200
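I need to take this table and add a new column, pr_day, that holds a day as an integer (for example 20170814) from the range between start_dttm and end_dttm, so that there is one row for every day in that range.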
I started with the following query to get the range for each group (defined by the first 3 columns):
select
  acct_id,
  pvt_data_id,
  pvt_pref_ind,
  -- earliest start day in the group, as yyyyMMdd
  cast(min(substr(cast(start_dttm as string), 1, 8)) as bigint) as start_day,
  -- latest end day; the sentinel 99991231235959 or null means today
  max(case
        when end_dttm = 99991231235959 or end_dttm is null
          then cast(from_unixtime(unix_timestamp(now()), 'yyyyMMdd') as bigint)
        else cast(substr(cast(end_dttm as string), 1, 8) as bigint)
      end) as end_day
from table1
group by acct_id, pvt_data_id, pvt_pref_ind
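Note: an end_dttm of 99991231235959 or null means the current day should be used as the end date.
Now I am not sure how to continue and am looking for guidance. I think I need a cross join to fill in the dates, but which table should I join against?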
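One direction that might work, sketched here under assumptions rather than as a tested solution: cross join the per-group ranges with a calendar table that has one row per day. The table calendar and its column day_id (a bigint in yyyyMMdd form) are hypothetical and would have to be created and populated first; table1 and the other column names come from the sample above.

```sql
-- Assumption: calendar(day_id bigint) is a hypothetical helper table holding
-- one row per calendar day in yyyyMMdd form, e.g. 20170814.
select
  r.acct_id,
  r.pvt_data_id,
  r.pvt_pref_ind,
  c.day_id as pr_day
from (
  -- per-group range, truncated to whole days
  select
    acct_id,
    pvt_data_id,
    pvt_pref_ind,
    cast(min(substr(cast(start_dttm as string), 1, 8)) as bigint) as start_day,
    max(case
          when end_dttm is null or end_dttm = 99991231235959
            then cast(from_unixtime(unix_timestamp(now()), 'yyyyMMdd') as bigint)
          else cast(substr(cast(end_dttm as string), 1, 8) as bigint)
        end) as end_day
  from table1
  group by acct_id, pvt_data_id, pvt_pref_ind
) r
cross join calendar c
-- keep only the days that fall inside each group's range
where c.day_id between r.start_day and r.end_day
```

This would give one row per group and day. It does not carry over start_dttm, end_dttm, load_dttm or pr_load_time, so those would still need to be joined back in from table1.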
The desired output would look like this:
acct_id pvt_data_id pvt_pref_ind start_dttm end_dttm load_dttm pr_load_time pr_day
4174878 26 Y 20101126144142 99991231235959 20170527000000 2017052700 20101126
4174878 26 Y 20101126144142 99991231235959 20170528000000 2017052800 20101127
4174878 26 Y 20101126144142 99991231235959 20170529000000 2017052900 20101128
4174878 26 Y 20101126144142 99991231235959 20170530000000 2017053000 20101129
3212472 26 X 20131016144142 99991231235959 20170531000000 2017053100 20091202
4174878 26 Y 20101126144142 99991231235959 20170601000000 2017060100 20101130
3212472 26 X 20091201142148 99991231235959 20170602000000 2017060200 20091201
Thanks for any hints and help.
Thanks, I'll test it and let you know!