2015-09-25 49 views
4

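I want to extract the first non-null value from a column of values, ordered by timestamp. Can someone share their thoughts? Thanks. How can I get the first non-null value from a column of values in Big Query?

What have I tried so far?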

FIRST_VALUE(column) OVER (PARTITION BY id ORDER BY timestamp) 

Input :-

id,column,timestamp 
1,NULL,10:30 am 
1,NULL,10:31 am 
1,'xyz',10:32 am 
1,'def',10:33 am 
2,NULL,11:30 am 
2,'abc',11:31 am 

Output(expected) :- 
1,'xyz',10:30 am 
1,'xyz',10:31 am 
1,'xyz',10:32 am 
1,'xyz',10:33 am 
2,'abc',11:30 am 
2,'abc',11:31 am 
+0

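Your initial statement and your sample output seem inconsistent. It looks like you want to fill the NULL values with the first non-'NULL' value. –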

+0

No.. I need the first non-null value as the output for every value in the column. – Teja

Answers

2

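As far as I know, Big Query has no option like 'IGNORE NULLS' or 'NULLS LAST'. Given that, this is the simplest solution I could come up with: pick the first non-null row per id, then join it back to every row. I would like to see a simpler solution. Assuming the input data is in the table "original_data",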

select w2.id, w1.column, w2.timestamp 
from 
(select id,column,timestamp 
    from 
    (select id,column,timestamp, row_number() 
        over (partition BY id ORDER BY timestamp) position 
     FROM original_data 
     where column is not null 
    ) 
    where position=1 
) w1 
right outer join 
original_data as w2 
on w1.id = w2.id 
+0

Quick update: using "IGNORE NULLS" is now supported: https://cloud.google.com/bigquery/docs/release-notes#november_2_2017 – Sourygna
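With `IGNORE NULLS` available, the whole thing can probably be written as a single window function. The following is a minimal sketch in BigQuery standard SQL, not a tested production query. The inline `original_data` rows mirror the question's input, and storing the timestamps as strings is an assumption made only to keep the example self-contained. The explicit frame matters: with an `ORDER BY` the default frame stops at the current row, so the leading NULL rows would still come back as NULL.

```sql
-- BigQuery standard SQL. Sample rows mirror the input in the question;
-- the string timestamps are an assumption for illustration.
WITH original_data AS (
  SELECT 1 AS id, CAST(NULL AS STRING) AS `column`, '10:30 am' AS `timestamp` UNION ALL
  SELECT 1, NULL,  '10:31 am' UNION ALL
  SELECT 1, 'xyz', '10:32 am' UNION ALL
  SELECT 1, 'def', '10:33 am' UNION ALL
  SELECT 2, NULL,  '11:30 am' UNION ALL
  SELECT 2, 'abc', '11:31 am'
)
SELECT
  id,
  FIRST_VALUE(`column` IGNORE NULLS) OVER (
    PARTITION BY id
    ORDER BY `timestamp`
    -- Widen the frame to the whole partition so rows that come before
    -- the first non-NULL value can see it too.
    ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
  ) AS first_non_null,
  `timestamp`
FROM original_data
ORDER BY id, `timestamp`
```

This is intended to give 'xyz' for every row of id 1 and 'abc' for every row of id 2, matching the expected output in the question.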

4

Try this old string-manipulation trick:

SELECT
    ID,
    Column,
    ttimestamp,
    LTRIM(RIGHT(CColumn, 20)) AS CColumn
FROM (
    SELECT
        ID,
        Column,
        ttimestamp,
        MIN(CONCAT(RPAD(IF(Column IS NULL, '9999999999999999', STRING(ttimestamp)), 20, '0'), LPAD(Column, 20, ' '))) OVER (PARTITION BY ID) AS CColumn
    FROM (
        SELECT *
        FROM (SELECT 1 AS ID, STRING(NULL) AS Column, 0.4375 AS ttimestamp),
             (SELECT 1 AS ID, STRING(NULL) AS Column, 0.438194444444444 AS ttimestamp),
             (SELECT 1 AS ID, 'xyz' AS Column, 0.438888888888889 AS ttimestamp),
             (SELECT 1 AS ID, 'def' AS Column, 0.439583333333333 AS ttimestamp),
             (SELECT 2 AS ID, STRING(NULL) AS Column, 0.479166666666667 AS ttimestamp),
             (SELECT 2 AS ID, 'abc' AS Column, 0.479861111111111 AS ttimestamp)
    )
)
2

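You can modify your SQL like this to get the data you want: sort the non-null rows first, so that FIRST_VALUE picks the earliest non-null value.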

FIRST_VALUE(column) 
    OVER ( 
    PARTITION BY id 
    ORDER BY 
     CASE WHEN column IS NULL then 0 ELSE 1 END DESC, 
     timestamp 
) 
+0

MikeD, are you sure this query works? I am trying it and I get the error message: "In an analytic expression, ORDER BY must refer to a named column. Found CASE" – goRunToStack
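If the error really means that the window's ORDER BY only accepts named columns, one possible workaround is to compute the NULL flag in a subquery and sort on that column instead. This is an untested sketch: the alias `is_null` and the output name `first_non_null` are made up for illustration, and `original_data` is the table name assumed in the earlier answer.

```sql
-- Sketch: expose the NULL flag as a named column, then sort on it.
SELECT
  id,
  FIRST_VALUE(column) OVER (
    PARTITION BY id
    ORDER BY is_null, timestamp
  ) AS first_non_null,
  timestamp
FROM (
  SELECT
    id,
    column,
    timestamp,
    IF(column IS NULL, 1, 0) AS is_null  -- 0 sorts first, so non-NULL rows lead
  FROM original_data
)
```

Because the non-NULL rows sort to the front of each partition, the first row of the window is the earliest non-NULL value, and FIRST_VALUE returns it for every row of that id.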

0

SELECT id,
(SELECT TOP(1) column FROM test1 WHERE id = 1 AND column IS NOT NULL ORDER BY autoid DESC) AS name, timestamp FROM yourTable

Output :-
1,'xyz',10:30 am
1,'xyz',10:31 am
1,'xyz',10:32 am
1,'xyz',10:33 am
2,'abc',11:30 am
2,'abc',11:31 am
