2013-11-01 28 views

BigQuery SQL: IF over repeated records

Suppose I have a schema like the one shown in the BigQuery documentation:

   Last modified                  Schema                 Total Rows   Total Bytes   Expiration
 ----------------- ----------------------------------- ------------ ------------- ------------
  27 Sep 10:01:06   |- kind: string                     4            794
                    |- fullName: string (required)
                    |- age: integer
                    |- gender: string
                    +- phoneNumber: record
                    |  |- areaCode: integer
                    |  |- number: integer
                    +- children: record (repeated)
                    |  |- name: string
                    |  |- gender: string
                    |  |- age: integer
                    +- citiesLived: record (repeated)
                    |  |- place: string
                    |  +- yearsLived: integer (repeated)

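Suppose we have fullNames: John, Josh, Harry

citiesLived: newyork, chicago, seattle

How can I iterate over citiesLived and count with a condition? For example, I want to count how many users with fullName = John have lived in both citiesLived.place = newyork and citiesLived.place = chicago, but have not lived in citiesLived.place = seattle.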
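To pin down the condition being asked for, here is a minimal Python sketch that evaluates it on in-memory records shaped like the schema. The sample rows and the helper names `matches` and `count_matching` are hypothetical and only illustrate the intended semantics; this is not BigQuery code.

```python
# Hypothetical sample rows shaped like the schema above (only the fields used here).
people = [
    {"fullName": "John", "citiesLived": [{"place": "newyork"}, {"place": "chicago"}]},
    {"fullName": "John", "citiesLived": [{"place": "newyork"}, {"place": "chicago"},
                                         {"place": "seattle"}]},
    {"fullName": "Josh", "citiesLived": [{"place": "newyork"}, {"place": "chicago"}]},
    {"fullName": "Harry", "citiesLived": [{"place": "seattle"}]},
]


def matches(person, name, required, excluded):
    """True if the person has the given name, lived in every required
    place, and lived in none of the excluded places."""
    places = {city["place"] for city in person.get("citiesLived", [])}
    return (
        person["fullName"] == name
        and set(required) <= places        # all required places present
        and not (set(excluded) & places)   # no excluded place present
    )


def count_matching(rows, name, required, excluded):
    """Count the records that satisfy the whole condition."""
    return sum(1 for person in rows if matches(person, name, required, excluded))


result = count_matching(people, "John", ["newyork", "chicago"], ["seattle"])
print(result)
```

The point of the sketch is that the condition is evaluated once per user, over that user's whole list of cities, not once per city.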

Thanks, John

Answer


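You can use the OMIT IF keyword (written as OMIT RECORD IF in the query below). This is undocumented; I'll file a bug to make sure it gets documented.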

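SELECT COUNT(*) FROM (
    -- Flag each flattened row by the city it carries.
    SELECT fullName,
    IF (citiesLived.place = 'newyork', 1, 0) AS ny,
    IF (citiesLived.place = 'chicago', 1, 0) AS chi
    -- FLATTEN yields one row per citiesLived entry.
    FROM (FLATTEN(name_table, citiesLived))
    -- Drop the rows whose place is 'seattle'.
    OMIT RECORD IF citiesLived.place = 'seattle')
WHERE fullName = 'John'
    AND ny = 1
    AND chi = 1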
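One thing that may be worth verifying on real data: if FLATTEN produces one row per citiesLived entry, each row carries a single place, so flags computed per row can only combine once the rows are grouped back per user. The Python sketch below models that under stated assumptions. The `id` key, the sample records, and the `flatten` helper are hypothetical stand-ins, not BigQuery behavior or API.

```python
from collections import defaultdict

# Hypothetical nested records; "id" is an assumed key used to regroup rows.
records = [
    {"id": 1, "fullName": "John", "places": ["newyork", "chicago"]},
    {"id": 2, "fullName": "John", "places": ["newyork", "chicago", "seattle"]},
    {"id": 3, "fullName": "John", "places": ["newyork"]},
]


def flatten(rows):
    """Model of flattening: one output row per (record, place) pair."""
    return [
        {"id": r["id"], "fullName": r["fullName"], "place": p}
        for r in rows
        for p in r["places"]
    ]


flat = flatten(records)

# Row-level check: both flags come from the same single place,
# so no row can satisfy "newyork" and "chicago" at once.
row_level = sum(
    1 for row in flat
    if row["place"] == "newyork" and row["place"] == "chicago"
)

# Per-record check: regroup the flattened rows by id, then test the full set.
places_by_id = defaultdict(set)
for row in flat:
    if row["fullName"] == "John":
        places_by_id[row["id"]].add(row["place"])

per_record = sum(
    1 for places in places_by_id.values()
    if {"newyork", "chicago"} <= places and "seattle" not in places
)

print(row_level, per_record)
```

In this model only the per-record grouping counts the user who lived in both cities and never in seattle, which suggests checking whether the query needs a per-record aggregation to express the same condition.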