2017-04-21

Joining table results in Google's BigQuery

I have two SQL queries:

SELECT subreddit, count(subreddit) as count 
FROM [fh-bigquery:reddit_comments.all] 
where author="***********" GROUP by subreddit ORDER BY count DESC; 

SELECT subreddit, count(subreddit) as count 
FROM [redditcollaborativefiltering:aggregate_comments.reddit_posts_all] 
where author="***********" GROUP by subreddit ORDER BY count DESC; 

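I would like to combine the results of these two queries into a single result with the same columns, since the two tables complement each other. Is there a simple way to do this?

Answers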

1

For BigQuery Legacy SQL (which I see you are using in your example), you can use the following:

#legacySQL 
SELECT subreddit, SUM(cnt) as cnt 
FROM (SELECT subreddit, COUNT(subreddit) as cnt 
     FROM [fh-bigquery:reddit_comments.all] 
     WHERE author = '***********' 
     GROUP BY subreddit 
    ), 
     (SELECT subreddit, COUNT(subreddit) as cnt 
     FROM [redditcollaborativefiltering:aggregate_comments.reddit_posts_all] 
     WHERE author = '***********' 
     GROUP by subreddit 
    ) 
GROUP BY subreddit 
ORDER BY cnt DESC 

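As you can see here, the comma in Legacy SQL acts as UNION ALL: listing two subqueries (or tables) in FROM separated by a comma concatenates their rows.

The above can be further simplified: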

#legacySQL 
SELECT subreddit, COUNT(subreddit) as cnt 
FROM [fh-bigquery:reddit_comments.all], 
    [redditcollaborativefiltering:aggregate_comments.reddit_posts_all] 
WHERE author = '***********' 
GROUP BY subreddit 
ORDER BY cnt DESC 

You can read more about the comma as UNION ALL in the BigQuery Legacy SQL documentation.
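The simplification works because summing per-table counts gives the same totals as counting once over the concatenated rows. Here is a minimal sketch of that equivalence using Python's built-in sqlite3 module. The tables `comments` and `posts` and their rows are made-up stand-ins for the two Reddit tables. SQLite treats a comma in FROM as a cross join, so the union is spelled out with UNION ALL here.

```python
import sqlite3

# In-memory database with two small, hypothetical stand-in tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE comments (author TEXT, subreddit TEXT);
    CREATE TABLE posts (author TEXT, subreddit TEXT);
    INSERT INTO comments VALUES
        ('alice', 'python'), ('alice', 'python'), ('alice', 'sql'), ('bob', 'sql');
    INSERT INTO posts VALUES
        ('alice', 'sql'), ('alice', 'sql'), ('alice', 'sql'), ('bob', 'python');
""")

# Form 1: count per table, then add the partial counts together.
sum_of_counts = conn.execute("""
    SELECT subreddit, SUM(cnt) AS total
    FROM (SELECT subreddit, COUNT(subreddit) AS cnt
          FROM comments WHERE author = ? GROUP BY subreddit
          UNION ALL
          SELECT subreddit, COUNT(subreddit) AS cnt
          FROM posts WHERE author = ? GROUP BY subreddit)
    GROUP BY subreddit
    ORDER BY total DESC
""", ("alice", "alice")).fetchall()

# Form 2: concatenate the raw rows first, then filter and count once.
count_over_union = conn.execute("""
    SELECT subreddit, COUNT(subreddit) AS total
    FROM (SELECT author, subreddit FROM comments
          UNION ALL
          SELECT author, subreddit FROM posts)
    WHERE author = ?
    GROUP BY subreddit
    ORDER BY total DESC
""", ("alice",)).fetchall()

print(sum_of_counts)     # [('sql', 4), ('python', 2)]
print(count_over_union)  # [('sql', 4), ('python', 2)]
```

Both forms return the same rows, so the shorter one is usually preferable unless the per-table counts are needed separately.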

1

You can use UNION ALL and another aggregation:

SELECT subreddit, SUM(cnt) as cnt 
FROM ((SELECT subreddit, count(subreddit) as cnt 
     FROM [fh-bigquery:reddit_comments.all] 
     WHERE author = '***********' 
     GROUP BY subreddit 
    ) UNION ALL 
     (SELECT subreddit, count(subreddit) as cnt 
     FROM [redditcollaborativefiltering:aggregate_comments.reddit_posts_all] 
     WHERE author = '***********' 
     GROUP by subreddit 
    ) 
    ) sc 
GROUP BY subreddit 
ORDER BY cnt DESC;
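Note that the UNION ALL keyword belongs to BigQuery's standard SQL dialect, while the bracketed `[project:dataset.table]` names are legacy SQL syntax. In standard SQL, tables are written in backticks as `project.dataset.table`. The sketch below builds a standard SQL variant of the query as a string. The helper name `build_union_query` and the `@author` named parameter are assumptions for illustration, and actually running the query would need a BigQuery client, which is not shown.

```python
def build_union_query(tables, author_param="author"):
    """Build a standard SQL query that counts rows per subreddit in each
    table and then sums the partial counts (hypothetical helper)."""
    # One parenthesized sub-select per table, joined with UNION ALL.
    branches = "\n  UNION ALL\n".join(
        f"  (SELECT subreddit, COUNT(subreddit) AS cnt\n"
        f"   FROM `{table}`\n"
        f"   WHERE author = @{author_param}\n"
        f"   GROUP BY subreddit)"
        for table in tables
    )
    # The outer query re-aggregates the per-table counts.
    return (
        "#standardSQL\n"
        "SELECT subreddit, SUM(cnt) AS cnt\n"
        f"FROM (\n{branches}\n) sc\n"
        "GROUP BY subreddit\n"
        "ORDER BY cnt DESC"
    )


# Table names taken from the question, rewritten in standard SQL form.
query = build_union_query([
    "fh-bigquery.reddit_comments.all",
    "redditcollaborativefiltering.aggregate_comments.reddit_posts_all",
])
print(query)
```

Passing the author as a query parameter instead of pasting it into the string keeps the quoting simple. The value for `@author` would be supplied when the query is submitted.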