为BigQuery的传统的SQL
SELECT
keyword,
COUNT(1) AS rows,
SUM(INTEGER((LENGTH(Text) - LENGTH(REPLACE(Text, keyword, '')))/LENGTH(keyword))) AS occurences
FROM YourTable
CROSS JOIN keywords
WHERE Text CONTAINS keyword
GROUP BY keyword
实施例与
SELECT
keyword,
COUNT(1) AS rows,
SUM(INTEGER((LENGTH(Text) - LENGTH(REPLACE(Text, keyword, '')))/LENGTH(keyword))) AS occurences
FROM (
SELECT Text FROM
(SELECT 'facebookfacebookcnnbbccnn' AS Text),
(SELECT 'facebook' AS Text),
(SELECT 'cnn' AS Text)
) AS words
CROSS JOIN (
SELECT keyword FROM
(SELECT 'facebook' AS keyword),
(SELECT 'cnn' AS keyword),
(SELECT 'bbc' AS keyword)
) AS keywords
WHERE Text CONTAINS keyword
GROUP BY keyword
对于大量查询标准SQL播放(见Enabling Standard SQL)
SELECT
keyword,
COUNT(1) AS `rows`,
SUM((LENGTH(Text) - LENGTH(REPLACE(Text, keyword, '')))/LENGTH(keyword)) AS occurences
FROM YourTable
JOIN keywords
ON STRPOS(Text, keyword) > 0
GROUP BY keyword
实施例与
打
“Text LIKE CONCAT('%',keyword,'%')”是危险的,因为关键字可能包含需要转义的特殊字符。这也不是很高效。在这里使用更好的函数将是“STRPOS(Text,keyword)> 0” –
同意,更新! –
这完美的作品!谢谢米哈伊尔。另外 - 有没有办法让这个查询扫描两列的关键字?例如A列:文本,B列:Text_2 –