Loading nested arrays into BigQuery
I have data in JSON format that contains nested arrays. Here is an example:
"data": {"events": [[1, 1271, 518, 945], [1, 1287, 495, 963],...
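The sub-arrays can have length 3 or 4, and the first number is the type of the data (there are about 30 different types). Is there a way to load this data into BigQuery without converting the arrays into dictionaries ("records")?
Thanks, Yaron
- Edit -
There is this question, which has a workaround, but it assumes fixed-length sub-arrays, so I don't think it applies here.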
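You cannot load an array of arrays directly; you need to wrap the inner arrays in records. The Standard SQL reference touches on this (though in terms of the language itself, not loading data): https://cloud.google.com/bigquery/sql-reference/arrays#building-arrays-of-arrays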
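To make "wrap the inner arrays in records" concrete, here is a minimal Python sketch of a pre-processing step. It assumes the input objects have the data.events shape shown in the question; the field name `values` and the helper names are hypothetical, not something BigQuery requires. With this shape, `events` could be declared as a repeated RECORD containing one repeated INTEGER field.

```python
import json

def wrap_inner_arrays(obj, field="values"):
    """Wrap each inner array of data.events in a one-field record.

    The field name is a hypothetical choice; any valid column name works.
    """
    events = obj["data"]["events"]
    return {"data": {"events": [{field: list(sub)} for sub in events]}}

def to_ndjson(objects):
    """Serialize the wrapped objects as newline-delimited JSON, one row per line."""
    return "\n".join(json.dumps(wrap_inner_arrays(o)) for o in objects)

# Sample taken from the question, plus a made-up length-3 sub-array.
sample = {"data": {"events": [[1, 1271, 518, 945], [1, 1287, 495, 963], [2, 7, 8]]}}
ndjson = to_ndjson([sample])
```

Because the inner field is repeated, sub-arrays of length 3 and 4 fit the same schema without padding.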
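Thanks for your answer. I am trying to avoid having to transform the data into records/structs. Is there a way to upload this nested array as a string/text blob? – WeaselFox
This past question may help: http://stackoverflow.com/questions/37660579/bigquery-create-column-of-json-datatype You can load it as a string and then use BigQuery's JSON functions to extract the parts you want as part of your query.
This might be the wrong direction, since it is not entirely clear what your final goal is, but let me try to help.
My guess is that your target table is expected to look something like this: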
type metric1 metric2 metric3
1 1271 518 945
1 1287 495 963
So my recommendation is to do this in two steps.
Step 1 - Load the data as CSV with just one field. Assume a table theTable with a single field data:
data
{"data": {"events": [[1, 1271, 518, 945], [1, 1287, 495, 963]]}}
{"data": {"events": [[2, 111, 222, 333], [3, 444, 555, 666], [4, 777, 888, 999]]}}
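One way to prepare such a one-column file is sketched below in Python. This is an assumption about the load step, not part of the original answer: it quotes every field so that the commas inside the JSON are not treated as column separators, and it writes no header row. The function name is hypothetical.

```python
import csv
import io
import json

def to_single_column_csv(objects):
    """Write each JSON object as one fully quoted CSV field, one row per object."""
    buf = io.StringIO()
    # QUOTE_ALL wraps the field in double quotes and doubles any inner quotes.
    writer = csv.writer(buf, quoting=csv.QUOTE_ALL, lineterminator="\n")
    for obj in objects:
        writer.writerow([json.dumps(obj)])
    return buf.getvalue()

rows_in = [
    {"data": {"events": [[1, 1271, 518, 945], [1, 1287, 495, 963]]}},
    {"data": {"events": [[2, 111, 222, 333], [3, 444, 555, 666], [4, 777, 888, 999]]}},
]
csv_text = to_single_column_csv(rows_in)
```

An alternative with the same effect is to load the raw lines with a field delimiter that never occurs in the data.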
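Step 2 - Process theTable to produce the expected schema (see the top of this answer) and save the result to the final table. You can use the query below for this: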
SELECT
  NTH(1, SPLIT(y)) AS type,
  NTH(2, SPLIT(y)) AS metric1,
  NTH(3, SPLIT(y)) AS metric2,
  NTH(4, SPLIT(y)) AS metric3,
FROM (
  SELECT
    REPLACE(REPLACE(COALESCE(y0, y1, y2, y3, y4, y5, y6), '[', ''), ']', '') AS y
  FROM (
    SELECT
      IF(k=0, JSON_EXTRACT(data, '$.data.events[0]'), NULL) AS y0,
      IF(k=1, JSON_EXTRACT(data, '$.data.events[1]'), NULL) AS y1,
      IF(k=2, JSON_EXTRACT(data, '$.data.events[2]'), NULL) AS y2,
      IF(k=3, JSON_EXTRACT(data, '$.data.events[3]'), NULL) AS y3,
      IF(k=4, JSON_EXTRACT(data, '$.data.events[4]'), NULL) AS y4,
      IF(k=5, JSON_EXTRACT(data, '$.data.events[5]'), NULL) AS y5,
      IF(k=6, JSON_EXTRACT(data, '$.data.events[6]'), NULL) AS y6,
    FROM theTable AS a
    CROSS JOIN (
      SELECT k FROM (SELECT 0 AS k), (SELECT 1 AS k), (SELECT 2 AS k),
        (SELECT 3 AS k), (SELECT 4 AS k), (SELECT 5 AS k), (SELECT 6 AS k)
    ) AS b
  )
  HAVING NOT y IS NULL
)
The result will be:
type metric1 metric2 metric3
1 1271 518 945
1 1287 495 963
2 111 222 333
3 444 555 666
4 777 888 999
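As you can see, this particular query supports up to 7 sub-arrays per row, but you can decrease or increase that limit by changing the query in three places: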
#1
REPLACE(REPLACE(COALESCE(y0, y1, y2, y3, y4, y5, y6), '[', ''), ']', '') AS y
#2
IF(k=0, JSON_EXTRACT(data, '$.data.events[0]'), NULL) AS y0,
IF(k=1, JSON_EXTRACT(data, '$.data.events[1]'), NULL) AS y1,
IF(k=2, JSON_EXTRACT(data, '$.data.events[2]'), NULL) AS y2,
IF(k=3, JSON_EXTRACT(data, '$.data.events[3]'), NULL) AS y3,
IF(k=4, JSON_EXTRACT(data, '$.data.events[4]'), NULL) AS y4,
IF(k=5, JSON_EXTRACT(data, '$.data.events[5]'), NULL) AS y5,
IF(k=6, JSON_EXTRACT(data, '$.data.events[6]'), NULL) AS y6,
#3
SELECT k FROM (SELECT 0 AS k), (SELECT 1 AS k), (SELECT 2 AS k),
(SELECT 3 AS k), (SELECT 4 AS k), (SELECT 5 AS k), (SELECT 6 AS k)
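Since the three places have to stay in sync, one option is to generate the query text instead of editing it by hand. The Python sketch below is only an illustration: `build_query` and its parameters are hypothetical names, and it simply reproduces the query from this answer for a chosen maximum number of sub-arrays.

```python
def build_query(max_events, table="theTable"):
    """Return the query text for rows holding up to max_events sub-arrays."""
    ks = range(max_events)
    # Place #1: the COALESCE argument list.
    coalesce_args = ", ".join(f"y{k}" for k in ks)
    # Place #2: one JSON_EXTRACT column per sub-array position.
    extracts = "\n".join(
        f"      IF(k={k}, JSON_EXTRACT(data, '$.data.events[{k}]'), NULL) AS y{k},"
        for k in ks
    )
    # Place #3: the list of k values that is cross joined with the table.
    k_values = ", ".join(f"(SELECT {k} AS k)" for k in ks)
    return (
        "SELECT\n"
        "  NTH(1, SPLIT(y)) AS type,\n"
        "  NTH(2, SPLIT(y)) AS metric1,\n"
        "  NTH(3, SPLIT(y)) AS metric2,\n"
        "  NTH(4, SPLIT(y)) AS metric3,\n"
        "FROM (\n"
        "  SELECT\n"
        f"    REPLACE(REPLACE(COALESCE({coalesce_args}), '[', ''), ']', '') AS y\n"
        "  FROM (\n"
        "    SELECT\n"
        f"{extracts}\n"
        f"    FROM {table} AS a\n"
        "    CROSS JOIN (\n"
        f"      SELECT k FROM {k_values}\n"
        "    ) AS b\n"
        "  )\n"
        "  HAVING NOT y IS NULL\n"
        ")"
    )

query = build_query(7)
```

Calling `build_query(7)` should give the same query as above, apart from whitespace; a larger argument widens all three places at once.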
Finally, to test just the transformation logic without loading any actual data, you can use the script below:
SELECT
  NTH(1, SPLIT(y)) AS type,
  NTH(2, SPLIT(y)) AS metric1,
  NTH(3, SPLIT(y)) AS metric2,
  NTH(4, SPLIT(y)) AS metric3,
FROM (
  SELECT
    REPLACE(REPLACE(COALESCE(y0, y1, y2, y3, y4, y5, y6), '[', ''), ']', '') AS y
  FROM (
    SELECT
      IF(k=0, JSON_EXTRACT(data, '$.data.events[0]'), NULL) AS y0,
      IF(k=1, JSON_EXTRACT(data, '$.data.events[1]'), NULL) AS y1,
      IF(k=2, JSON_EXTRACT(data, '$.data.events[2]'), NULL) AS y2,
      IF(k=3, JSON_EXTRACT(data, '$.data.events[3]'), NULL) AS y3,
      IF(k=4, JSON_EXTRACT(data, '$.data.events[4]'), NULL) AS y4,
      IF(k=5, JSON_EXTRACT(data, '$.data.events[5]'), NULL) AS y5,
      IF(k=6, JSON_EXTRACT(data, '$.data.events[6]'), NULL) AS y6,
    FROM (
      SELECT data FROM
        (SELECT '{"data": {"events": [[1, 1271, 518, 945], [1, 1287, 495, 963]]}}' AS data),
        (SELECT '{"data": {"events": [[2, 111, 222, 333], [3, 444, 555, 666], [4, 777, 888, 999]]}}' AS data)
    ) AS a
    CROSS JOIN (
      SELECT k FROM (SELECT 0 AS k), (SELECT 1 AS k), (SELECT 2 AS k),
        (SELECT 3 AS k), (SELECT 4 AS k), (SELECT 5 AS k), (SELECT 6 AS k)
    ) AS b
  )
  HAVING NOT y IS NULL
)
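If you want to sanity-check the expected rows outside BigQuery, the same flattening can be mirrored locally. The Python sketch below is an illustration only (the function name and the `width` parameter are hypothetical). It returns integers rather than the strings that SPLIT produces, and it pads a length-3 sub-array with None, which is an assumption about how you would want the missing fourth value represented.

```python
import json

def events_to_rows(lines, width=4):
    """Flatten data.events of each JSON line into fixed-width row tuples."""
    rows = []
    for line in lines:
        for sub in json.loads(line)["data"]["events"]:
            # Pad short sub-arrays so every row has the same number of columns.
            padded = list(sub) + [None] * (width - len(sub))
            rows.append(tuple(padded))
    return rows

# The two sample rows used in the test script above.
lines = [
    '{"data": {"events": [[1, 1271, 518, 945], [1, 1287, 495, 963]]}}',
    '{"data": {"events": [[2, 111, 222, 333], [3, 444, 555, 666], [4, 777, 888, 999]]}}',
]
rows = events_to_rows(lines)
```

The resulting tuples correspond to the type, metric1, metric2 and metric3 columns of the result table, in input order; BigQuery itself does not guarantee row order without an ORDER BY.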
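Hope this helps!
Upvoted for the fantastic logic and effort – BigDaddy
The expected final table is not clear - please give an example!
Important: you can mark an answer as accepted by using the tick to the left of the posted answer, below the voting arrows. See http://meta.stackexchange.com/questions/5234/how-does-accepting-an-answer-work#5235 for why this matters. Voting on answers matters too: vote up the answers that are helpful. There is more; you can check what to do when someone answers your question at http://stackoverflow.com/help/someone-answers.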