2014-03-06

Pig JsonLoader problem: custom JSON is not parsed correctly

I am new to Pig, and I am trying to parse the following JSON structure:

{"id1":197,"id2":[ 
    {"id3":"109.11.11.0","id4":"","id5":1391233948301}, 
    {"id3":"10.10.15.81","id4":"","id5":1313393100648}, 
    ... 
]} 

The file above is named jsonfile.txt. I load it like this:

alias = load 'jsonfile.txt' using JsonLoader('id1:int,id2:[id3:chararray,id4:chararray,id5:chararray]'); 

This is the error I get:

ERROR org.apache.pig.tools.grunt.Grunt - ERROR 1200: mismatched input 'id3' expecting RIGHT_BRACKET

Do you know what I might be doing wrong?

Try validating the whole JSON [here](http://jsonlint.com/). Maybe it is just a trailing comma. – kirilloid

I just checked, and the JSON is well-formed. – user1386101

Answer

The schema you pass to JsonLoader is not formatted correctly.

The format for complex data types is as follows:

Tuple: enclosed by (), items separated by ","
    Non-empty tuple: (item1,item2,item3)
    Empty tuple is valid: ()
Bag: enclosed by {}, tuples separated by ","
    Non-empty bag: {(tuple1),(tuple2),(tuple3)}
    Empty bag is valid: {}
Map: enclosed by [], items separated by ",", key and value separated by "#"
    Non-empty map: [key1#value1,key2#value2]
    Empty map is valid: []

Source: http://pig.apache.org/docs/r0.10.0/func.html#jsonloadstore

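In other words, [] does not denote an array in a Pig schema. It denotes a map (an associative table), in which "#" separates each key from its value. That is why the parser rejects id3 inside the brackets. Try declaring id2 as a tuple (parentheses) instead: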

'id1:int,id2:(id3:chararray,id4:chararray,id5:chararray)' 

or as a bag of tuples:

'id1:int,id2:{(id3:chararray,id4:chararray,id5:chararray)}' 

I cannot test this, since I have never used Pig, but according to the documentation it should work fine.

(This is based on the following example from the documentation.)

a = load 'a.json' using JsonLoader('a0:int,a1:{(a10:int,a11:chararray)},a2:(a20:double,a21:bytearray),a3:[chararray]');
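
As a rough check on which of the two suggested schemas fits this particular file, it can help to look at how the JSON nests. The sketch below is a hypothetical helper written in Python, not part of Pig. It parses a sample record and prints a schema string in the JsonLoader style, assuming that a JSON object corresponds to a tuple and a JSON array of objects corresponds to a bag of tuples, as the documentation example above suggests. The names `pig_type` and `pig_schema` are made up for illustration, and the elided "..." entries of the original file are left out of the sample.

```python
import json

# Sample record shaped like jsonfile.txt (only the two entries shown in the question).
SAMPLE = '''{"id1":197,"id2":[
    {"id3":"109.11.11.0","id4":"","id5":1391233948301},
    {"id3":"10.10.15.81","id4":"","id5":1313393100648}
]}'''

INT_MIN, INT_MAX = -2**31, 2**31 - 1  # range of a 32-bit signed int


def pig_schema(obj):
    """Render the fields of a JSON object as 'name:type,name:type,...'."""
    return ",".join(f"{name}:{pig_type(value)}" for name, value in obj.items())


def pig_type(value):
    """Guess a Pig type string for a parsed JSON value (assumed mapping)."""
    if isinstance(value, bool):  # must come before int: bool is a subclass of int
        return "boolean"
    if isinstance(value, int):
        return "int" if INT_MIN <= value <= INT_MAX else "long"
    if isinstance(value, float):
        return "double"
    if isinstance(value, str):
        return "chararray"
    if isinstance(value, dict):
        # JSON object -> tuple, written with parentheses
        return "(" + pig_schema(value) + ")"
    if isinstance(value, list) and value and isinstance(value[0], dict):
        # JSON array of objects -> bag of tuples; only the first item is inspected
        return "{" + pig_type(value[0]) + "}"
    raise TypeError(f"no mapping assumed for {value!r}")


print(pig_schema(json.loads(SAMPLE)))
# prints: id1:int,id2:{(id3:chararray,id4:chararray,id5:long)}
```

Under these assumptions the bag form is the one that matches, because id2 is an array of objects. The sketch also reports `long` for id5, since the values do not fit in a 32-bit int; the `chararray` in the question's schema is the asker's own choice and is left as it is above.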