2016-03-09 40 views

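Apache Pig: convert rows into a single character-delimited column

I need to collapse the VALUE column into a single row per CITY, with the values separated by the "|" (pipe) character.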

DATA = LOAD '/tmp/test.dat' USING PigStorage(',') AS (CITY:chararray, VALUE:chararray);

Input (CITY,VALUE):

ISTANBUL,1

ISTANBUL,2

ISTANBUL,3

NEWYORK,8

NEWYORK,9

Expected output:

ISTANBUL,1 | 2 | 3

NEWYORK,8 | 9

Answer


First GROUP by CITY, then use BagToString (http://pig.apache.org/docs/r0.15.0/func.html#bagtostring) to turn each group's values into the string representation you want. Something like this (untested!):

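-- Load the comma-separated (city, value) pairs.
data = LOAD '/tmp/test.dat' USING PigStorage(',') AS (city:chararray, value:chararray);
-- Collect all rows that share a city into one bag.
data_grp = GROUP data BY city;
-- Join each city's values into a single '|'-delimited string.
result = FOREACH data_grp GENERATE group AS city, BagToString(data.value, '|') AS values;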
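
If the order of the values matters, or you want the spaces around the pipe shown in the expected output, a nested FOREACH is one way to get there. The following is a sketch only and, like the script above, untested. It assumes that sorting each bag with a nested ORDER before calling BagToString is enough to fix the order, since Pig does not otherwise promise any ordering inside a grouped bag. The alias `sorted` and the ' | ' delimiter are my own choices, not part of the question.

```pig
-- Load the comma-separated (city, value) pairs.
data = LOAD '/tmp/test.dat' USING PigStorage(',') AS (city:chararray, value:chararray);

-- Collect all rows that share a city into one bag.
data_grp = GROUP data BY city;

-- Sort each city's bag, then join its values with ' | '.
-- 'sorted' is a hypothetical alias; value is a chararray, so the sort is lexical
-- ('10' sorts before '9'). Load value as int if numeric order is needed.
result = FOREACH data_grp {
    sorted = ORDER data BY value;
    GENERATE group AS city, BagToString(sorted.value, ' | ') AS values;
};

DUMP result;
```

To write the result to a file instead of printing it, replace DUMP with a STORE statement and the output path of your choice.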

Works like a charm! Thanks.
