Two-level Pig processing: how do I merge two Pig statements into one?
my_out = foreach (group my_in by id) {
    grouped = BagGroup(my_in.(keyword,weight),my_in.keyword);
    generate
        group as id,
        CountEach(my_in.domain) as domains,
        grouped as grouped;
};
my_out1 = foreach my_out {
    keywords = foreach grouped generate group as keyword, SUM($1.weight) as weight;
    generate id, domains, keywords;
};
However, when I merge them:
my_out = foreach (foreach (group my_in by id) {
    grouped = BagGroup(my_in.(keyword,weight),my_in.keyword);
    generate
        group as id,
        CountEach(my_in.domain) as domains,
        grouped as grouped;
}) {
    keywords = foreach grouped generate group as keyword, SUM($1.weight) as weight;
    generate id, domains, keywords;
};
I get an error:
ERROR org.apache.pig.tools.grunt.Grunt - ERROR 1000: Error during parsing. Encountered " <IDENTIFIER> "generate "" at line 1, column 5.
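My questions are:
- How do I avoid this error?
- Does what I am trying to do even make sense? Even if I manage to merge the statements, will it save me an MR pass?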
With your code I get 'ERROR 1000: Error during parsing. Lexical error at line 25, column 0. Encountered: after : ""' –
sds
Darn. Then you may be out of luck. But rest assured, keeping the statements separate will not add any map-reduce jobs. –
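One way to check the claim about map-reduce jobs, as a minimal sketch: Pig's EXPLAIN operator prints the logical, physical and MapReduce plans for an alias, so the number of jobs can be read off the MapReduce plan. This assumes my_in, my_out and my_out1 are already defined in the session exactly as in the two-statement version of the question.

```pig
-- Assumes my_in, my_out and my_out1 are defined as in the question
-- (two-statement version), including the BagGroup and CountEach UDFs.
-- EXPLAIN prints the logical, physical and MapReduce plans for the alias;
-- the MapReduce plan section shows how many jobs the script compiles to.
EXPLAIN my_out1;
```

If the MapReduce plan lists a single job for my_out1, the second foreach has been folded into the same job as the group, and merging the statements by hand would not save a pass.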