2016-05-17

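I have two relations with defined schemas. I want to find only the records in relation A that have no match in relation B (see the middle-left visualization in this post). How do I write a left outer join with a WHERE clause in Pig Latin?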

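I have tried the two variants below, but neither works: both fail with the following error. How can I perform this kind of operation in Pig?

ERROR 1200: mismatched input 'WHERE' expecting SEMI_COLON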

join_result = JOIN relationA by (project_id, sequence_id) LEFT OUTER, relationB by (project_id, sequence_id) WHERE relationB (project_id, sequence_id)is null; 

join_result = JOIN relationA by (project_id, sequence_id) LEFT OUTER, relationB by (project_id, sequence_id) WHERE (relationB.project_id is null) AND (relationB.sequence_id is null); 

Answers

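Pig's JOIN has no WHERE clause. To eliminate records based on a condition, apply a FILTER to the result of the join. Note that after a JOIN, fields are referenced with the :: disambiguation operator (relationB::project_id), not with a dot.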

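-- Left outer join keeps every relationA row; unmatched rows get nulls for relationB's fields.
join_result = JOIN relationA BY (project_id, sequence_id) LEFT OUTER, relationB BY (project_id, sequence_id);
-- Keep only the rows that found no match in relationB.
final_result = FILTER join_result BY (relationB::project_id IS NULL AND relationB::sequence_id IS NULL);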
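
For completeness, here is a minimal end-to-end sketch of the same join-then-filter pattern. The file names, the tab delimiter, and the chararray field types are assumptions made for illustration, as are the aliases joined, only_in_a, and result; none of them come from the question. The last step projects only relationA's columns, since relationB's columns are all null after the filter.

```pig
-- Hypothetical inputs: replace the paths, delimiter, and types with your own.
relationA = LOAD 'relationA.tsv' USING PigStorage('\t')
            AS (project_id:chararray, sequence_id:chararray);
relationB = LOAD 'relationB.tsv' USING PigStorage('\t')
            AS (project_id:chararray, sequence_id:chararray);

-- Left outer join: every relationA row survives; rows without a
-- partner carry nulls in the relationB fields.
joined = JOIN relationA BY (project_id, sequence_id) LEFT OUTER,
              relationB BY (project_id, sequence_id);

-- Keep only the relationA rows that had no match in relationB.
only_in_a = FILTER joined BY relationB::project_id IS NULL
                         AND relationB::sequence_id IS NULL;

-- Drop the all-null relationB columns and restore the plain field names.
result = FOREACH only_in_a GENERATE
             relationA::project_id  AS project_id,
             relationA::sequence_id AS sequence_id;

DUMP result;
```

Because null join keys never match in Pig, any relationA row whose own keys are null will also appear in the output; filter those out beforehand if that is not what you want.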