2017-04-17 34 views

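I have a table that lists a person's name and the store where they went shopping. For each person, I want to find the store they visited most often. How do I do a group-wise count in Pig?

For example, in the file below, if Costco is the store Alan visited most often, the output should contain his name, the store name, and the number of times he went there. I need this count for every person in the file given below.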

Alan Costco 
Ryan Walmart 
Jim Costco 
Steve WholeFoods 
Ryan WholeFoods 
Jim Walmart 
Alan Costco 
Ryan Walmart 
Jim Costco 
Steve WholeFoods 
Ryan WholeFoods 
Jim Walmart 
Alan Costco 
Ryan Walmart 
Jim Costco 
Steve WholeFoods 
Ryan WholeFoods 
Jim Walmart 
Alan Costco 
Ryan Walmart 
Jim Costco 
Steve WholeFoods 
Ryan WholeFoods 
Jim Walmart 
Alan Costco 
Ryan Walmart 
Jim Costco 
Steve WholeFoods 
Ryan WholeFoods 
Jim Walmart 
Alan Walmart 
Jim WholeFoods 
Ryan Costco 
Steve Walmart 

Answers

0
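-- the sample data is space-delimited, so split on ' ' rather than ','
shopdata = LOAD 'path_of_filename' USING PigStorage(' ') AS (name:chararray, mall:chararray);

-- one group per (customer, store) pair
groupdata = GROUP shopdata BY (name, mall);

-- number of visits for each pair
reqdata = FOREACH groupdata GENERATE group.name AS customer, group.mall AS shopping_mall, COUNT(shopdata) AS visits;

DUMP reqdata;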
0

See the explanation of COUNT here.

A = LOAD 'file_path/test36.txt' USING PigStorage(' ') AS (a1 : chararray, a2 : chararray); 
B = GROUP A BY (a1,a2); 
C = FOREACH B GENERATE group,COUNT(A.a2) AS Total; 
DUMP C; 
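
Counting per (name, store) pair gives the number of visits, but the question also asks for the single most-visited store per person. A minimal sketch of that extra step, assuming the same space-delimited two-column input; the aliases `visits`, `per_person`, `ranked`, `best` and `top_store` are illustrative names, not from the original posts:

```pig
-- load the space-delimited (name, store) pairs
shopdata = LOAD 'path_of_filename' USING PigStorage(' ') AS (name:chararray, mall:chararray);

-- count visits for each (name, store) pair
pairs = GROUP shopdata BY (name, mall);
visits = FOREACH pairs GENERATE group.name AS name, group.mall AS mall, COUNT(shopdata) AS n;

-- regroup by person and keep only the row with the highest count
per_person = GROUP visits BY name;
top_store = FOREACH per_person {
    ranked = ORDER visits BY n DESC;
    best = LIMIT ranked 1;
    GENERATE FLATTEN(best);
};

DUMP top_store;
```

With the sample data, Alan's top store would be Costco with 5 visits. Note that `LIMIT 1` keeps an arbitrary row when two stores tie for a person (Ryan and Jim each have two stores with 5 visits here), so a different approach would be needed if all tied stores should be reported.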
