2017-06-19

How to roll up events into metadata from the original DataFrame

I have raw data that looks like this:

Name,Report_ID,Amount,Flag,Actions 
Fizz,123,5,,A 
Fizz,123,10,Y,A 
Buzz,456,10,,B 
Buzz,456,40,,C 
Buzz,456,70,,D 
Bazz,678,100,Y,F 

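From this individual event data, I want to create a new DataFrame that captures various statistics/metadata for each name: mostly sums of items, counts of entries, and counts of unique entries. I would like the output DataFrame to look like this: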

Report_ID,Number of Flags,Number of Entries,Total,Unique Actions
123,1,2,15,1 
456,0,3,120,3 
678,1,1,100,1 

I tried using groupby, but I can't merge the separate grouped objects back together correctly. So far I have tried:

totals = raw_data.groupby('Report_ID')['Amount'].sum() 
event_count = raw_data.groupby('Report_ID').size() 
num_actions = raw_data.groupby('Report_ID').Actions.nunique() 

output = pd.concat([totals,event_count,num_actions]) 

When I try this I get TypeError: cannot concatenate a non-NDFrame object. Any help would be appreciated!

Answers


You can use agg on the groupby, passing a dict that maps each column to the aggregation(s) to apply:

f = dict(Flag=['count', 'size'], Amount='sum', Actions='nunique') 
df.groupby('Report_ID').agg(f) 

           Flag      Amount Actions
          count size    sum nunique
Report_ID
123           1    2     15       1
456           0    3    120       3
678           1    1    100       1
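If you want flat column names close to the ones in the question rather than the two-level header above, named aggregation is one option. This is only a sketch: it assumes pandas 0.25 or later, rebuilds the sample data from the question with read_csv (so empty Flag cells become NaN), and the output column names (num_flags, num_entries, total, unique_actions) are made up for illustration.

```python
import io
import pandas as pd

# Sample data copied from the question; empty Flag cells are read as NaN.
csv_text = """Name,Report_ID,Amount,Flag,Actions
Fizz,123,5,,A
Fizz,123,10,Y,A
Buzz,456,10,,B
Buzz,456,40,,C
Buzz,456,70,,D
Bazz,678,100,Y,F
"""
raw_data = pd.read_csv(io.StringIO(csv_text))

# Named aggregation: output_column=(input_column, aggregation).
# The output column names here are illustrative, not from the question.
summary = raw_data.groupby("Report_ID").agg(
    num_flags=("Flag", "count"),            # count skips NaN, so only set flags are counted
    num_entries=("Flag", "size"),           # size counts every row in the group
    total=("Amount", "sum"),
    unique_actions=("Actions", "nunique"),
)
print(summary)
```

Calling summary.reset_index() afterwards turns Report_ID back into an ordinary column, if that is the layout you need.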

When concatenating, you just need to specify axis=1 so that the Series are placed side by side as columns:

event_count.name = 'Event Count'  # size() returns an unnamed Series, so give it a name to use as the column label
pd.concat([totals, event_count, num_actions], axis=1)

           Amount  Event Count  Actions
Report_ID
123            15            2        1
456           120            3        3
678           100            1        1
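The concat approach can also be extended to cover the flag count and to use the column names from the desired output. A self-contained sketch follows, again assuming the data is loaded with read_csv so that empty Flag cells are NaN; the variable names grouped and pieces are illustrative.

```python
import io
import pandas as pd

# Sample data copied from the question; empty Flag cells are read as NaN.
csv_text = """Name,Report_ID,Amount,Flag,Actions
Fizz,123,5,,A
Fizz,123,10,Y,A
Buzz,456,10,,B
Buzz,456,40,,C
Buzz,456,70,,D
Bazz,678,100,Y,F
"""
raw_data = pd.read_csv(io.StringIO(csv_text))
grouped = raw_data.groupby("Report_ID")

# Each piece is a Series indexed by Report_ID; rename() sets the Series
# name, which concat then uses as the column label.
pieces = [
    grouped["Flag"].count().rename("Number of Flags"),
    grouped.size().rename("Number of Entries"),
    grouped["Amount"].sum().rename("Total"),
    grouped["Actions"].nunique().rename("Unique Actions"),
]

# axis=1 aligns the Series on their shared index and joins them as columns.
output = pd.concat(pieces, axis=1)
print(output)
```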