2015-10-21 90 views

回答

41

您可以使用GROUPBY的size

In [11]: df.groupby(["Group", "Size"]).size() 
Out[11]: 
Group  Size 
Moderate Medium 1 
      Small  1 
Short  Small  2 
Tall  Large  1 
dtype: int64 

In [12]: df.groupby(["Group", "Size"]).size().reset_index(name="Time") 
Out[12]: 
     Group Size Time 
0 Moderate Medium  1 
1 Moderate Small  1 
2  Short Small  2 
3  Tall Large  1 
+0

感谢。根据频率(“时间”)选择前k(= 20)个值的小增加值:df.groupby([“Group”,“Size”])。size()。reset_index(name =“Time”) .sort_values(由= '时间',升序=假)。头(20); –

10

您也可以尝试pd.crosstab()

Group   Size 

Short   Small 
Short   Small 
Moderate  Medium 
Moderate  Small 
Tall   Large 

pd.crosstab(df.Group,df.Size) 


Size  Large Medium Small 
Group       
Moderate  0  1  1 
Short   0  0  2 
Tall   1  0  0 

编辑:为了让你出把

pd.crosstab(df.Group,df.Size).replace(0,np.nan).\ 
    stack().reset_index().rename(columns={0:'Time'}) 
Out[591]: 
     Group Size Time 
0 Moderate Medium 1.0 
1 Moderate Small 1.0 
2  Short Small 2.0 
3  Tall Large 1.0 
+1

不错。你甚至可以添加'边距=真'来获得边际计数! –

相关问题