2017-02-05 23 views
import pandas as pd 
olympics = pd.read_csv('olympics.csv') 

    Edition NOC Medal 
0  1896 AUT Silver 
1  1896 FRA Gold 
2  1896 GER Gold 
3  1900 HUN Bronze 
4  1900 GBR Gold 
5  1900 DEN Bronze 
6  1900 USA Gold 
7  1900 FRA Bronze 
8  1900 FRA Silver 
9  1900 USA Gold 
10  1900 FRA Silver 
11  1900 GBR Gold 
12  1900 SUI Silver 
13  1900 ZZX Gold 
14  1904 HUN Gold 
15  1904 USA Bronze 
16  1904 USA Gold 
17  1904 USA Silver 
18  1904 CAN Gold 
19  1904 USA Silver 

Python pandas pivot with values equal to a simple function of a specific column

I am able to pivot this DataFrame using an aggregate value:

pivot = olympics.pivot_table(index='Edition', columns='NOC', values='Medal', aggfunc='count') 

NOC  AUT CAN DEN FRA GBR GER HUN SUI USA ZZX 
Edition             
1896  1.0 NaN NaN 1.0 NaN 1.0 NaN NaN NaN NaN 
1900  NaN NaN 1.0 3.0 2.0 NaN 1.0 1.0 2.0 1.0 
1904  NaN 1.0 NaN NaN NaN NaN 1.0 NaN 4.0 NaN 

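Instead of having the total number of medals as the values, I am interested in having a tuple (a triple) of (#Gold, #Silver, #Bronze), with (0, 0, 0) in place of NaN.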

How can I do this succinctly and elegantly?

It is not necessary to use pivot_table, since pivot is perfectly fine with a tuple as a value.

Answer

  • use value_counts to count all medals
  • build a MultiIndex with all combinations of edition, country, and medal
  • reindex with fill_value=0

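# Count each medal type per (Edition, NOC); olympics is the DataFrame from the question
counts = olympics.groupby(['Edition', 'NOC']).Medal.value_counts()

# Build every (Edition, NOC, Medal) combination from the existing index levels
mux = pd.MultiIndex.from_product(
    [c.values for c in counts.index.levels], names=counts.index.names)
# Combinations that never occur get a count of 0; medals then become columns
counts = counts.reindex(mux, fill_value=0).unstack('Medal')
counts = counts[['Bronze', 'Silver', 'Gold']]

# Turn each row into a (Bronze, Silver, Gold) tuple and move NOC into the columns
pd.Series([tuple(l) for l in counts.values.tolist()], counts.index).unstack()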
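
For anyone who wants to try this without olympics.csv, here is a minimal self-contained sketch of the same idea. It is only an illustration under a few assumptions: the DataFrame is built inline from a handful of made-up rows in the same three-column layout, the names medal_order, full_index, wide, triples and result are my own, and the tuples are ordered (Gold, Silver, Bronze) as the question asks rather than (Bronze, Silver, Gold).

```python
import pandas as pd

# Small inline sample in the same layout as olympics.csv (illustrative rows only)
rows = [
    (1896, 'AUT', 'Silver'), (1896, 'FRA', 'Gold'), (1896, 'GER', 'Gold'),
    (1900, 'HUN', 'Bronze'), (1900, 'GBR', 'Gold'), (1900, 'FRA', 'Bronze'),
    (1900, 'FRA', 'Silver'), (1900, 'FRA', 'Silver'), (1904, 'USA', 'Gold'),
]
olympics = pd.DataFrame(rows, columns=['Edition', 'NOC', 'Medal'])

# Order of the counts inside each tuple (an assumption taken from the question)
medal_order = ['Gold', 'Silver', 'Bronze']

# Number of rows for each (Edition, NOC, Medal) combination that occurs
counts = olympics.groupby(['Edition', 'NOC', 'Medal']).size()

# Full grid of combinations, so that missing ones can be filled with 0
full_index = pd.MultiIndex.from_product(
    [sorted(olympics['Edition'].unique()),
     sorted(olympics['NOC'].unique()),
     medal_order],
    names=['Edition', 'NOC', 'Medal'])
counts = counts.reindex(full_index, fill_value=0)

# One row per (Edition, NOC), one column per medal, in the chosen order
wide = counts.unstack('Medal')[medal_order]

# Collapse each row into a tuple, then move NOC into the columns
triples = pd.Series([tuple(r) for r in wide.to_numpy().tolist()],
                    index=wide.index)
result = triples.unstack('NOC')
print(result)
```

Because the grid is filled with zeros before the tuples are built, every cell of result holds a triple and no NaN needs to be replaced afterwards.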
