Using a groupby object and resample in pandas

I would like to use groupby and resample on a DataFrame to get a yearly count of a field. Say I have a DataFrame structured like this:

import pandas as pd

df = pd.DataFrame({'year': {0: '2017', 1: '2018', 2: '2016', 3: '2018'}, 'month': {0: '1', 1: '2', 2: '3', 3: '4'}, 'day': {0: '1', 1: '1', 2: '1', 3: '3'}})
df['Date'] = pd.to_datetime(df)  # assemble dates from the year/month/day columns
# Sorry, there is probably an easier way to set up the df
df['B'] = [1, 2, 3, 1]
df['C'] = [2, 3, 4, 1]
df = df.loc[:, ['Date', 'B', 'C']]  # .loc replaces .ix, which was removed in pandas 1.0

df.groupby('B').resample('A', on='Date')

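How can I get the last line of code to group by column B and still resample by year, month, and so on? Ultimately, I am looking for a yearly count of C grouped by B. If possible, I would like to keep my index in the process. Thanks.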

Source

2017-10-20 Tyler Russell

You can group by column B and by df['Date'].dt.year:

df.groupby([df['Date'].dt.year, 'B']).C.count().reset_index() 

   Date  B  C
0  2016  3  1
1  2017  1  1
2  2018  1  1
3  2018  2  1
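The question also asks to keep the original index, which `count()` followed by `reset_index()` does not do. A minimal sketch, assuming "keeping the index" means one count per original row: `transform('count')` returns a result aligned with the input rows. The column name `C_count` is hypothetical and not part of the original answer.

```python
import pandas as pd

# Same data as in the question, built directly from date strings.
df = pd.DataFrame({
    "Date": pd.to_datetime(["2017-01-01", "2018-02-01", "2016-03-01", "2018-04-03"]),
    "B": [1, 2, 3, 1],
    "C": [2, 3, 4, 1],
})

# transform("count") broadcasts each group's count back onto its rows,
# so the result keeps the original index 0..3.
# "C_count" is a hypothetical column name chosen for this sketch.
df["C_count"] = df.groupby([df["Date"].dt.year, "B"])["C"].transform("count")
print(df)
```

Every (year, B) pair occurs once in this sample, so each row gets a count of 1.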

Option 2: use pd.Grouper

df.groupby([pd.Grouper(key = 'Date', freq='A'), 'B']).C.count().reset_index() 

        Date  B  C
0 2016-12-31  3  1
1 2017-12-31  1  1
2 2018-12-31  1  1
3 2018-12-31  2  1
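The question mentions grouping by month as well as by year. A minimal sketch of a monthly count, assuming a period label per month is acceptable: it groups on `df['Date'].dt.to_period('M')` instead of `dt.year`. This variant is not part of the original answer.

```python
import pandas as pd

# Same data as in the question, built directly from date strings.
df = pd.DataFrame({
    "Date": pd.to_datetime(["2017-01-01", "2018-02-01", "2016-03-01", "2018-04-03"]),
    "B": [1, 2, 3, 1],
    "C": [2, 3, 4, 1],
})

# Group by calendar month (as a Period) and by B, then count C.
monthly = (
    df.groupby([df["Date"].dt.to_period("M"), "B"])["C"]
      .count()
      .reset_index()
)
print(monthly)
```

Each row of `monthly` holds a monthly period in `Date`, the group key `B`, and the number of non-null `C` values in that month.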

Edit: here is a roundabout way of using resample with groupby, though I don't see why one would use it:

df.set_index('Date').groupby('B').resample('A').C.count().reset_index()
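A quick check, as a sketch, that the pd.Grouper version and the groupby-then-resample version produce the same counts on the sample data. It passes a `pd.offsets.YearEnd()` object as the frequency instead of the `'A'` string, because recent pandas releases deprecate the `'A'` alias in favor of `'YE'`. That substitution is an assumption made here for portability and is not part of the original answer.

```python
import pandas as pd

# Same data as in the question, built directly from date strings.
df = pd.DataFrame({
    "Date": pd.to_datetime(["2017-01-01", "2018-02-01", "2016-03-01", "2018-04-03"]),
    "B": [1, 2, 3, 1],
    "C": [2, 3, 4, 1],
})

# Year-end frequency as an offset object (stands in for the 'A' alias).
year_end = pd.offsets.YearEnd()

# Grouper version: result is indexed by (Date, B).
by_grouper = df.groupby([pd.Grouper(key="Date", freq=year_end), "B"])["C"].count()

# Resample version: result is indexed by (B, Date), so swap the levels to compare.
by_resample = df.set_index("Date").groupby("B").resample(year_end)["C"].count()
by_resample = by_resample.swaplevel().sort_index()

# Compare as plain {(Date, B): count} mappings.
same = by_grouper.to_dict() == by_resample.to_dict()
print(same)
```

The two results differ only in index level order. In general, resample can also emit empty bins between a group's first and last date, which the Grouper version does not, so the results need not match on other data.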

Source

2017-10-20 19:01:09 Vaishali

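Fair point. Just so I know, is there no way to use the pd.resample function? Thanks. –

@TylerRussell, see the edit for using resample with groupby – Vaishali

It is useful to see both approaches side by side. Thanks for your help. –

You can use resample, but it is not recommended: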

df.groupby('B').apply(lambda x : x.resample('A', on='Date').C.count()) 
Out[761]:
B  Date
1  2017-12-31    1
   2018-12-31    1
2  2018-12-31    1
3  2016-12-31    1
Name: C, dtype: int64

Source

2017-10-20 19:14:04 Wen

Check my edit, which does it without apply :) – Vaishali

@Vaishali nice solution, upvoted – Wen
