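I want to create a DataFrame with a custom sort order. To do this I have used pandas.Categorical().
However, when I then use the result in a groupby, NaN values are returned.
Why doesn't pandas allow a categorical column in a groupby?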
# import the pandas module
import pandas as pd
# Create an example dataframe
raw_data = {
    'Date': ['2016-05-13'] * 17,
    'Portfolio': ['A', 'A', 'A', 'A', 'A', 'A', 'B', 'B', 'B', 'B', 'B', 'C', 'C', 'C', 'C', 'C', 'C'],
    'Duration': [1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 3, 3, 3, 3, 3, 3],
    'Yield': [0.3, 0.3, 0.3, 0.3, 0.3, 0.3, 2, 2, 2, 2, 2, 1, 1, 1, 1, 1, 1],
}
df = pd.DataFrame(raw_data, columns=['Date', 'Portfolio', 'Duration', 'Yield'])
# Make Portfolio categorical so that it sorts in the custom order C, B, A
df['Portfolio'] = pd.Categorical(df['Portfolio'], categories=['C', 'B', 'A'])
df = df.sort_values('Portfolio')
# Group by date and portfolio; this is the call that returns NaN
dfs = df.groupby(['Date', 'Portfolio'], as_index=False).sum()
print(dfs)
                      Date Portfolio  Duration  Yield
Date       Portfolio
13/05/2016 C           NaN       NaN       NaN    NaN
           B           NaN       NaN       NaN    NaN
           A           NaN       NaN       NaN    NaN
Why is this, and how can I work around it?
Separately, regarding the SettingWithCopyWarning: is there a better idiom for Categorical?
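This looks like a bug involving the combination with the other 'Date' column and the use of 'as_index=False' (grouping by Portfolio alone, or grouping without as_index=False, both work). Would you like to report it at https://github.com/pydata/pandas/issues? – joris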
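
Following the comment's observation that grouping without as_index=False works, one possible workaround is to leave the group keys in the index and move them back to columns with reset_index(). The sketch below is only an illustration, not a confirmed fix: it assumes a pandas version that supports the observed= argument of groupby, and it swaps the pd.Categorical() assignment for astype() with a CategoricalDtype as one alternative idiom. Whether that avoids a SettingWithCopyWarning depends on how the frame was created.

```python
import pandas as pd
from pandas.api.types import CategoricalDtype

# Same data as in the question, written more compactly.
df = pd.DataFrame({
    'Date': ['2016-05-13'] * 17,
    'Portfolio': ['A'] * 6 + ['B'] * 5 + ['C'] * 6,
    'Duration': [1] * 6 + [2] * 5 + [3] * 6,
    'Yield': [0.3] * 6 + [2.0] * 5 + [1.0] * 6,
})

# Assumption: an ordered dtype applied via astype() stands in for pd.Categorical().
order = CategoricalDtype(categories=['C', 'B', 'A'], ordered=True)
df['Portfolio'] = df['Portfolio'].astype(order)

# Group with the keys in the index (no as_index=False), then restore them as columns.
# observed=True keeps only the category combinations that occur in the data.
dfs = (
    df.groupby(['Date', 'Portfolio'], observed=True)
      .sum()
      .reset_index()
)
print(dfs)
```

Because groupby sorts categorical keys by category order, the rows come back as C, B, A without a separate sort_values() call.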