I think you need `cut` for binning, combined with `crosstab`:
print (pd.cut(df['Value'], bins=[0, 5, 10], include_lowest=True))
0 [0, 5]
1 [0, 5]
2 (5, 10]
3 [0, 5]
4 [0, 5]
5 [0, 5]
Name: Value, dtype: category
Categories (2, object): [[0, 5] < (5, 10]]
df['rng'] = pd.cut(df['Value'], bins=[0, 5, 10],
labels=['range1','range2'], include_lowest=True)
df['State'] = df['rng'].astype(str) + '_' + df['State']
print (df)
    Name          State  Value     rng
0  nameA  range1_state1      1  range1
1  nameA  range1_state2      5  range1
2  nameA  range2_state1      9  range2
3  nameA  range1_state1      2  range1
4  nameB  range1_state2      3  range1
5  nameB  range1_state1      1  range1
df = pd.crosstab(df.Name, df.State)
print (df)
State  range1_state1  range1_state2  range2_state1
Name
nameA              2              1              1
nameB              1              1              0
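For anyone who wants to run the steps above end to end, here is a minimal self-contained sketch. The input frame is an assumption: it is reconstructed from the printed output above (the original `State` values are taken to be `state1`/`state2`), since the question's data is not shown here. It also keeps the original `State` column intact instead of overwriting it, and the variable names `rng`, `key` and `counts` are just illustrative.

```python
import pandas as pd

# Assumed input, reconstructed from the printed output in this answer.
df = pd.DataFrame({
    'Name': ['nameA', 'nameA', 'nameA', 'nameA', 'nameB', 'nameB'],
    'State': ['state1', 'state2', 'state1', 'state1', 'state2', 'state1'],
    'Value': [1, 5, 9, 2, 3, 1],
})

# Bin Value into two labelled ranges: [0, 5] and (5, 10].
rng = pd.cut(df['Value'], bins=[0, 5, 10],
             labels=['range1', 'range2'], include_lowest=True)

# Build the combined "range_state" key without modifying df.
key = (rng.astype(str) + '_' + df['State']).rename('State')

# Count occurrences of each key per Name.
counts = pd.crosstab(df['Name'], key)
print(counts)
```

If you would rather keep the range and the state as separate levels, passing `[rng, df['State']]` as the second argument to `pd.crosstab` should give MultiIndex columns instead of concatenated strings.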
EDIT:
You can check which bin each value falls into in this example:
df1 = pd.DataFrame({'Value':np.arange(11)})
df1['bins'] = pd.cut(df1['Value'], bins=[0, 5, 10], include_lowest=True)
print (df1)
    Value     bins
0       0   [0, 5]
1       1   [0, 5]
2       2   [0, 5]
3       3   [0, 5]
4       4   [0, 5]
5       5   [0, 5]
6       6  (5, 10]
7       7  (5, 10]
8       8  (5, 10]
9       9  (5, 10]
10     10  (5, 10]
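A small sketch of why `include_lowest=True` matters here, using the same values 0 to 10. By default the bins are open on the left, so a value equal to the first edge (0) falls outside every bin and becomes `NaN`. The names `values`, `default_bins` and `closed_bins` are only for illustration.

```python
import numpy as np
import pandas as pd

values = pd.Series(np.arange(11), name='Value')

# Default: intervals are (0, 5] and (5, 10], so 0 is not in any bin.
default_bins = pd.cut(values, bins=[0, 5, 10])

# include_lowest=True closes the first interval on the left, so 0 is kept.
closed_bins = pd.cut(values, bins=[0, 5, 10], include_lowest=True)

# Number of values left unbinned in each case.
print(default_bins.isna().sum())
print(closed_bins.isna().sum())

# How many values landed in each bin (by bin position 0 and 1).
print(closed_bins.cat.codes.value_counts().sort_index())
```

Note that 5 goes into the first bin and 6 into the second, because the right edge of each interval is inclusive.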
[Welcome](http://stackoverflow.com/tour) to Stack Overflow. Have you tried any code? Including it makes the question easier to answer. See [How to ask a good question](http://stackoverflow.com/help/how-to-ask) – Irfan