2016-07-30

Pandas idxmax() not working as expected

Suppose I have the following DataFrame Q_df:

 (0, 0) (0, 1) (0, 2) (1, 0) (1, 1) (1, 2) (2, 0) (2, 1) (2, 2) 
(0, 0) 0.000 0.00  0.0 0.64 0.000  0.0 0.512 0.000  0.0 
(0, 1) 0.000 0.00  0.8 0.00 0.512  0.0 0.000 0.512  0.0 
(0, 2) 0.000 0.64  0.0 0.00 0.000  0.8 0.000 0.000  1.0 
(1, 0) 0.512 0.00  0.0 0.00 0.000  0.8 0.512 0.000  0.0 
(1, 1) 0.000 0.64  0.0 0.00 0.000  0.0 0.000 0.512  0.0 
(1, 2) 0.000 0.00  0.8 0.64 0.000  0.0 0.000 0.000  1.0 
(2, 0) 0.512 0.00  0.0 0.64 0.000  0.0 0.000 0.512  0.0 
(2, 1) 0.000 0.64  0.0 0.00 0.512  0.0 0.512 0.000  0.0 
(2, 2) 0.000 0.00  0.8 0.00 0.000  0.8 0.000 0.000  0.0 

It was generated with the following code:

import itertools 

import numpy as np 
import pandas as pd 

states = list(itertools.product(range(3), repeat=2)) 

Q = np.array([[0.000,0.000,0.000,0.640,0.000,0.000,0.512,0.000,0.000], 
[0.000,0.000,0.800,0.000,0.512,0.000,0.000,0.512,0.000], 
[0.000,0.640,0.000,0.000,0.000,0.800,0.000,0.000,1.000], 
[0.512,0.000,0.000,0.000,0.000,0.800,0.512,0.000,0.000], 
[0.000,0.640,0.000,0.000,0.000,0.000,0.000,0.512,0.000], 
[0.000,0.000,0.800,0.640,0.000,0.000,0.000,0.000,1.000], 
[0.512,0.000,0.000,0.640,0.000,0.000,0.000,0.512,0.000], 
[0.000,0.640,0.000,0.000,0.512,0.000,0.512,0.000,0.000], 
[0.000,0.000,0.800,0.000,0.000,0.800,0.000,0.000,0.000]]) 

Q_df = pd.DataFrame(index=states, columns=states, data=Q) 

For each row of Q, I want to get the name of the column that holds the row's maximum value. If I try

policy = Q_df.idxmax() 

then the resulting Series looks like this:

(0, 0) (1, 0) 
(0, 1) (0, 2) 
(0, 2) (0, 1) 
(1, 0) (0, 0) 
(1, 1) (0, 1) 
(1, 2) (0, 2) 
(2, 0) (0, 0) 
(2, 1) (0, 1) 
(2, 2) (0, 2) 

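The first row looks right: the largest element of the first row is 0.64, which occurs in column (1, 0). The same goes for the second. For the third row, however, the largest element is 1.0, which occurs in column (2, 2), so I would expect the corresponding value in policy to be (2, 2), not (0, 1).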

Any idea what is going wrong here?

Answer

IIUC, you can pass axis=1 to idxmax:

policy = Q_df.idxmax(axis=1) 

(0, 0) (1, 0) 
(0, 1) (0, 2) 
(0, 2) (2, 2) 
(1, 0) (1, 2) 
(1, 1) (0, 1) 
(1, 2) (2, 2) 
(2, 0) (1, 0) 
(2, 1) (0, 1) 
(2, 2) (0, 2) 
dtype: object
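
To make the difference between the two calls easier to see, here is a minimal sketch on a small made-up frame. The frame `df` and its labels `r0`, `r1`, `a`, `b`, `c` are hypothetical and only for illustration; they are not the Q_df from the question. With the default axis=0, idxmax scans each column and returns the row label of that column's maximum, which is why the original call appeared to give wrong answers. With axis=1 it scans each row and returns the column label.

```python
import pandas as pd

# Hypothetical 2x3 frame for illustration only (not the Q_df from the question).
df = pd.DataFrame(
    [[0.1, 0.9, 0.3],
     [0.8, 0.2, 0.4]],
    index=["r0", "r1"],
    columns=["a", "b", "c"],
)

# Default axis=0: one entry per column, holding the row label of that column's maximum.
per_column = df.idxmax()

# axis=1: one entry per row, holding the column label of that row's maximum.
per_row = df.idxmax(axis=1)

print(per_column.to_dict())  # {'a': 'r1', 'b': 'r0', 'c': 'r1'}
print(per_row.to_dict())     # {'r0': 'b', 'r1': 'a'}
```

When several entries tie for the maximum, idxmax returns the label of the first occurrence, so rows such as (2, 2) in Q_df, where 0.8 appears twice, report the earlier column.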