Pandas "Can only compare identically-labeled DataFrame objects" error

I'm using Pandas to compare the outputs of two files loaded into two data frames (uat, prod):

uat = uat[['Customer Number', 'Product']]
prod = prod[['Customer Number', 'Product']]
print(uat['Customer Number'] == prod['Customer Number'])
print(uat['Product'] == prod['Product'])
print(uat == prod)

The first two match exactly: 
74357 True 
74356 True 
Name: Customer Number, dtype: bool 
74357 True 
74356 True 
Name: Product, dtype: bool

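For the third print, I get an error: Can only compare identically-labeled DataFrame objects. If the first two compared fine, what's wrong with the third?

Thanks

Source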

2013-08-31 user1804633

Here's a small example to demonstrate this (which only applied to DataFrames, not Series, until Pandas 0.19, where it applies to both):

In [1]: df1 = pd.DataFrame([[1, 2], [3, 4]]) 

In [2]: df2 = pd.DataFrame([[3, 4], [1, 2]], index=[1, 0]) 

In [3]: df1 == df2 
Exception: Can only compare identically-labeled DataFrame objects
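Before comparing, it can help to confirm that the labels really are the cause. The following is a minimal sketch using the same two frames; it assumes a recent pandas version, where the error is raised as a ValueError, and relies on Index.equals, which is sensitive to label order:

```python
import pandas as pd

df1 = pd.DataFrame([[1, 2], [3, 4]])
df2 = pd.DataFrame([[3, 4], [1, 2]], index=[1, 0])

# Index.equals checks both the labels and their order.
same_index = df1.index.equals(df2.index)
same_columns = df1.columns.equals(df2.columns)
print("same index:", same_index)
print("same columns:", same_columns)

# Assumption: recent pandas raises ValueError here
# (older releases raised a plain Exception).
try:
    df1 == df2
    raised = False
except ValueError as exc:
    raised = True
    print("comparison failed:", exc)
```

Here the columns match but the row labels are in a different order, which is enough to trigger the error.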

One solution is to sort the index first (note: some functions require sorted indexes):

In [4]: df2.sort_index(inplace=True) 

In [5]: df1 == df2 
Out[5]: 
     0  1 
0 True True 
1 True True

Note: == is also sensitive to the order of columns, so you may have to use sort_index(axis=1):

In [11]: df1.sort_index().sort_index(axis=1) == df2.sort_index().sort_index(axis=1) 
Out[11]: 
     0  1 
0 True True 
1 True True

Note: this can still raise (if the index/columns aren't identically labelled after sorting).
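One way to guard against that remaining case is to check the label sets before comparing. This is only a sketch: compare_aligned is a hypothetical helper (not part of pandas), and it assumes the row and column labels are unique so that reindex_like can reorder them:

```python
import pandas as pd

def compare_aligned(left, right):
    """Hypothetical helper: element-wise == after reordering `right`
    to match the label order of `left`. Assumes unique labels."""
    extra_rows = left.index.symmetric_difference(right.index)
    extra_cols = left.columns.symmetric_difference(right.columns)
    if len(extra_rows) or len(extra_cols):
        # The frames do not share the same labels, so sorting cannot help.
        raise ValueError(
            "labels differ: rows %s, columns %s"
            % (extra_rows.tolist(), extra_cols.tolist())
        )
    return left == right.reindex_like(left)

df1 = pd.DataFrame([[1, 2], [3, 4]])
df2 = pd.DataFrame([[3, 4], [1, 2]], index=[1, 0])
result = compare_aligned(df1, df2)
print(result)
```

If the label sets differ, the helper reports which labels are unmatched instead of surfacing the generic comparison error.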

Source

2013-08-31 13:53:28

The sort link is broken. Please fix it.

@ShreyashSSarnayak Thanks, removed the reference to 'sort', which has now been removed from pandas (replaced by 'sort_index' only)!

You can also try dropping the index column, if it isn't needed for the comparison:

print(df1.reset_index(drop=True) == df2.reset_index(drop=True))
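One caveat worth illustrating: resetting the index compares rows by position, not by label, so it only gives the expected answer when both frames are already in the same row order. A small sketch with the two frames from the earlier answer:

```python
import pandas as pd

df1 = pd.DataFrame([[1, 2], [3, 4]])
df2 = pd.DataFrame([[3, 4], [1, 2]], index=[1, 0])

# Positional comparison: row 0 of df1 against the first row of df2, and so on.
by_position = df1.reset_index(drop=True) == df2.reset_index(drop=True)

# Label-based comparison: rows are matched up by their index labels.
by_label = df1 == df2.sort_index()

print(by_position)
print(by_label)
```

The two frames hold the same data under the same labels, but the positional comparison reports mismatches because the rows are stored in a different order.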

I use the same technique in unit tests, like this:

from pandas.testing import assert_frame_equal  # pandas.util.testing is deprecated

assert_frame_equal(actual.reset_index(drop=True), expected.reset_index(drop=True))
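If the labels do matter in the test but their order does not, assert_frame_equal also accepts check_like=True, which ignores the order of the index and columns. A minimal sketch; the frames and their values below are made up for illustration:

```python
import pandas as pd
from pandas.testing import assert_frame_equal

# Illustrative data only: same labelled values, different row and column order.
actual = pd.DataFrame(
    {"Customer Number": [74357, 74356], "Product": ["A", "B"]},
    index=[1, 0],
)
expected = pd.DataFrame(
    {"Product": ["B", "A"], "Customer Number": [74356, 74357]},
    index=[0, 1],
)

# Passes: rows and columns are matched by label, order is ignored.
assert_frame_equal(actual, expected, check_like=True)

# The default (order-sensitive) check fails for the same pair.
try:
    assert_frame_equal(actual, expected)
    strict_passed = True
except AssertionError:
    strict_passed = False
print("strict check passed:", strict_passed)
```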

Source

2016-04-28 21:57:54 CoreDump
