Pandas DataFrame (selecting rows)

I created a DataFrame by reading a CSV file. Printing it gives:

<class 'pandas.core.frame.DataFrame'> 
    Int64Index: 176 entries, 0 to 175 
    Data columns (total 8 columns): 
    ID   176 non-null values 
    study   176 non-null values 
    center  176 non-null values 
    initials  176 non-null values 
    age   147 non-null values 
    sex   133 non-null values 
    lesion age 35 non-null values 
    group   35 non-null values 
    dtypes: float64(2), int64(1), object(5)

Why do I get an error when I try to select rows from the DataFrame according to certain conditions?

SUBJECTS[SUBJECTS.study=='NO2' and SUBJECTS.center=='Hermann']

Error message:

ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()

Thank you very much in advance.

Source

2014-06-15 Hello lad

Use:

SUBJECTS[(SUBJECTS.study=='NO2') & (SUBJECTS.center=='Hermann')]

The and causes Python to evaluate SUBJECTS.study=='NO2' and SUBJECTS.center=='Hermann' in a boolean context (that is, as either True or False).

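In your case, you do not want either expression evaluated as a single boolean. Instead, you want the element-wise logical and. That is specified with & rather than and.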
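To make the difference concrete, here is a minimal sketch. The frame below is a hypothetical stand-in for SUBJECTS with made-up values (the real CSV is not shown in the question); only the column names study and center come from the original.

```python
import pandas as pd

# Hypothetical stand-in for the SUBJECTS frame; the values are invented.
SUBJECTS = pd.DataFrame({
    "study": ["NO2", "NO2", "XY1", "NO2"],
    "center": ["Hermann", "Other", "Hermann", "Hermann"],
})

# `and` needs a single truth value for the left-hand Series,
# which pandas refuses to provide, so this raises ValueError.
try:
    SUBJECTS[SUBJECTS.study == "NO2" and SUBJECTS.center == "Hermann"]
    and_failed = False
except ValueError:
    and_failed = True

# `&` combines the two boolean Series element by element.
mask = (SUBJECTS.study == "NO2") & (SUBJECTS.center == "Hermann")
selected = SUBJECTS[mask]

print(and_failed)
print(selected)
```

Only the rows where both conditions hold survive the selection.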

The error

ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()

is raised whenever you try to evaluate a NumPy array or a pandas NDFrame in a boolean context. Consider

bool(np.array([True, False]))

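Some users might expect this to return True because the array is non-empty. Others might expect True because at least one element of the array is True. Still others might expect False because not all elements of the array are True. Since there are several equally valid expectations for what the boolean context should return, the designers of NumPy and pandas decided to force the user to be explicit: use .all() or .any() or len().

Source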
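As a small sketch of those three explicit choices, using the same two-element array as above (the variable names are only illustrative):

```python
import numpy as np

arr = np.array([True, False])

# Asking for the truth value of the whole array is ambiguous and raises.
try:
    bool(arr)
    ambiguous = False
except ValueError:
    ambiguous = True

# Each explicit alternative answers a different question.
any_true = bool(arr.any())   # is at least one element True?
all_true = bool(arr.all())   # are all elements True?
non_empty = len(arr) > 0     # does the array contain any elements?

print(ambiguous, any_true, all_true, non_empty)
```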

2014-06-15 21:22:13 unutbu

Welcome to SO. The error comes from how numpy works underneath pandas. Consider these examples:

In [158]: 
a=np.array([1,2,1,1,1,1,2]) 
b=np.array([1,1,1,2,2,2,1]) 

In [159]: 
#Array Boolean operation 
a==1 
Out[159]: 
array([ True, False, True, True, True, True, False], dtype=bool) 

In [160]: 
#Array Boolean operation 
b==1 
Out[160]: 
array([ True, True, True, False, False, False, True], dtype=bool) 

In [161]: 
#and is not an array Boolean operation 
(a==1) and (b==1) 
--------------------------------------------------------------------------- 
ValueError        Traceback (most recent call last) 
<ipython-input-161-271ddf20f621> in <module>() 
----> 1 (a==1) and (b==1) 

ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all() 

In [162]: 
#But & operates on arrays 
(a==1) & (b==1) 
Out[162]: 
array([ True, False, True, False, False, False, False], dtype=bool) 

In [163]: 
#Or * 
(a==1) * (b==1) 
Out[163]: 
array([ True, False, True, False, False, False, False], dtype=bool) 

In [164]: 
df=pd.DataFrame({'a':a, 'b':b}) 
In [166]: 
#Therefore this is a good approach 
df[(df.a==1) & (df.b==1)] 
Out[166]: 
a b 
0 1 1 
2 1 1 
2 rows × 2 columns 

In [167]: 
#This will also get you there, but it is not preferred. 
df[df.a==1][df.b==1] 
C:\Anaconda\lib\site-packages\pandas\core\frame.py:1686: UserWarning: Boolean Series key will be reindexed to match DataFrame index. 
    "DataFrame index.", UserWarning) 
Out[167]: 
a b 
0 1 1 
2 1 1 
2 rows × 2 columns
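One detail the session above relies on without spelling out: each comparison is wrapped in parentheses. In Python, & binds more tightly than ==, so dropping the parentheses changes what is evaluated. The following is a minimal sketch using the same a and b arrays; the variable names with_parens and unparenthesized_failed are only illustrative.

```python
import numpy as np
import pandas as pd

a = np.array([1, 2, 1, 1, 1, 1, 2])
b = np.array([1, 1, 1, 2, 2, 2, 1])
df = pd.DataFrame({"a": a, "b": b})

# Parenthesized: two boolean Series combined element-wise.
with_parens = df[(df.a == 1) & (df.b == 1)]

# Unparenthesized: parsed as df.a == (1 & df.b) == 1, a chained
# comparison that implies `and`, so the ambiguity error comes back.
try:
    df[df.a == 1 & df.b == 1]
    unparenthesized_failed = False
except ValueError:
    unparenthesized_failed = True

print(with_parens)
print(unparenthesized_failed)
```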

Source

2014-06-15 21:29:38

Thank you very much, CT Zhu. I read through all of your code and it helped me understand a lot :) @CT Zhu
