2017-01-21
import pandas as pd
dfa = pd.DataFrame({'a':[1,2,3,4],'b':[4,5,7,6]})

Expected output when filtering column a for "has any" of a set of values in the DataFrame:

a b 
0 1 4 
1 2 5 

I can do this as follows:

>>> dfa[(dfa.a == 1) | (dfa.a == 2)] 
    a b 
0 1 4 
1 2 5 

However, this does not really scale, because I want to do something like this:

?? dfa[(dfa.a has-any range(5,50)) 

I'm not sure I understand. Is my answer correct? – jezrael

Answers

1

I think you need boolean indexing with isin and np.arange or range:

import numpy as np

print (np.arange(5,51))
[ 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 
30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50] 

print (dfa[dfa.a.isin(np.arange(5,51))]) 

Or:

print (dfa[dfa.a.isin(range(5,51))]) 

A solution with between (both endpoints are included by default):

print (dfa[dfa['a'].between(5, 50)]) 

Sample (one value changed to 8):

dfa = pd.DataFrame({'a':[1,2,3,8],'b':[4,5,7,6]}) 
print (dfa) 
    a b 
0 1 4 
1 2 5 
2 3 7 
3 8 6 

print (dfa[dfa.a.isin(np.arange(5,51))]) 
    a b 
3 8 6 

print (dfa[dfa.a.isin(range(5,51))]) 
    a b 
3 8 6 

print (dfa[dfa['a'].between(5, 50)]) 
    a b 
3 8 6 
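
The three variants above agree on integer data, but isin and between are not interchangeable in general: isin tests membership in the listed values, while between tests a closed interval. Here is a minimal sketch of the difference, assuming a hypothetical frame `df` (not from the question) whose column `a` holds fractional values:

```python
import pandas as pd

# Hypothetical data for illustration: column 'a' mixes whole and fractional values.
df = pd.DataFrame({'a': [4.0, 5.0, 7.5, 50.0, 51.0], 'b': [1, 2, 3, 4, 5]})

# isin keeps only rows whose value equals one of the integers 5..50.
by_isin = df[df['a'].isin(range(5, 51))]

# between keeps every value in the closed interval [5, 50], so 7.5 is included.
by_between = df[df['a'].between(5, 50)]

print(by_isin.index.tolist())
print(by_between.index.tolist())
```

If the column only ever holds integers, either form should work; between avoids materialising the list of candidate values.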
1

This will also do it:

import pandas as pd 
dfa = pd.DataFrame({'a':[1,2,3,4],'b':[4,5,7,6]}) 
print(dfa['a'].between(5, 50).any())
#False 
print(dfa['b'].between(5, 50).any())
#True 
print(((5 <= dfa) & (dfa <= 50)).any())  # all columns together
#a False 
#b  True 
#dtype: bool
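
Note that .any() reduces each column to a single boolean, so it answers whether a match exists rather than returning the matching rows. If the rows themselves are wanted, one possible sketch (assuming the goal is to keep rows where at least one column falls in the range, which the question does not state) is to reduce across columns with axis=1 and use the result as a mask:

```python
import pandas as pd

dfa = pd.DataFrame({'a': [1, 2, 3, 4], 'b': [4, 5, 7, 6]})

# Element-wise test over the whole frame: True where 5 <= value <= 50.
in_range = (5 <= dfa) & (dfa <= 50)

# any(axis=1) gives one boolean per row: True if any column matched.
row_mask = in_range.any(axis=1)

# Boolean indexing keeps only the rows with at least one match.
matching_rows = dfa[row_mask]
print(matching_rows)
```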