2017-05-19 47 views
1

In pandas, how can a DataFrame be filtered by conditions on the values of several specified columns?

I have a simple dataset of the following form:

import pandas as pd

df = pd.DataFrame(
    [
        ["Norway", 7.537, 0.039, 11, 31],
        ["Denmark", 7.522, -0.004, 9, 12],
        ["Switzerland", 7.494, None, 15, 50],
        ["Finland", 7.469, None, None, 29],
        ["Netherlands", 7.377, 1, None, 77],
    ],
    columns=[
        "country",
        "score A",
        "score B",
        "score C",
        "score D",
    ],
)

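How can I filter this dataset so that a condition is placed on the values of several columns at once? Say I want to exclude every row (every country) in which both score B and score C are null. That would exclude only the Finland row.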

When I try the following, every row with a null in either of those columns is excluded, so only the Norway and Denmark rows remain:

df[(df["score B"].notnull()) & (df["score C"].notnull())] 

How can this be done?

Answers

1

You need:

df[~(df['score B'].isnull() & df['score C'].isnull())] 

       country  score A  score B  score C  score D
0       Norway    7.537    0.039     11.0       31
1      Denmark    7.522   -0.004      9.0       12
2  Switzerland    7.494      NaN     15.0       50
4  Netherlands    7.377    1.000      NaN       77
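
As a possible alternative to building the boolean mask by hand, the same rule ("drop a row only when every listed column is null") can be sketched with `DataFrame.dropna`, using its `subset` and `how="all"` parameters. The variable name `filtered` is only an illustration; the data is the frame from the question.

```python
import pandas as pd

df = pd.DataFrame(
    [
        ["Norway", 7.537, 0.039, 11, 31],
        ["Denmark", 7.522, -0.004, 9, 12],
        ["Switzerland", 7.494, None, 15, 50],
        ["Finland", 7.469, None, None, 29],
        ["Netherlands", 7.377, 1, None, 77],
    ],
    columns=["country", "score A", "score B", "score C", "score D"],
)

# how="all" drops a row only if every column named in `subset` is null;
# the default how="any" would drop it as soon as one of them is null.
filtered = df.dropna(subset=["score B", "score C"], how="all")

print(filtered["country"].tolist())
# ['Norway', 'Denmark', 'Switzerland', 'Netherlands']
```

With the default `how="any"`, the same call would behave like the `&` attempt in the question and keep only Norway and Denmark.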
1

How about specifying "or" instead:

df[(df["score B"].notnull()) | (df["score C"].notnull())] 

Output:

       country  score A  score B  score C  score D
0       Norway    7.537    0.039     11.0       31
1      Denmark    7.522   -0.004      9.0       12
2  Switzerland    7.494      NaN     15.0       50
4  Netherlands    7.377    1.000      NaN       77

Is that right? All you want is to exclude the rows where both values are null (or have I misunderstood the question)?
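
To check that reading, here is a minimal sketch showing that "not (B is null and C is null)" and "B is not null or C is not null" select the same rows (De Morgan's law), and how the test might be extended to a longer list of columns. The names `cols`, `mask_not_both_null`, `mask_either_present` and `mask_any` are illustrative, not from the original posts.

```python
import pandas as pd

df = pd.DataFrame(
    [
        ["Norway", 7.537, 0.039, 11, 31],
        ["Denmark", 7.522, -0.004, 9, 12],
        ["Switzerland", 7.494, None, 15, 50],
        ["Finland", 7.469, None, None, 29],
        ["Netherlands", 7.377, 1, None, 77],
    ],
    columns=["country", "score A", "score B", "score C", "score D"],
)

cols = ["score B", "score C"]  # columns the condition applies to

# Negated "both null" mask.
mask_not_both_null = ~(df["score B"].isnull() & df["score C"].isnull())

# "At least one present" mask.
mask_either_present = df["score B"].notnull() | df["score C"].notnull()

# Generalised form: True where any of the listed columns is non-null.
mask_any = df[cols].notnull().any(axis=1)

# All three masks pick the same rows.
assert mask_not_both_null.equals(mask_either_present)
assert mask_any.equals(mask_either_present)

print(df[mask_any]["country"].tolist())
# ['Norway', 'Denmark', 'Switzerland', 'Netherlands']
```

The `any(axis=1)` form avoids chaining `|` by hand when more than two columns are involved.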
