2017-09-25 128 views

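Excluding records from a pandas DataFrame based on column values

I am trying to filter my pandas DataFrame in two ways, selected by an if/else. Only the if branch works. The if branch keeps (flags) every record in my DataFrame that satisfies either of two conditions (logic1 | logic2). See section 4 for the output I am after.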

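In the else branch, I want to exclude every id that has been flagged (logic1 | logic2), without creating an extra list or iterating over each record. Is there a way to filter out all of these records without storing the ids in an extra list?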

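I would like to use only filtering, if possible. At the moment I get the output shown in section 3. It is wrong because id = 2 has been flagged but is still included in the output. The output I need is shown in section 4.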

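Code

def filter_potatoes(potatoes, flagged):  # hypothetical wrapper; the original snippet omitted its enclosing function
    # Row-level conditions that mark a record as flagged.
    logic1 = (potatoes['Desc'] == 'Bla2') & (potatoes['Value'] == True) & (potatoes['Enabled'] == True)
    logic2 = (potatoes['Desc'].isin(['Bla8', 'Bla9'])) & (potatoes['Active'] == True) & (potatoes['Enabled'] == True)

    if flagged:
        # Keep only the rows that satisfy either condition.
        potatoes_flagged = potatoes[logic1 | logic2]
        return potatoes_flagged
    else:
        # Drops only the flagged rows themselves, not the other rows that share their id.
        potatoes_not_flagged = potatoes[~logic1 & ~logic2]
        return potatoes_not_flagged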

1. Input (potatoes)

id | Desc | Active | Enabled | Value | [A LOT OF OTHER COLUMNS] 
1 | Bla1 | 1  | 0  | 1  | [A LOT OF OTHER COLUMNS] 
2 | Bla2 | 1  | 1  | 1  | [A LOT OF OTHER COLUMNS] 
2 | Bla3 | 1  | 1  | 0  | [A LOT OF OTHER COLUMNS] 
2 | Bla4 | 0  | 0  | 0  | [A LOT OF OTHER COLUMNS] 
2 | Bla5 | 0  | 0  | 0  | [A LOT OF OTHER COLUMNS] 
3 | Bla6 | 1  | 1  | 0  | [A LOT OF OTHER COLUMNS] 
4 | Bla7 | 0  | 0  | 1  | [A LOT OF OTHER COLUMNS] 

2. Output for flagged (if) (CORRECT)

id | Desc | Active | Enabled | Value | [A LOT OF OTHER COLUMNS] 
2 | Bla2 | 1  | 1  | 1  | [A LOT OF OTHER COLUMNS] 

3. Output for not flagged (else) (WRONG)

id | Desc | Active | Enabled | Value | [A LOT OF OTHER COLUMNS] 
1 | Bla1 | 1  | 0  | 1  | [A LOT OF OTHER COLUMNS] 
2 | Bla3 | 1  | 1  | 0  | [A LOT OF OTHER COLUMNS] 
2 | Bla4 | 0  | 0  | 0  | [A LOT OF OTHER COLUMNS] 
2 | Bla5 | 0  | 0  | 0  | [A LOT OF OTHER COLUMNS] 
3 | Bla6 | 1  | 1  | 0  | [A LOT OF OTHER COLUMNS] 
4 | Bla7 | 0  | 0  | 1  | [A LOT OF OTHER COLUMNS] 

4. Required output for not flagged (CORRECT)

id | Desc | Active | Enabled | Value | [A LOT OF OTHER COLUMNS] 
1 | Bla1 | 1  | 0  | 1  | [A LOT OF OTHER COLUMNS] 
3 | Bla6 | 1  | 1  | 0  | [A LOT OF OTHER COLUMNS] 
4 | Bla7 | 0  | 0  | 1  | [A LOT OF OTHER COLUMNS] 

I think you need `potatoes[~(logic1 & logic2)]`, but is your expected output wrong? –


Because, according to your data, id = 2 is never flagged here. –


I don't understand why `id = 2` would never be flagged, and `~logic1 & ~logic2` seems correct. – orangetacos
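The disagreement in these comments can be checked against the sample input. The sketch below is only an illustration: it rebuilds the five columns shown in section 1 (the other columns are omitted, and the 0/1 values are assumed to be plain integers) and compares the three masks discussed. On this data, `~logic1 & ~logic2` appears to be the same row-level mask as `~(logic1 | logic2)`, while `~(logic1 & logic2)` is a different mask that keeps every row. In all cases the filter works per row, so the remaining id = 2 rows survive.

```python
import pandas as pd

# Sample input from section 1 (only the columns shown there; 0/1 assumed to be integers).
potatoes = pd.DataFrame({
    "id":      [1, 2, 2, 2, 2, 3, 4],
    "Desc":    ["Bla1", "Bla2", "Bla3", "Bla4", "Bla5", "Bla6", "Bla7"],
    "Active":  [1, 1, 1, 0, 0, 1, 0],
    "Enabled": [0, 1, 1, 0, 0, 1, 0],
    "Value":   [1, 1, 0, 0, 0, 0, 1],
})

# Same conditions as in the question.
logic1 = (potatoes["Desc"] == "Bla2") & (potatoes["Value"] == True) & (potatoes["Enabled"] == True)
logic2 = (potatoes["Desc"].isin(["Bla8", "Bla9"])) & (potatoes["Active"] == True) & (potatoes["Enabled"] == True)

mask_question = ~logic1 & ~logic2      # mask used in the else branch
mask_de_morgan = ~(logic1 | logic2)    # equivalent by De Morgan's law
mask_comment = ~(logic1 & logic2)      # mask suggested in the first comment

# ids that remain after the row-level filter from the question
remaining_ids = potatoes.loc[mask_question, "id"].tolist()
print(remaining_ids)  # [1, 2, 2, 2, 3, 4]
```

Only the Bla2 row is removed, which matches the output in section 3 rather than the one in section 4.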

Answer

It looks like you want all rows whose id does not appear among the rows returned by potatoes[logic1 | logic2]. You can do that with an inverted isin call.

idx_flagged = potatoes.loc[logic1 | logic2, 'id'].values 
potatoes[~potatoes.id.isin(idx_flagged)] 

   id  Desc  Active  Enabled  Value
0   1  Bla1       1        0      1
5   3  Bla6       1        1      0
6   4  Bla7       0        0      1
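For anyone who wants to reproduce this end to end, here is a minimal sketch that folds the inverted `isin` idea back into the if/else from the question. The function name `split_potatoes` is hypothetical, and the DataFrame contains only the five columns shown in the question, with the 0/1 values assumed to be integers.

```python
import pandas as pd

def split_potatoes(potatoes: pd.DataFrame, flagged: bool) -> pd.DataFrame:
    """Hypothetical helper: return flagged rows, or rows whose id was never flagged."""
    logic1 = (potatoes["Desc"] == "Bla2") & (potatoes["Value"] == True) & (potatoes["Enabled"] == True)
    logic2 = (potatoes["Desc"].isin(["Bla8", "Bla9"])) & (potatoes["Active"] == True) & (potatoes["Enabled"] == True)
    row_flag = logic1 | logic2

    if flagged:
        # Rows that satisfy either condition.
        return potatoes[row_flag]

    # Collect the ids of flagged rows, then drop every row carrying one of them.
    flagged_ids = potatoes.loc[row_flag, "id"].unique()
    return potatoes[~potatoes["id"].isin(flagged_ids)]

# Sample input from the question (other columns omitted).
potatoes = pd.DataFrame({
    "id":      [1, 2, 2, 2, 2, 3, 4],
    "Desc":    ["Bla1", "Bla2", "Bla3", "Bla4", "Bla5", "Bla6", "Bla7"],
    "Active":  [1, 1, 1, 0, 0, 1, 0],
    "Enabled": [0, 1, 1, 0, 0, 1, 0],
    "Value":   [1, 1, 0, 0, 0, 0, 1],
})

flagged_rows = split_potatoes(potatoes, flagged=True)
not_flagged_rows = split_potatoes(potatoes, flagged=False)
print(not_flagged_rows)
```

The `unique()` call is optional here; it just keeps the lookup array small when many rows share a flagged id.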

Thank you for your answer. This answers my question. – orangetacos
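Since the question asked to avoid storing the ids separately, one possible variation (not part of the accepted answer, just a sketch) is to broadcast the row-level flag to every row with the same id using `groupby(...).transform("any")`. The variable names `row_flag` and `id_flag` are made up for illustration, and the sample data again assumes integer 0/1 values and only the columns shown in the question.

```python
import pandas as pd

# Sample input from the question (other columns omitted).
potatoes = pd.DataFrame({
    "id":      [1, 2, 2, 2, 2, 3, 4],
    "Desc":    ["Bla1", "Bla2", "Bla3", "Bla4", "Bla5", "Bla6", "Bla7"],
    "Active":  [1, 1, 1, 0, 0, 1, 0],
    "Enabled": [0, 1, 1, 0, 0, 1, 0],
    "Value":   [1, 1, 0, 0, 0, 0, 1],
})

logic1 = (potatoes["Desc"] == "Bla2") & (potatoes["Value"] == True) & (potatoes["Enabled"] == True)
logic2 = (potatoes["Desc"].isin(["Bla8", "Bla9"])) & (potatoes["Active"] == True) & (potatoes["Enabled"] == True)

# True for the rows that are flagged themselves.
row_flag = logic1 | logic2

# True for every row whose id has at least one flagged row.
id_flag = row_flag.groupby(potatoes["id"]).transform("any")

potatoes_not_flagged = potatoes[~id_flag]
print(potatoes_not_flagged)
```

This stays a pure boolean-mask filter, at the cost of a groupby over the id column; whether it is faster than the `isin` version on a large frame would need measuring.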