2014-03-04

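Exclude pandas rows based on a string value

I have a pandas table that contains a column with a string data type. I need to exclude from the data frame any rows that have "Not found" as the string in that column. I am currently doing: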

df[df.some_column != "Not found"]

but it is not working.

Looking forward to a reply.

Sample data:

    card_number  effective_date  expiry_date  grouping_name  Ac. Year code
 0      1206090     28 Sep 2012  21 Aug 2013     Dummy no.1         201213
 1      1206090     21 Feb 2013  21 Aug 2013     Dummy no.2         201213
 2      1206090     28 Sep 2012  30 Nov 2012     Dummy no.3         201213
 3      1206090     03 Dec 2012  21 Aug 2013     Dummy no.3         201213
 4      1206090     23 Apr 2013  31 Aug 2013     Dummy no.4         201213
 5      1206090     28 Sep 2012  21 Aug 2013     Dummy no.5         201213
 6      1206090     28 Sep 2012  21 Aug 2013     Dummy no.6         201213
 7      1206090     24 Oct 2012  07 Aug 2013      Not found         201213
 8      1206090     08 Jan 2013  08 Jan 2013      Not found         201213
 9      1206090     08 Jan 2013  31 Aug 2013      Not found         201213
10    Not found     03 Jul 2013  21 Aug 2013     Dummy no.1         201213
11    Not found     03 Jul 2013  21 Aug 2013     Dummy no.2         201213

Extra note: my string matching must be behaving very strangely... when I run df[grouping_name] != "Not found" I get True for rows 7, 8 and 9. Does anyone know why?


You need to use `str.contains`; see [here](http://pandas.pydata.org/pandas-docs/stable/generated/pandas.core.strings.StringMethods.contains.html#pandas.core.strings.StringMethods.contains); for example `df.some_column.str.contains('Not Found', na=False, regex=False)`.


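TypeError: contains() got an unexpected keyword argument 'regex'... Besides, that only finds the rows with that column value... I want the rows where the column does not have that value.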


Add `~` at the beginning and drop `regex=False`; I think that argument was added in `13.1`.
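As a rough illustration of the `~` plus `str.contains` suggestion, here is a minimal sketch. The three-row frame is a made-up miniature of the sample data, not the asker's real file. Note that `str.contains` is case-sensitive by default, so the pattern is written as "Not found" to match the sample data rather than "Not Found".

```python
import pandas as pd

# Hypothetical miniature of the sample data (an assumption for illustration).
df = pd.DataFrame({
    "card_number": ["1206090", "1206090", "Not found"],
    "grouping_name": ["Dummy no.1", "Not found", "Dummy no.2"],
})

# str.contains gives True where the substring occurs in grouping_name;
# na=False treats missing values as "no match".
# The leading ~ inverts the mask, keeping rows that do NOT contain it.
mask = ~df["grouping_name"].str.contains("Not found", na=False)
filtered = df[mask]
print(filtered)
```

Only `grouping_name` is tested here, so a row whose `card_number` is "Not found" is still kept.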

Answers


Try:

df[df['some_column'] != "Not found"] 

Solution using the sample data:

import pandas as pd

df = pd.read_csv("data.csv")
df

    card_number  effective_date  expiry_date  grouping_name  Ac. Year code
 0      1206090     28 Sep 2012  21 Aug 2013     Dummy no.1         201213
 1      1206090     21 Feb 2013  21 Aug 2013     Dummy no.2         201213
 2      1206090     28 Sep 2012  30 Nov 2012     Dummy no.3         201213
 3      1206090     03 Dec 2012  21 Aug 2013     Dummy no.3         201213
 4      1206090     23 Apr 2013  31 Aug 2013     Dummy no.4         201213
 5      1206090     28 Sep 2012  21 Aug 2013     Dummy no.5         201213
 6      1206090     28 Sep 2012  21 Aug 2013     Dummy no.6         201213
 7      1206090     24 Oct 2012  07 Aug 2013      Not found         201213
 8      1206090     08 Jan 2013  08 Jan 2013      Not found         201213
 9      1206090     08 Jan 2013  31 Aug 2013      Not found         201213
10    Not found     03 Jul 2013  21 Aug 2013     Dummy no.1         201213
11    Not found     03 Jul 2013  21 Aug 2013     Dummy no.2         201213


df[df['grouping_name'] != 'Not found'] 

    card_number  effective_date  expiry_date  grouping_name  Ac. Year code
 0      1206090     28 Sep 2012  21 Aug 2013     Dummy no.1         201213
 1      1206090     21 Feb 2013  21 Aug 2013     Dummy no.2         201213
 2      1206090     28 Sep 2012  30 Nov 2012     Dummy no.3         201213
 3      1206090     03 Dec 2012  21 Aug 2013     Dummy no.3         201213
 4      1206090     23 Apr 2013  31 Aug 2013     Dummy no.4         201213
 5      1206090     28 Sep 2012  21 Aug 2013     Dummy no.5         201213
 6      1206090     28 Sep 2012  21 Aug 2013     Dummy no.6         201213
10    Not found     03 Jul 2013  21 Aug 2013     Dummy no.1         201213
11    Not found     03 Jul 2013  21 Aug 2013     Dummy no.2         201213
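
If an exact comparison like the one above still keeps the "Not found" rows, one possible cause (an assumption; nothing in this thread confirms it) is that the stored strings differ from the literal by stray whitespace or capitalisation. A minimal sketch of how to check for that and filter on a normalised copy, using made-up values:

```python
import pandas as pd

# Hypothetical values: the stray spaces and mixed case are assumptions,
# chosen to show how an exact != comparison can miss near-matches.
df = pd.DataFrame({
    "grouping_name": ["Dummy no.1", " Not found", "Not Found ", "Not found"],
})

# Printing the unique values as a list shows the quotes, which makes
# hidden leading or trailing spaces visible.
print(df["grouping_name"].unique().tolist())

# The exact comparison only drops the row that matches character for character.
naive = df[df["grouping_name"] != "Not found"]

# Strip surrounding whitespace and lower-case before comparing.
normalized = df["grouping_name"].str.strip().str.lower()
cleaned = df[normalized != "not found"]
print(cleaned)
```

The original column is left untouched; only the comparison uses the normalised values.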

No luck.


Can you provide sample data? – Amit


There you go... grouping_name must not have "Not found" in it.