2017-09-16 132 views

How to drop rows from a pandas DataFrame that contain any string in a particular column

I have CSV data in the following format:

+-------------+-------------+-------+
| Location    | Num of Reps | Sales |
+-------------+-------------+-------+
| 75894       |           3 |    12 |
| Burbank     |           2 |    19 |
| 75286       |           7 |    24 |
| Carson City |           4 |    13 |
| 27659       |           3 |    17 |
+-------------+-------------+-------+

The Location column has the object dtype. What I want to do is remove every row whose Location label is non-numeric. Given the table above, my desired output is:

+----------+-------------+-------+
| Location | Num of Reps | Sales |
+----------+-------------+-------+
| 75894    |           3 |    12 |
| 75286    |           7 |    24 |
| 27659    |           3 |    17 |
+----------+-------------+-------+

Now, I could hard-code the solution like this:

list1 = ['Carson City', 'Burbank']
df = df[~df['Location'].isin(list1)]

This was inspired by the following post:

How to drop rows from pandas data frame that contains a particular string in a particular column?

However, what I am looking for is a general solution that will work for any table of the type shown above.

Answers

2

You can use pd.to_numeric to coerce the non-numeric values to NaN, and then filter on whether Location is NaN:

df[pd.to_numeric(df.Location, errors='coerce').notnull()] 

#Location Num of Reps Sales 
#0 75894   3  12 
#2 75286   7  24 
#4 27659   3  17 
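
For anyone who wants to try this without the original CSV, here is a minimal self-contained sketch. The DataFrame is rebuilt by hand from the table in the question, with Location stored as strings (an assumption about how the CSV was read), and the names `numeric_location` and `filtered` are only illustrative.

```python
import pandas as pd

# Rebuild the question's table by hand; Location is kept as strings (object dtype).
df = pd.DataFrame({
    "Location": ["75894", "Burbank", "75286", "Carson City", "27659"],
    "Num of Reps": [3, 2, 7, 4, 3],
    "Sales": [12, 19, 24, 13, 17],
})

# Values that cannot be parsed as numbers become NaN instead of raising.
numeric_location = pd.to_numeric(df["Location"], errors="coerce")

# Keep only the rows where the conversion succeeded.
filtered = df[numeric_location.notnull()]
print(filtered)
```

Note that the Location column in `filtered` still holds the original strings; only the mask is built from the converted values.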
3

Alternatively, you can use:

df[df['Location'].str.isnumeric()] 
 

    Location Num of Reps Sales 
0 75894   3  12 
2 75286   7  24 
4 27659   3  17 
1
In [139]: df[~df.Location.str.contains(r'\D')]
Out[139]: 
    Location Num of Reps Sales 
0 75894   3  12 
2 75286   7  24 
4 27659   3  17 
0
df[df['Location'].str.isdigit()] 


    Location Num of Reps Sales 
0 75894   3  12 
2 75286   7  24 
4 27659   3  17
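
The answers above all give the same result on the question's data, but they are not interchangeable in general. The sketch below runs them on a small made-up Series (the values "12.5" and "-3" are hypothetical and not from the question) to show where they differ: pd.to_numeric accepts anything that parses as a number, while the string-based checks only accept values made up entirely of digits.

```python
import pandas as pd

# Hypothetical labels, chosen to expose the differences between the approaches.
location = pd.Series(["75894", "12.5", "-3", "Carson City"])

# Approach 1: anything parseable as a number counts as numeric.
by_to_numeric = pd.to_numeric(location, errors="coerce").notnull()

# Approach 2: every character must be a digit, so "." and "-" are rejected.
by_isdigit = location.str.isdigit()

# Approach 3: reject any value containing a non-digit character.
by_regex = ~location.str.contains(r"\D")

print(pd.DataFrame({
    "Location": location,
    "to_numeric": by_to_numeric,
    "isdigit": by_isdigit,
    "regex": by_regex,
}))
```

If the labels are ZIP-code-like identifiers, the stricter string checks are probably the closer match; if decimals or negative numbers should also be kept, pd.to_numeric is the more permissive choice.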