Replace multiple values with a single value using Pandas

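I have a list of various values that need to be replaced with a single value (Drive-by). I did my research, but the closest post I could find is the one linked below, which does not use pandas. What is the most practical way to achieve this?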

import pandas as pd

fourth = pd.read_csv('C:/infocentertracker.csv')
fourth = fourth.rename(columns={'Phone Number: ': 'Phone Number:'}) 
fourth['Source:'] = fourth['Source:'].replace('......', 'Drive-by') 

fourth.to_csv(.............) 

Replace all of the following with Drive-by:

Drive By
Drive-By
Drive-by; Return Visitor
Drive/LTX.com/Internes Srch
Driving By/Lantana Website
Drive by
Driving By/Return Visitor
Drive by/Resident Referral
Driving by
Drive- by
Driving by/LTX Website
Driving By
Driving by/Return Visitor
Drive By/Return Visitor
Drive By/LTX Website


2017-02-10 Jake Wagner

Is it safe to assume that only the target values start with "Driv"? – Marat

Yes, that is a safe assumption. –

One option is the pandas approach you asked for:

fourth.loc[fourth['column name with values'].str.contains('driv', case=False, na=False), 'column name with values'] = 'Drive-by'
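
To make the line above concrete, here is a minimal sketch on a made-up frame. The column name `Source:` is taken from the question; the sample rows (including the missing value and the `Website` row) are assumptions for illustration only.

```python
import pandas as pd

# Toy data; these rows are illustrative, not the asker's real file.
fourth = pd.DataFrame(
    {"Source:": ["Drive By", "driving by/LTX Website", None, "Website"]}
)

# Case-insensitive substring test; na=False treats missing values as "no match".
is_drive = fourth["Source:"].str.contains("driv", case=False, na=False)

# Overwrite only the matching rows; everything else is left untouched.
fourth.loc[is_drive, "Source:"] = "Drive-by"
print(fourth)
```

Note that `str.contains` matches anywhere in the string, so a value such as `Test Drive Event` would also be replaced.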

I would rather use a regular expression, which does not necessarily require pandas:

import re

# Builds a plain Python list; assign it back to the column to keep the result
[re.sub(r'Driv.+', 'Drive-by', i) for i in fourth['column name']]
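
If a regular expression is preferred but the result should stay in the DataFrame, pandas has a vectorized counterpart in `Series.str.replace`. The sketch below assumes a column named `Source:` and made-up rows; anchoring the pattern with `^` is an added assumption so that only values beginning with `Driv` are rewritten.

```python
import pandas as pd

# Toy data; the rows are invented for illustration.
fourth = pd.DataFrame(
    {"Source:": ["Driving By/Return Visitor", "Drive- by", "Website", None]}
)

# regex=True treats the pattern as a regular expression.
# ^Driv.* consumes the whole value when it starts with "Driv".
fourth["Source:"] = fourth["Source:"].str.replace(r"^Driv.*", "Drive-by", regex=True)
print(fourth)
```

Unlike the list comprehension with `re.sub`, this version leaves missing values as missing instead of raising an error.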


2017-02-10 14:31:11

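Thanks, I get an error... ValueError: cannot index with vector containing NA / NaN values –

@Pythoner I added an extra argument to str.contains, 'na=False'. These are all native pandas functions. I'm just not sure what your data looks like. –

Thank you very much, A.Kot. –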
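
The ValueError in the comments above typically comes from missing values in the column: the mask then carries a missing entry, and `.loc` cannot index with it. A small sketch of what `na=False` does to the mask, using invented values:

```python
import pandas as pd

# Made-up series with one missing entry.
s = pd.Series(["Drive By", None, "Website"])

# With na=False, missing entries count as non-matches,
# so the mask contains only True/False and is safe to use with .loc.
mask = s.str.contains("driv", case=False, na=False)
print(mask.tolist())  # [True, False, False]
```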

You can use a boolean mask built with str.startswith to replace all values that start with Driv; the idea comes from Marat's comment:

df.loc[df.col.str.startswith('Driv'), 'col'] = 'Drive-by'

Sample:

print(fourth)
                        Source:
0                      Drive By
1                      Drive-By
2      Drive-by; Return Visitor
3   Drive/LTX.com/Internes Srch
4    Driving By/Lantana Website
5                      Drive by
6     Driving By/Return Visitor
7    Drive by/Resident Referral
8                    Driving by
9                     Drive- by
10       Driving by/LTX Website
11                   Driving By
12    Driving by/Return Visitor
13      Drive By/Return Visitor
14         Drive By/LTX Website
15                          aaa

fourth.loc[fourth['Source:'].str.startswith('Driv'), 'Source:'] = 'Drive-by'
print(fourth)
     Source:
0   Drive-by
1   Drive-by
2   Drive-by
3   Drive-by
4   Drive-by
5   Drive-by
6   Drive-by
7   Drive-by
8   Drive-by
9   Drive-by
10  Drive-by
11  Drive-by
12  Drive-by
13  Drive-by
14  Drive-by
15       aaa
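
One difference from `str.contains` is worth seeing on data: `str.startswith` only tests the beginning of each value and is case-sensitive. A sketch with invented rows (the `Resident Referral/Drive By` and `driving by` values are assumptions, not taken from the question):

```python
import pandas as pd

# Made-up values chosen to show where the two tests disagree.
s = pd.Series(["Drive By", "Resident Referral/Drive By", "driving by", None])

# Prefix test, case-sensitive; na=False maps missing values to False.
starts = s.str.startswith("Driv", na=False)

# Substring test, case-insensitive.
contains = s.str.contains("driv", case=False, na=False)

print(starts.tolist())    # [True, False, False, False]
print(contains.tolist())  # [True, True, True, False]
```

Which one fits depends on the data: `startswith` is the stricter test and matches the assumption confirmed in the comments under the question.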

Another solution with Series.mask:

fourth['Source:'] = fourth['Source:'].mask(
    fourth['Source:'].str.startswith('Driv', na=False), 'Drive-by')
print(fourth)
     Source:
0   Drive-by
1   Drive-by
2   Drive-by
3   Drive-by
4   Drive-by
5   Drive-by
6   Drive-by
7   Drive-by
8   Drive-by
9   Drive-by
10  Drive-by
11  Drive-by
12  Drive-by
13  Drive-by
14  Drive-by
15       aaa
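
`Series.mask` replaces values where the condition is True; `Series.where` is its mirror image and keeps values where the condition is True. A short sketch on made-up data, showing that the two forms give the same result:

```python
import pandas as pd

# Toy series; values are illustrative only.
s = pd.Series(["Driving by", "Drive By/LTX Website", "aaa"])
cond = s.str.startswith("Driv", na=False)

masked = s.mask(cond, "Drive-by")   # replace where cond is True
kept = s.where(~cond, "Drive-by")   # keep where cond is False, replace elsewhere

print(masked.equals(kept))
```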


2017-02-10 14:32:32 jezrael

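Thanks, and sorry if this sounds silly: I tried fourth.loc ['driv'), 'Source:'] = 'Drive-by', but it threw an error... 'DataFrame' object has no attribute 'col' –

It is the column name; I changed it to your column name, 'Source:'. – jezrael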
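
The attribute error in the comment above comes from `df.col`, which only works when a column is literally named `col`. A name such as `Source:` contains a colon and is not a valid Python identifier, so it has to be accessed with brackets. A minimal sketch with made-up rows:

```python
import pandas as pd

# Toy frame using the column name from the question.
fourth = pd.DataFrame({"Source:": ["Drive By", "aaa"]})

# Attribute access is not available: there is no column called "col".
print(hasattr(fourth, "col"))

# Bracket access works for any column name, including ones with punctuation.
mask = fourth["Source:"].str.startswith("Driv", na=False)
fourth.loc[mask, "Source:"] = "Drive-by"
print(fourth["Source:"].tolist())
```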
