Replace all words from a word list with another string in Python

I have a string entered by a user, and I want to search it and replace every occurrence of any word from a list with a replacement string.

import re 

prohibitedWords = ["MVGame","Kappa","DatSheffy","DansGame","BrainSlug","SwiftRage","Kreygasm","ArsonNoSexy","GingerPower","Poooound","TooSpicy"] 


# word[1] contains the user entered message 
themessage = str(word[1])  
# would like to implement a foreach loop here but not sure how to do it in python 
for themessage in prohibitedwords: 
    themessage = re.sub(prohibitedWords, "(I'm an idiot)", themessage) 

print themessage

The code above doesn't work, and I'm fairly sure I don't understand how Python for loops work.

Source

2013-03-27 Zac

You should check out the Python spambayes implementation; it may scale better. – dusual 2013-03-27 12:18:01

You can do this with a single call to sub:

big_regex = re.compile('|'.join(map(re.escape, prohibitedWords))) 
the_message = big_regex.sub("repl-string", str(word[1]))

Example:

>>> import re 
>>> prohibitedWords = ['Some', 'Random', 'Words'] 
>>> big_regex = re.compile('|'.join(map(re.escape, prohibitedWords))) 
>>> the_message = big_regex.sub("<replaced>", 'this message contains Some really Random Words') 
>>> the_message 
'this message contains <replaced> really <replaced> <replaced>'
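The single-pass approach above could be wrapped in a small helper. This is only a sketch: the function name `censor` and its parameters are hypothetical and not part of the original answer. The guard for an empty list is there because joining an empty list yields an empty pattern, which matches at every position in the message.

```python
import re

def censor(message, banned_words, replacement):
    """Replace every occurrence of any banned word in a single pass.

    `censor` is a hypothetical helper name, not from the original answer.
    """
    if not banned_words:
        # An empty pattern would match between every character.
        return message
    # re.escape keeps characters such as '.' or '+' from acting as regex syntax.
    pattern = re.compile('|'.join(map(re.escape, banned_words)))
    return pattern.sub(replacement, message)

censored = censor("Kappa that was TooSpicy", ["Kappa", "TooSpicy"], "(I'm an idiot)")
unchanged = censor("nothing to see", [], "(I'm an idiot)")
```

If the same word list is applied to many messages, compiling the pattern once outside the function and reusing it avoids rebuilding it on every call.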

Note that using str.replace can lead to subtle bugs:

>>> words = ['random', 'words'] 
>>> text = 'a sample message with random words' 
>>> for word in words: 
...  text = text.replace(word, 'swords') 
... 
>>> text 
'a sample message with sswords swords'

Whereas using re.sub gives the correct result:

>>> big_regex = re.compile('|'.join(map(re.escape, words))) 
>>> big_regex.sub("swords", 'a sample message with random words') 
'a sample message with swords swords'

As thg435 pointed out, if you want to replace whole words rather than every matching substring, you can add word boundaries to the regex:

big_regex = re.compile(r'\b%s\b' % r'\b|\b'.join(map(re.escape, words)))

This will replace 'random' in 'random words' but not in 'pseudorandom words'.
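A quick way to check the word-boundary behaviour is to run it on a string that contains both cases. The sketch below uses an equivalent pattern that wraps the alternation in one non-capturing group instead of repeating `\b` around each word; the sample sentence is made up for illustration.

```python
import re

words = ['random', 'words']

# Equivalent to \brandom\b|\bwords\b, written with a single group.
whole_word_regex = re.compile(r'\b(?:%s)\b' % '|'.join(map(re.escape, words)))

# 'pseudorandom' is left alone because no word boundary precedes 'random' there.
whole = whole_word_regex.sub('swords', 'pseudorandom words and random words')
```

One caveat: `\b` only matches between a word character and a non-word character, so this approach may not behave as expected for list entries that begin or end with punctuation.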

Source

2013-03-27 12:03:13 Bakuriu

Could you show an example run? – 2013-03-27 12:03:51

However, if you have a lot of words to replace, you will have to break the pattern up. – DSM 2013-03-27 12:15:18

You may want to wrap your expression in '\b' to avoid replacing the "tail" in "retailer". – georg 2013-03-27 12:31:30

Try this:

prohibitedWords = ["MVGame","Kappa","DatSheffy","DansGame","BrainSlug","SwiftRage","Kreygasm","ArsonNoSexy","GingerPower","Poooound","TooSpicy"] 

themessage = str(word[1])  
for word in prohibitedWords: 
    themessage = themessage.replace(word, "(I'm an idiot)") 

print themessage

Source

2013-03-27 12:00:03

This is fragile: as Bakuriu explains, it breaks easily when one prohibited word is a substring of another. – Adam 2013-03-27 12:19:51
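The fragility described in this comment can be reproduced with a short sketch. The word 'KappaPride' and the sample message are hypothetical, chosen only because one entry is a prefix of the other. The sketch also shows one possible mitigation that is not from the thread: Python's regex alternation tries alternatives left to right, so sorting the list longest-first lets the longer word win.

```python
import re

banned = ['Kappa', 'KappaPride']  # hypothetical list; 'Kappa' is a prefix of 'KappaPride'
message = 'KappaPride wins'

# Looping with str.replace: 'Kappa' is replaced first, leaving 'Pride' behind.
looped = message
for banned_word in banned:
    looped = looped.replace(banned_word, '<x>')

# Single regex pass with the longest alternatives listed first.
longest_first = sorted(banned, key=len, reverse=True)
pattern = re.compile('|'.join(map(re.escape, longest_first)))
single_pass = pattern.sub('<x>', message)
```

Here `looped` ends up with a stray 'Pride' after the replacement, while `single_pass` replaces the whole word.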

@codesparkle That doesn't mean it's wrong; which option you choose always depends on your conditions. – 2013-03-27 12:25:48

Code:

prohibitedWords =["MVGame","Kappa","DatSheffy","DansGame", 
        "BrainSlug","SwiftRage","Kreygasm", 
        "ArsonNoSexy","GingerPower","Poooound","TooSpicy"] 
themessage = 'Brain' 
self_criticism = '(I`m an idiot)' 
final_message = [i.replace(themessage, self_criticism) for i in prohibitedWords] 
print final_message

Result:

['MVGame', 'Kappa', 'DatSheffy', 'DansGame', '(I`m an idiot)Slug', 'SwiftRage', 
'Kreygasm', 'ArsonNoSexy', 'GingerPower', 'Poooound','TooSpicy']

Source

2013-03-27 12:45:30 zen11625
