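Regex to replace some dots with commas in customer comments
I need to write a regex to replace '.' with ',' in some patients' comments about drugs. They should have used a comma after each side effect they mention, but some of them used dots. For example: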
text = "the drug side-effects are: night mare. nausea. night sweat. bad dream. dizziness. severe headache. I suffered. she suffered. she told I should change it."
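I wrote regex code to detect one word (e.g. headache) or two words (e.g. bad dream) surrounded by two dots.
Detecting one word surrounded by two dots:
text = re.sub(r'(\.)(\s*\w+\s*\.)', r',\2 ', text)
Detecting two words surrounded by two dots: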
text = re.sub(r'(\.)(\s*\w+\s\w+\s*\.)', r',\2 ', text)
This is the output:
the drug side-effects are: night mare, nausea, night sweat. bad dream, dizziness, severe headache. I suffered, she suffered. she told I should change it.
But it should be:
the drug side-effects are: night mare, nausea, night sweat, bad dream, dizziness, severe headache. I suffered. she suffered. she told I should change it.
My code did not replace the dot after "night sweat" with ','. Also, if a sentence starts with a subject pronoun (such as I or she), I do not want to change the dot after it to a comma, even if it has only two words (such as "I suffered"). I don't know how to add this condition to my code.
Any suggestions? Thanks!
See https://regex101.com/r/awW1Hc/1, is this what you are trying to achieve? You will have to hard-code the pronouns, there is no way around that.
@Sebastian Proske, thank you! Works perfectly! – Mary
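For readers who cannot open the link, here is a minimal sketch of one way to do this in Python. It is not necessarily the pattern behind the regex101 link. The likely reason the dot after "night sweat" survives in the question is that re.sub only replaces non-overlapping matches: the closing dot of one match is consumed and cannot serve as the opening dot of the next. Putting the "one or two words, then a dot" part inside a lookahead avoids consuming it. The PRONOUNS tuple and the dots_to_commas name are assumptions made for illustration; the pronoun list is hard-coded, as the comment above suggests, and probably incomplete.

```python
import re

# Hypothetical hard-coded pronoun list (an assumption, extend as needed).
PRONOUNS = ("I", "you", "he", "she", "it", "we", "they")

# Match a dot only when it is followed by one or two words and another dot,
# and the first of those words is not a pronoun. Everything after the dot
# sits inside a lookahead, so the closing dot is not consumed and can be
# the opening dot of the next match.
DOT_BEFORE_SHORT_ITEM = re.compile(
    r"\.(?=\s*(?!(?:%s)\b)\w+(?:\s\w+)?\s*\.)" % "|".join(PRONOUNS),
    re.IGNORECASE,
)


def dots_to_commas(text):
    """Replace the qualifying dots with commas; leave all other dots alone."""
    return DOT_BEFORE_SHORT_ITEM.sub(",", text)


text = ("the drug side-effects are: night mare. nausea. night sweat. "
        "bad dream. dizziness. severe headache. I suffered. she suffered. "
        "she told I should change it.")
print(dots_to_commas(text))
```

With the sample text from the question this should leave the dot after "severe headache" in place, because the next sentence starts with "I", and it should leave the longer last sentence untouched as well. Items of three or more words would not be converted by this sketch.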