Remove words/lines between matching delimiters

How can I remove the lines between start and end, including the lines that contain those matching words?

line1 
line2 
start 
line3 
line4 
line5 
line6 
end 
line7 
line8

The result I want is:

line1 
line2 
line7 
line8

I tried the code below, but it does not seem to work.

import re

text = "line1\nline2\nstart\nline3\nline4\nline5\nline6\nend\nline7\nline8"
print(re.sub(r'start(.*)end', '', text))

Source

2013-10-05 sundar_ima

You will have to use the re.DOTALL modifier so that (.*) also matches newlines:

re.sub(r'start(.*)end', '', text, flags=re.DOTALL)

Next, I think it is safer to use the lazy (.*?) in case you run into input like this:

line1\nstart\nline2\nline3\nend\nline4\nline5\nstart\nline6\nend\nline7

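Without the lazy (.*?), the match removes everything from the first start to the last end, including the part that is not between a start and its own end: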

re.sub(r'start.*?end', '', text, flags=re.DOTALL)

Finally, I removed the parentheses, since they are not actually needed here.

If you also want to remove any whitespace left behind after end, use \s* to trim it:

re.sub(r'start.*?end\s*', '', text, flags=re.DOTALL)
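Putting the pieces above together, here is a minimal sketch of the final pattern wrapped in a helper. The function name remove_between and its parameters are hypothetical, chosen only for illustration, and the re.escape calls are an added assumption so that delimiters containing regex metacharacters are treated literally.

```python
import re

def remove_between(text, start="start", end="end"):
    """Remove every start...end block, plus any whitespace right after it.

    Hypothetical helper: the name and parameters are illustrative only.
    """
    # re.escape makes the delimiters literal; .*? is lazy so each block
    # stops at the nearest closing delimiter; \s* trims what follows it.
    pattern = re.escape(start) + r".*?" + re.escape(end) + r"\s*"
    return re.sub(pattern, "", text, flags=re.DOTALL)

text = "line1\nline2\nstart\nline3\nline4\nline5\nline6\nend\nline7\nline8"
print(remove_between(text))
```

On the question's input this should print line1, line2, line7 and line8 with no blank line in between, because \s* consumes the newline that follows end.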

Source

2013-10-05 12:27:39 Jerry

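+1 for the lazy quantifier. –

Works correctly. Jerry, you saved me again. :-) –

@sundar_ima You're welcome :) Consider [accepting](http://stackoverflow.com/help/accepted-answer) the answer that you think worked for you, so that your question is marked as solved. – Jerry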

By default, the . character does not match newlines. You need to set the re.DOTALL flag to enable that.

>>> import re
>>> text = "line1\nline2\nstart\nline3\nline4\nline5\nline6\nend\nline7\nline8"
>>> print(re.sub(r'start(.*)end', '', text, flags=re.DOTALL))
line1 
line2 

line7 
line8

Note that there is an empty line in between; you need to include the newline after end in the pattern too:

>>> print(re.sub(r'start(.*)end\n', '', text, flags=re.DOTALL))
line1 
line2 
line7 
line8

As an alternative to ., you can also combine two opposite character classes:

>>> print(re.sub(r'start([\s\S]*)end\n', '', text))
line1 
line2 
line7 
line8

Here \s and \S together match every character, including newlines, without setting the DOTALL flag.
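As a quick check of that claim, the sketch below compares the two spellings on the question's input and also shows what happens when neither is used. The variable names are illustrative only.

```python
import re

text = "line1\nline2\nstart\nline3\nline4\nline5\nline6\nend\nline7\nline8"

# . with DOTALL and [\s\S] without it should remove the same span.
with_flag = re.sub(r'start.*end\n', '', text, flags=re.DOTALL)
with_class = re.sub(r'start[\s\S]*end\n', '', text)

# Without DOTALL, . stops at the newline after "start", so nothing matches.
no_flag = re.sub(r'start.*end\n', '', text)

print(with_flag == with_class)  # True
print(no_flag == text)          # True
```

The [\s\S] form can be handy when only part of a pattern should cross line boundaries, since DOTALL changes the meaning of every . in the expression.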

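You probably want to make your match non-greedy. If your input contains several sets of start and end lines, then .* will match all the text from the first start all the way to the last end: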

>>> text = 'line1\nstart\nline2\nend\nline3\nstart\nline4\nend\nline5'
>>> print(text)
line1 
start 
line2 
end 
line3 
start 
line4 
end 
line5 
>>> print(re.sub(r'start(.*)end\n', '', text, flags=re.DOTALL))
line1 
line5

Note how line3 is gone. Add a question mark after the * to make it non-greedy:

>>> print(re.sub(r'start(.*?)end\n', '', text, flags=re.DOTALL))
line1 
line3 
line5
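One edge case worth keeping in mind: a pattern that requires \n after end will skip a block whose end is the very last line of the input with no trailing newline. A small sketch of this, using a made-up input and an optional \n? (my own adjustment, not part of the answer above):

```python
import re

# Hypothetical input: the final "end" has no newline after it.
text = 'line1\nstart\nline2\nend\nline3\nstart\nline4\nend'

# Requiring \n leaves the last block in place...
strict = re.sub(r'start.*?end\n', '', text, flags=re.DOTALL)
# ...while an optional newline removes it as well.
lenient = re.sub(r'start.*?end\n?', '', text, flags=re.DOTALL)

print(repr(strict))   # 'line1\nline3\nstart\nline4\nend'
print(repr(lenient))  # 'line1\nline3\n'
```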

Source

2013-10-05 12:24:58
