2014-02-20

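Regular expression to return all characters between two strings

How can I design a regular expression that captures all the characters between two strings? Specifically, from this large string: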

Studies have shown that...[^title=Fish consumption and incidence of stroke: a meta-analysis of cohort studies]... Another experiment demonstrated that... [^title=The second title]

I want to extract all the characters between [^title= and ], that is, "Fish consumption and incidence of stroke: a meta-analysis of cohort studies" and "The second title".

I think I will have to use re.findall(), and I can start with re.findall(r'\[([^]]*)\]', big_string), which gives me every match between square brackets [ ], but I don't know how to extend it.

Answer

>>> import re
>>> text = "Studies have shown that...[^title=Fish consumption and incidence of stroke: a meta-analysis of cohort studies]... Another experiment demonstrated that... [^title=The second title]"
>>> re.findall(r"\[\^title=(.*?)\]", text)
['Fish consumption and incidence of stroke: a meta-analysis of cohort studies', 'The second title']

Here is a breakdown of the regular expression:

\[ is an escaped [ character.

\^ is an escaped ^ character.

title= matches the literal text title=

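(.*?) matches any characters (except a newline, by default), non-greedily, and puts them in a group, which is what findall returns. Non-greedy means the match stops as soon as it reaches the next ].

\] is an escaped ] character.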
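To see why the non-greedy quantifier matters, here is a rough sketch comparing three variants of the pattern on the same text: the non-greedy one from above, a greedy one, and one that uses a negated character class in the spirit of the attempt in the question. The helper find_between is hypothetical and not part of the re module; it is only an illustration of applying the same idea to arbitrary delimiters, and it assumes the delimiters should be matched literally.

```python
import re

text = (
    "Studies have shown that...[^title=Fish consumption and incidence of "
    "stroke: a meta-analysis of cohort studies]... Another experiment "
    "demonstrated that... [^title=The second title]"
)

# Non-greedy: each match ends at the first ] after [^title=
lazy = re.findall(r"\[\^title=(.*?)\]", text)

# Greedy: .* runs to the last ] in the string, so both titles
# (and everything between them) end up in a single match
greedy = re.findall(r"\[\^title=(.*)\]", text)

# Negated character class: "anything that is not ]", which also
# stops at the first ] without needing a non-greedy quantifier
negated = re.findall(r"\[\^title=([^\]]*)\]", text)


def find_between(start, end, string):
    """Hypothetical helper: return every substring found between the
    literal delimiters start and end, using a non-greedy match."""
    pattern = re.escape(start) + r"(.*?)" + re.escape(end)
    return re.findall(pattern, string)


titles = find_between("[^title=", "]", text)
print(lazy)
print(len(greedy))
print(negated == lazy, titles == lazy)
```

The negated character class and the non-greedy version give the same result on this input; they would differ if a title could span several lines, since [^\]] matches newlines while . does not unless re.DOTALL is passed.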
