Regex pattern match returns None when it should not

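I am learning regular expressions and Beautiful Soup, and I am working through Google's regular expression tutorial. I am using the HTML file provided on the Google tutorial site (set up as described in the Set Up section of the exercise tutorial).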

The code is as follows:

from bs4 import BeautifulSoup as bs  # assumed import; the question does not show it
with open(filepath, "r") as f:
    soup = bs(f, 'lxml')
soup.title

Output:

<title>Popular Baby Names</title>

Code:

# find_all() captures the <h3> tags (only one exists, and it contains the year)
h3 = soup.find_all("h3")

h3[0].get_text()

Output:

u'Popularity in 1990'

Code:

pattern = re.compile(r'.+(\d\d\d\d).+') 
string = h3[0].get_text() 
pattern.match(string).group(0)

Output:

AttributeError       Traceback (most recent call last) 
<ipython-input-61-2e4daef3292c> in <module>() 
----> 1 pattern.match(string).group(0) 

AttributeError: 'NoneType' object has no attribute 'group'

I cannot explain why match() fails to capture the year, when it should.

Any suggestions would be greatly appreciated.

Source

2017-01-10 gk7

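Your string ends with '1990', so the trailing '.+' has nothing left to match.

As the other comment says, your regex does not match. You can test it here: https://regex101.com/r/d2NjKz/1 – ti7

Possible duplicate of [Python: Extract numbers from a string](http://stackoverflow.com/questions/4289331/python-extract-numbers-from-a-string) – ti7

Because the pattern expects at least one character after the year. Try .* instead of .+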

Source

2017-01-10 21:24:44 palako
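
To make the difference between the two quantifiers concrete, here is a minimal sketch that runs both patterns against the heading text from the question. The variable names (`text`, `strict`, `relaxed`) are made up for illustration and do not appear in the thread. Note also that group(0) is the whole match, while group(1) is the captured year.

```python
import re

text = "Popularity in 1990"  # the heading text from the question

# Original pattern: the trailing .+ needs at least one character after the year.
strict = re.compile(r'.+(\d\d\d\d).+')
# Relaxed pattern: the trailing .* also accepts zero characters.
relaxed = re.compile(r'.+(\d\d\d\d).*')

strict_match = strict.match(text)    # None, because nothing follows the year
relaxed_match = relaxed.match(text)  # a match object

print(strict_match)            # None
print(relaxed_match.group(0))  # Popularity in 1990
print(relaxed_match.group(1))  # 1990
```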

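Why does it need to match '.*' at all?

'*' matches zero or more of the preceding character, so no further characters are required to get a match. – ti7

It doesn't. I assume he wrote it that way because he may want something after the year, but + requires at least one character. Zero or more is *. – palako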
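
If nothing before or after the year is needed, the wildcards can probably be dropped altogether. The sketch below is an alternative that is not from the thread: it uses re.search(), which scans the whole string, and checks for None before calling group(). The names `found` and `year` are illustrative only.

```python
import re

text = "Popularity in 1990"  # the heading text from the question

# search() scans the whole string, so no leading or trailing wildcards are needed.
found = re.search(r'\d{4}', text)

# Check for None before calling group(), which avoids the AttributeError.
year = found.group(0) if found else None
print(year)  # 1990
```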
