preg_match_all gives me only the first 159 of 261 possible results

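I hope someone knows what is wrong. I wrote a parser to get all the

<a href="blabla">Link</a>

tags. I am testing it on http://www.bbc.co.uk/. There are 261 of them on the page I tested, but I only get 159. I checked manually and found every one of them, yet my result array has only 159 elements. What is the reason for this limit?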

preg_match_all('/<a\s[^\>]*href\=[\'"]?((?:http\:\/\/)?(?:[_\-a-zA-Z0-9\.]*[_a-zA-Z0-9\.\/]))*[\'"]/', $page, $matches);

I checked, and cURL gives me the whole page, from

<html>

to

</html>

The task is to build the parser without using any DOM, only cURL and regular expressions.
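A minimal sketch of that setup, assuming the cURL extension is available: fetch the page as a string, then count opening `<a>` tags that carry an `href` using a deliberately loose pattern, which gives a baseline to compare a stricter pattern against. The function names `fetch_page` and `count_links` are hypothetical, and the loose pattern is an assumption for cross-checking, not the pattern from the question.

```php
<?php
// Hypothetical helper: download a page body as a string using cURL.
function fetch_page(string $url): string
{
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true); // return the body instead of printing it
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true); // follow redirects
    $body = curl_exec($ch);
    if ($body === false) {
        throw new RuntimeException('cURL failed: ' . curl_error($ch));
    }
    return $body;
}

// Hypothetical helper: count opening <a> tags that have an href attribute.
function count_links(string $html): int
{
    // Loose on purpose: any attributes before href, optional whitespace before "=".
    $count = preg_match_all('/<a\s[^>]*\bhref\s*=/i', $html);
    if ($count === false) {
        // preg_match_all returns false on failure; preg_last_error() gives the code.
        throw new RuntimeException('PCRE error code: ' . preg_last_error());
    }
    return $count;
}
```

For example, `count_links(fetch_page('http://www.bbc.co.uk/'))` gives a number to compare with `count($matches[0])`. If the stricter pattern returns fewer matches, that would suggest the pattern itself is rejecting some tags.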

2013-10-31 Bandydan

Which tags? What are you trying to match? –

Have you read this: http://stackoverflow.com/questions/1732348/regex-match-open-tags-except-xhtml-self-contained-tags ? – sectus

Please include more details, such as the tags on your page. –

OK, I managed to solve this by adding a few characters to my regular expression:

preg_match_all('/<a\s*[^\>]*href\s*\=\s*[\'"]?((?:http\:\/\/)?(?:[_\-a-zA-Z0-9\.]*[\?\=\&_a-zA-Z0-9\.\/]))*[\'"]/', $page, $matches);

I added some optional whitespace, and allowed characters such as "=", "&" and "?" in the body of the link.
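The effect of those additions can be illustrated on a small sample. The sketch below uses a simplified pattern, not the one from the post: it assumes every `href` value is quoted and captures everything up to the matching closing quote, so characters such as `?`, `=` and `&` need no special handling. The sample markup and variable names are made up for illustration.

```php
<?php
// Made-up sample markup, not taken from the page in the question.
$page = '<a href="/news/">News</a>'
      . '<a class="x" href = "/search?q=php&page=2">Search</a>'
      . "<a href='http://example.com/a_b-c.html'>External</a>";

// Simplified pattern (an assumption): optional whitespace around "=",
// a required opening quote, then a lazy capture up to the same quote (\1).
$pattern = '/<a\s[^>]*?href\s*=\s*([\'"])(.*?)\1/is';

$count = preg_match_all($pattern, $page, $matches);

// $matches[2] holds the captured href values, one per matched tag.
print_r($matches[2]);
```

Unquoted `href` values are deliberately out of scope in this sketch.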

2013-10-31 13:52:44 Bandydan

Answers