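You can use PHP's DOM module for this. Read the page with DOMDocument::loadHTMLFile(), then create a DOMXPath object and query the document for all span elements that have the attribute class="page-numbers".
(edit: oops, that's not what you're looking for; see the second code snippet)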
$html = '<html><head><title>:::</title></head><body>
<div class="pager">
<span class="page-numbers current">1</span>
<a href="/users?page=2" title="go to page 2"><span class="page-numbers">2</span></a>
<a href="/users?page=3" title="go to page 3"><span class="page-numbers">3</span></a>
<a href="/users?page=4" title="go to page 4"><span class="page-numbers">4</span></a>
<a href="/users?page=5" title="go to page 5"><span class="page-numbers">5</span></a>
<span class="page-numbers dots">…</span>
<a href="/users?page=15" title="go to page 15"><span class="page-numbers">15</span></a>
<a href="/users?page=2" title="go to page 2"><span class="page-numbers next"> next</span></a>
</div>
</body></html>';
$doc = new DOMDocument;
// since the content "is already here" we use loadhtml(content)
// instead of loadhtmlfile(url)
$doc->loadhtml($html);
$xpath = new DOMXPath($doc);
$nodelist = $xpath->query('//span[@class="page-numbers"]');
echo 'there are ', $nodelist->length, ' span elements having class="page-numbers"';
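One detail worth noting about the query above: `@class="page-numbers"` is an exact string comparison, so spans such as `class="page-numbers current"` or `class="page-numbers next"` are not counted. If those should be included, a common XPath 1.0 idiom is to match the class as a whitespace-separated token. The following is only a sketch on a trimmed, hypothetical copy of the pager markup:

```php
<?php
// Hypothetical sample markup, trimmed from the pager shown above.
$html = '<html><body><div class="pager">
<span class="page-numbers current">1</span>
<a href="/users?page=2"><span class="page-numbers">2</span></a>
<a href="/users?page=15"><span class="page-numbers">15</span></a>
<a href="/users?page=2"><span class="page-numbers next"> next</span></a>
</div></body></html>';

$doc = new DOMDocument;
$doc->loadHTML($html);
$xpath = new DOMXPath($doc);

// Exact comparison: matches only spans whose class attribute is exactly "page-numbers".
$exact = $xpath->query('//span[@class="page-numbers"]');

// Token match: pad the normalized class list with spaces, then look for " page-numbers ".
$token = $xpath->query(
    '//span[contains(concat(" ", normalize-space(@class), " "), " page-numbers ")]'
);

echo 'exact: ', $exact->length, ', token: ', $token->length, "\n";
```

Which of the two is appropriate depends on whether the "current", "dots" and "next" spans are wanted in the result.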
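edit: Does the
<a href="/users?page=15" title="go to page 15"><span class="page-numbers">15</span></a>
(the second-to-last a element) always point to the last page, i.e. does this link contain the value you're looking for?
If so, you can use an XPath expression to select the second-to-last a element and, from there, its child span element.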
//div[@class="pager"] <- select each <div> where the attribute class equals "pager"
//div[@class="pager"]/a <- select each <a> that is a direct child of the pager div
//div[@class="pager"]/a[position()=last()-1] <- select the <a> that is second to last
//div[@class="pager"]/a[position()=last()-1]/span <- select the direct child <span> of that second-to-last <a> element in the pager <div>
(you might want to get hold of a good XPath tutorial ;-))
$doc->loadhtml($html);
$xpath = new DOMXPath($doc);
$nodelist = $xpath->query('//div[@class="pager"]/a[position()=last()-1]/span');
if (0 < $nodelist->length) {
echo $nodelist->item(0)->nodeValue;
}
else {
echo 'not found';
}
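The snippet above relies on the second-to-last link always being the last page. If that assumption turns out not to hold, one possible alternative is to look at every span inside the pager links and keep the largest purely numeric value. This is a sketch only; `highestPageNumber()` is a hypothetical helper name and the sample markup is a shortened stand-in for the real page:

```php
<?php
// Hypothetical helper: returns the largest numeric page label found in the pager,
// or null when the pager contains no numeric link at all.
function highestPageNumber(string $html): ?int
{
    $doc = new DOMDocument;
    $doc->loadHTML($html);
    $xpath = new DOMXPath($doc);

    $max = null;
    foreach ($xpath->query('//div[@class="pager"]/a/span') as $span) {
        $text = trim($span->nodeValue);
        // Skip labels such as "next" that are not plain digits.
        if (ctype_digit($text)) {
            $max = ($max === null) ? (int) $text : max($max, (int) $text);
        }
    }
    return $max;
}

// Shortened, hypothetical sample of the pager markup.
$sample = '<html><body><div class="pager">
<span class="page-numbers current">1</span>
<a href="/users?page=2"><span class="page-numbers">2</span></a>
<a href="/users?page=3"><span class="page-numbers">3</span></a>
<span class="page-numbers dots">...</span>
<a href="/users?page=15"><span class="page-numbers">15</span></a>
<a href="/users?page=2"><span class="page-numbers next"> next</span></a>
</div></body></html>';

$last = highestPageNumber($sample);
echo $last === null ? 'not found' : $last, "\n";
```

This trades the positional assumption for a different one, namely that the highest linked number is the last page.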
Awesome, thanks. I'm looking forward to trying it – 2009-10-20 14:47:04
Hi, I tried it but it returns zero: function getusers($userurl) { $doc = new DOMDocument; $doc->loadhtml($userurl); $xpath = new DOMXPath($doc); $nodelist = $xpath->query('//span[@class="page-numbers"]'); print_r($nodelist); echo 'there are ', $nodelist->length, ' span elements having class="page-numbers"'; } The URL is http://ask.recipelabs.com/users – 2009-10-20 19:27:33
If you pass a URL you need loadHTMLFile(), not loadHTML(). – VolkerK 2009-10-20 19:35:45
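To illustrate the difference raised in the last comment: loadHTML() parses the string it is given as markup, while loadHTMLFile() reads the markup from a path or URL. A minimal sketch follows; `countPageNumberSpans()` is a hypothetical name, and the demo uses a temporary local file so that it runs without network access. Passing an http:// URL instead is assumed to require that URL stream wrappers are enabled (`allow_url_fopen`).

```php
<?php
// Hypothetical helper: loads markup from a file path or URL and counts the matching spans.
function countPageNumberSpans(string $location): int
{
    $doc = new DOMDocument;

    // Collect parser warnings internally instead of printing them;
    // real-world pages are rarely perfectly clean HTML.
    $previous = libxml_use_internal_errors(true);
    $loaded = $doc->loadHTMLFile($location);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    if (!$loaded) {
        return 0;
    }

    $xpath = new DOMXPath($doc);
    return $xpath->query('//span[@class="page-numbers"]')->length;
}

// Demo with a temporary local file holding hypothetical sample markup.
$file = tempnam(sys_get_temp_dir(), 'pager');
file_put_contents($file, '<html><body><div class="pager">
<a href="/users?page=2"><span class="page-numbers">2</span></a>
<a href="/users?page=3"><span class="page-numbers">3</span></a>
</div></body></html>');

$count = countPageNumberSpans($file);
echo 'there are ', $count, ' span elements having class="page-numbers"', "\n";
unlink($file);
```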