How to get page content

I am trying to build a "latest news"-style feature for my website. So far I have done the following:

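$dom = new DOMDocument();
$dom->preserveWhiteSpace = false; // must be set before loading to take effect
@$dom->loadHTML(file_get_contents($url)); // @ silences warnings from malformed HTML
$linksToStore = $dom->getElementsByTagName('a');

$links = []; // href => link text
foreach ($linksToStore as $tag) {
    // textContent also covers empty links and links whose first child is an element
    $links[$tag->getAttribute('href')] = $tag->textContent;
}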

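How can I get the content of the pages that those links point to, regardless of which domain they are on? I have built a web crawler that can collect links from web pages related to a keyword, which in my case is 'medical'.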

Source

2012-11-25 vaibhav

Use the library at http://simplehtmldom.sourceforge.net/ to extract content from the page. Its selectors work the same way as jQuery's, which makes extracting content very efficient.

Also, have a look at http://davidwalsh.name/php-notifications to learn more.
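
If you would rather not add a library, here is a rough sketch of the same idea using PHP's bundled DOM extension instead of simple_html_dom. It is only one possible approach, not the library's API: the helper names extractLinks, extractText and fetchLinkedContent are made up for illustration, and it assumes the links are absolute http(s) URLs and that allow_url_fopen is enabled so file_get_contents can fetch them.

```php
<?php
// Sketch only: all three helpers below are hypothetical names, built on
// PHP's bundled DOM extension (DOMDocument, DOMXPath), not on simple_html_dom.

/** Parse an HTML string, suppressing warnings about sloppy markup. */
function loadDom(string $html): DOMDocument
{
    $dom = new DOMDocument();
    libxml_use_internal_errors(true);
    $dom->loadHTML($html);
    libxml_clear_errors();
    return $dom;
}

/** Return an href => link text map for every <a href> in an HTML string. */
function extractLinks(string $html): array
{
    if ($html === '') {
        return [];
    }
    $links = [];
    foreach ((new DOMXPath(loadDom($html)))->query('//a[@href]') as $a) {
        $links[$a->getAttribute('href')] = trim($a->textContent);
    }
    return $links;
}

/** Return the text of the page body, with runs of whitespace collapsed. */
function extractText(string $html): string
{
    if ($html === '') {
        return '';
    }
    $dom = loadDom($html);
    $xpath = new DOMXPath($dom);
    // Drop script and style elements so only readable text remains.
    foreach (iterator_to_array($xpath->query('//script|//style')) as $node) {
        $node->parentNode->removeChild($node);
    }
    $body = $dom->getElementsByTagName('body')->item(0);
    return $body ? trim(preg_replace('/\s+/u', ' ', $body->textContent)) : '';
}

/** Fetch each linked page and return url => text; unreachable pages are skipped. */
function fetchLinkedContent(array $links): array
{
    $pages = [];
    foreach (array_keys($links) as $url) {
        $url = (string) $url;
        // Only absolute http(s) URLs; relative links would need resolving first.
        if (!preg_match('#^https?://#i', $url)) {
            continue;
        }
        $html = @file_get_contents($url);
        if ($html !== false) {
            $pages[$url] = extractText($html);
        }
    }
    return $pages;
}
```

With the $links array from the question, something like $pages = fetchLinkedContent($links); would give an array keyed by URL, which you could then filter for your keyword. Fetching pages one at a time like this is slow for large link lists, so treat it as a starting point.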

Source

2012-11-25 08:24:29
