How to get a specific part or div of a website

What I want to do: take the text title of the top post on http://reddit.com/r/worldnews and output it to my own web page, which would contain only that text.

Eventually, I want to scrape the text from that web page using AppleScript and cURL and output it.

I'm making a script that tells me the top post when I click a button.

Edit: If you can think of any way to do it, I'd like to do the same thing with Facebook notifications.

Edit: I have PHP scraping the site and outputting it here: http://colejohnsoncreative.com/personal/ai/worldnews.php. This is the code I'm using:

<?php 
// Get a file into an array. In this example we'll go through HTTP to get 
// the HTML source of a URL. 
$lines = file('http://www.reddit.com/r/worldnews'); 

// Loop through our array, show HTML source as HTML source; and line numbers too. 
foreach ($lines as $line_num => $line) { 
    echo "Line #<b>{$line_num}</b> : " . htmlspecialchars($line) . "<br />\n"; 
} 

// Another example, let's get a web page into a string. See also file_get_contents(). 
$html = implode('', file('http://www.example.com/')); 

// Using the optional flags parameter since PHP 5 
$trimmed = file('somefile.txt', FILE_IGNORE_NEW_LINES | FILE_SKIP_EMPTY_LINES); 
?>

So I get the whole site's source code as output, but all I need for this project is

<a class="title " href="http://www.dailymail.co.uk/news/article-2219477/Cannabis-factory-couple-gave-400-000-drug-dealing-fortune-poor-Kenyans-jailed-years.html" >British couple who spent most of the money they made from canabis growing on paying for life changing operations and schooling for people in a poor Kenyan village gets sent to prison for 3 years.</a>

and everything else I need to throw away. How do I do that?

Source

2012-10-19 Cole

Take a look at the scraping approaches in http://stackoverflow.com/questions/26947/how-to-implement-a-web-scraper-in-php – Steven

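If you're in a shell, you can wget the page.

From PHP, you can fetch the page with file_get_contents.

From Java, you can fetch it with URLConnection.

Once you have the page in whatever language you're using, go through its text to find the part you want, then do whatever you like with it.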
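As one possible way to "go through the page" in PHP, the sketch below parses the HTML with the built-in DOM extension and keeps only links whose class contains `title`, matching the `<a class="title " ...>` element from the question. The helper name `extractTitles` and the sample markup are illustrative assumptions, not part of the original answer, and the live page's markup may differ from what the question shows.

```php
<?php
// Hypothetical helper: returns the text of every <a> whose class list contains "title".
function extractTitles(string $html): array
{
    $doc = new DOMDocument();
    // Real pages are rarely valid HTML; collect parse warnings instead of printing them.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($doc);
    // Match "title" as a whole word in the class attribute (the page uses class="title ").
    $query = '//a[contains(concat(" ", normalize-space(@class), " "), " title ")]';

    $titles = [];
    foreach ($xpath->query($query) as $node) {
        $titles[] = trim($node->textContent);
    }
    return $titles;
}

// Sample markup standing in for the downloaded page (an assumption for illustration).
$sample = '<div><a class="title " href="http://example.com/a">First headline</a>'
        . '<a class="other" href="#">ignore me</a>'
        . '<a class="title " href="http://example.com/b">Second headline</a></div>';

$titles = extractTitles($sample);
echo $titles[0], "\n"; // the first match is the top post
```

Against the real page, the input would come from something like `file_get_contents('http://www.reddit.com/r/worldnews')`, assuming `allow_url_fopen` is enabled and the site still serves markup with that class.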

Source

2012-10-19 02:06:16 case1352

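You have to do some parsing, so match the pattern you want. The simplest approach is to use something like strpos to get the position of the element you need, or to use a regular expression. Do they have an RSS feed? If so, you should use that instead.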
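Both suggestions can be sketched in a few lines of PHP. The helper names `firstTitleByPosition` and `allTitlesByRegex` and the sample markup are hypothetical. Both helpers assume the title links look like the `<a class="title " ...>` element quoted in the question, and both are fragile if that markup changes.

```php
<?php
// Hypothetical helper: find the first title link by string position.
// Note: the plain prefix search would also match a class such as "titlebar".
function firstTitleByPosition(string $html): ?string
{
    $start = strpos($html, '<a class="title');
    if ($start === false) {
        return null;
    }
    $textStart = strpos($html, '>', $start); // end of the opening tag
    if ($textStart === false) {
        return null;
    }
    $textStart++;
    $textEnd = strpos($html, '</a>', $textStart);
    if ($textEnd === false) {
        return null;
    }
    return html_entity_decode(trim(substr($html, $textStart, $textEnd - $textStart)));
}

// Hypothetical helper: collect every title link with a regular expression.
function allTitlesByRegex(string $html): array
{
    // \b keeps "title" from matching longer class names such as "titlebar".
    preg_match_all('/<a\s+class="title\b[^"]*"[^>]*>(.*?)<\/a>/is', $html, $matches);
    return array_map(function ($text) {
        return html_entity_decode(trim(strip_tags($text)));
    }, $matches[1]);
}

// Sample markup standing in for the downloaded page (an assumption for illustration).
$sample = '<p class="x">intro</p>'
        . '<a class="title " href="http://example.com/a" >Top &amp; first</a>'
        . '<a class="title " href="http://example.com/b" >Second</a>';

echo firstTitleByPosition($sample), "\n";
print_r(allTitlesByRegex($sample));
```

String positions and regular expressions work for a single, predictable element like this one, but they break easily on attribute reordering or nested tags, which is one reason a structured feed is preferable when the site offers one.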

Source

2012-10-19 03:15:33 xelber
