2011-05-03

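I have read through the other questions on this site and used the example answer given in the one below, on parsing the Wikipedia introduction in PHP: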

wikipedia api: get parsed introduction only

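I have got to the stage where I can retrieve the first section of a Wikipedia article. But the first section includes both images and text, and all I want is the text. Here is the response from my cURL request: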

Array
(
[parse] => Array 
    (
     [text] => Array 
      (
       [*] => <div class="dablink">This article is about sports known as football. For the ball used in these sports, see <a href="/wiki/Football_(ball)">Football (ball)</a>.</div> 
    <div class="thumb tright"> 
    <div class="thumbinner" style="width:227px;"><a href="/wiki/File:Football4.png" class="image"><img alt="" src="http://upload.wikimedia.org/wikipedia/commons/thumb/d/d2/Football4.png/225px-Football4.png" width="225" height="274" class="thumbimage" /></a> 
    <div class="thumbcaption"> 
    <div class="magnify"><a href="/wiki/File:Football4.png" class="internal" title="Enlarge"><img src="http://bits.wikimedia.org/skins-1.17/common/images/magnify-clip.png" width="15" height="11" alt="" /></a></div> 
    Some of the many different games known as football. From top left to bottom right:  <a href="/wiki/Association_football">Association football</a> or soccer, <a href="/wiki/Australian_rules_football">Australian rules football</a>, <a href="/wiki/International_rules_football">International rules football</a>, <a href="/wiki/Rugby_Union" class="mw-redirect" title="Rugby Union">Rugby Union</a>, <a href="/wiki/Rugby_League" class="mw-redirect" title="Rugby League">Rugby League</a>, and <a href="/wiki/American_Football" class="mw-redirect" title="American Football">American Football</a>.</div> 
    </div> 
    </div> 
    <p>The game of <b>football</b> is any of several similar <a href="/wiki/Team_sport" title="Team sport">team sports</a>, of similar origins which involve advancing a ball into a goal area in an attempt to score. Many of these involve <a href="/wiki/Kick_(football)" title="Kick (football)">kicking</a> a ball with the foot to score a <a href="/wiki/Goal_(sport)" title="Goal (sport)">goal</a>, though not all codes of football using kicking as a primary means of advancing the ball or scoring. The most popular of these sports worldwide is <a href="/wiki/Association_football">association football</a>, more commonly known as just "football" or "soccer". Unqualified, the word <i><a href="/wiki/Football_(word)" title="Football (word)">football</a></i> applies to whichever form of football is the most popular in the regional context in which the word appears, including <a href="/wiki/American_football">American football</a>, <a href="/wiki/Australian_rules_football">Australian rules football</a>, <a href="/wiki/Canadian_football">Canadian football</a>, <a href="/wiki/Gaelic_football">Gaelic football</a>, <a href="/wiki/Rugby_league">rugby league</a>, <a href="/wiki/Rugby_union">rugby union</a> and other related games. These variations are known as "codes".</p> 
    <div class="toclimit-3"></div> 

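What I actually want, if that helps, is the content inside the paragraph tag of the HTML output (the part that starts with "The game of"), so that I can grab that data in PHP.

The URL I am requesting looks like this: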

'http://en.wikipedia.org/w/api.php?action=parse&page='.$search.'&redirects=1&format=json&prop=text&section=0' 

Sample code I have tried:

<?php 

include_once('simple_html_dom.php'); 

$html = file_get_html('http://amazon.co.uk/'); 

foreach($html->find('p') as $element) 
{ 
    echo $element->plaintext . '<br>'; 
} 

?> 

Unfortunately, this returns a blank page.

Answers

1

Just download the Simple HTML DOM parser.

Then use this:

include_once('simple_html_dom.php'); 

$html = file_get_html('http://en.wikipedia.org/wiki/Football'); 

foreach($html->find('p') as $element) 
{ 
    echo $element->plaintext . '<br>'; 
    break; 
} 
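The question asks for the text of the first paragraph from the `action=parse` JSON response rather than from the rendered article page. A minimal sketch of that, assuming the response has already been decoded into the `parse` → `text` → `*` array shape shown in the question, could use PHP's built-in DOM extension instead of Simple HTML DOM. The helper name `firstParagraphText` and the shortened sample markup are hypothetical, chosen only for illustration.

```php
<?php
// Hypothetical helper (not from the original answer): returns the plain
// text of the first non-empty <p> element found in an HTML fragment.
function firstParagraphText(string $html): string
{
    $doc = new DOMDocument();
    // API output is an HTML fragment, so collect parser warnings
    // internally instead of printing them.
    $previous = libxml_use_internal_errors(true);
    // The XML prolog is a common trick to make loadHTML() read UTF-8.
    $doc->loadHTML('<?xml encoding="UTF-8">' . $html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    foreach ($doc->getElementsByTagName('p') as $p) {
        $text = trim($p->textContent); // textContent drops the inner tags
        if ($text !== '') {
            return $text;
        }
    }
    return '';
}

// Stand-in for the decoded API response. In real use this array would come
// from json_decode($responseBody, true) on the cURL result.
$response = [
    'parse' => [
        'text' => [
            '*' => '<div class="thumb tright"><div class="thumbcaption">Caption text</div></div>'
                 . '<p>The game of <b>football</b> is any of several similar '
                 . '<a href="/wiki/Team_sport">team sports</a>.</p>',
        ],
    ],
];

echo firstParagraphText($response['parse']['text']['*']), "\n";
```

Because the image thumbnail and its caption live in `div` elements, selecting only `p` elements skips them, which is the same idea as `find('p')` followed by `break` in the answer above.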

I have tried this with both Wikipedia and Amazon with no luck; it just retrieves a blank page. I included the DOM parser as well. – DIM3NSION 2011-05-03 11:45:16


I am using this code: <?php include_once('simple_html_dom.php'); $html = file_get_html('http://www.amazon.co.uk/'); foreach($html->find('p') as $element) { echo $element->plaintext . '<br>'; break; } ?> – DIM3NSION 2011-05-03 11:45:24


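There could be many explanations, but it works for me if I use '$html = file_get_html('http://amazon.co.uk/');'. I cannot see in your comment whether that is the URL you used, because the comment system here has turned it into a clickable link. – 2011-05-03 13:45:15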
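Since a blank page gives no hint about which of those explanations applies, one way to narrow it down is to turn on error display and check whether the download itself succeeded. The sketch below uses only built-in PHP functions. The helper name `fetchBody` is hypothetical, and the thread does not confirm that a failed request is the actual cause here.

```php
<?php
// While debugging, show errors instead of an empty page.
error_reporting(E_ALL);
ini_set('display_errors', '1');

// Hypothetical helper: fetch a URL (or a local path) and fail loudly
// with the underlying error message instead of continuing silently.
function fetchBody(string $url): string
{
    // @ hides the raw warning; error_get_last() still records it.
    $body = @file_get_contents($url);
    if ($body === false) {
        $error = error_get_last();
        throw new RuntimeException(
            'Could not fetch ' . $url . ': ' . ($error['message'] ?? 'unknown error')
        );
    }
    return $body;
}
```

Calling `fetchBody('http://en.wikipedia.org/wiki/Football')` inside a `try`/`catch` block and echoing the exception message would show whether the request fails before any parsing is attempted.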