2010-07-06

php xpath: query within a query result

I want to parse an HTML file.

The idea is to get the spans with the title and desc classes, and to read their contents within each div that has the attribute class='thebest'.

Here is my code:

<?php 

$example=<<<KFIR 
<html> 
<head> 
<title>test</title> 
</head> 
<body> 
<div class="a">moshe1 
<div class="aa">haim</div> 
</div> 
<div class="a">moshe2</div> 
<div class="b">moshe3</div> 

<div class="thebest"> 
<span class="title">title1</span> 
<span class="desc">desc1</span> 
</div> 
<div class="thebest"> 
<span class="title">title2</span> 
<span class="desc">desc2</span> 
</div> 

</body> 
</html> 
KFIR; 


$doc = new DOMDocument(); 
@$doc->loadHTML($example); 
$xpath = new DOMXPath($doc); 
$expression="//div[@class='thebest']"; 
$arts = $xpath->query($expression); 

foreach ($arts as $art) { 
    $arts2=$xpath->query("//span[@class='title']",$art); 
    echo $arts2->item(0)->nodeValue; 
    $arts2=$xpath->query("//span[@class='desc']",$art); 
    echo $arts2->item(0)->nodeValue; 
} 
echo "done"; 

The expected result is:

title1desc1title2desc2done 

But the result I get is:

title1desc1title1desc1done 

Answers

10

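Make the inner query relative: start it with a dot (e.g. ".//…"). A path that begins with // is absolute and is evaluated from the document root, so the context node passed as the second argument does not limit where it searches.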

foreach ($arts as $art) { 
    // Note: single slash (direct child) 
    $titles = $xpath->query("./span[@class='title']", $art); 
    if ($titles->length > 0) { 
        $title = $titles->item(0)->nodeValue; 
        echo $title; 
    } 

    $descs = $xpath->query("./span[@class='desc']", $art); 
    if ($descs->length > 0) { 
        $desc = $descs->item(0)->nodeValue; 
        echo $desc; 
    } 
} 
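
To see the difference side by side, here is a minimal self-contained sketch. The helper name collectTitles and the trimmed-down markup are made up for illustration (they are not from the question), and the type declarations assume PHP 7 or later. It runs the same loop twice, once with the absolute inner query and once with the relative one.

```php
<?php
// Trimmed-down version of the question's markup (illustrative only).
$html = <<<MARKUP
<html><body>
<div class="thebest"><span class="title">title1</span><span class="desc">desc1</span></div>
<div class="thebest"><span class="title">title2</span><span class="desc">desc2</span></div>
</body></html>
MARKUP;

// Hypothetical helper: for every div.thebest, run $spanQuery with that div
// as the context node and concatenate the first match of each.
function collectTitles(string $html, string $spanQuery): string
{
    libxml_use_internal_errors(true); // collect parser warnings instead of printing them
    $doc = new DOMDocument();
    $doc->loadHTML($html);
    libxml_clear_errors();

    $xpath = new DOMXPath($doc);
    $out = '';
    foreach ($xpath->query("//div[@class='thebest']") as $div) {
        $spans = $xpath->query($spanQuery, $div);
        if ($spans->length > 0) {
            $out .= $spans->item(0)->nodeValue;
        }
    }
    return $out;
}

// Absolute path: starts at the document root on every iteration.
echo collectTitles($html, "//span[@class='title']"), "\n";  // title1title1
// Relative path: starts at the current div.
echo collectTitles($html, ".//span[@class='title']"), "\n"; // title1title2
```

Using ".//" matches spans at any depth under the div, while "./" (as in the loop above) matches direct children only; for this markup both give the same result.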
1

Instead of running a second query, try textContent:

foreach ($arts as $art) { 
    echo $art->textContent; 
} 

textContent returns the text content of this node and its descendants.
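
One caveat worth checking: because textContent covers all descendants, it also includes the whitespace text nodes between the spans, so the output may not be exactly title1desc1. The sketch below is one possible way to handle that, assuming the values themselves contain no whitespace you want to keep; the variable names are illustrative.

```php
<?php
$html = <<<MARKUP
<div class="thebest">
  <span class="title">title1</span>
  <span class="desc">desc1</span>
</div>
MARKUP;

libxml_use_internal_errors(true); // collect parser warnings instead of printing them
$doc = new DOMDocument();
$doc->loadHTML($html);
libxml_clear_errors();

$xpath = new DOMXPath($doc);
$div = $xpath->query("//div[@class='thebest']")->item(0);

// Raw text of the div: span texts plus the newlines and indentation around them.
$raw = $div->textContent;

// Assumption: stripping ALL whitespace is acceptable here. This would also
// remove spaces inside a value such as "my title".
$compact = preg_replace('/\s+/', '', $raw);

echo $compact, "\n"; // title1desc1
```

If the values can contain spaces, querying the spans individually keeps each value intact.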

Alternatively, change the XPath to:

$expression="//div[@class='thebest']/span[@class='title' or @class='desc']"; 
$arts = $xpath->query($expression); 

foreach ($arts as $art) { 
    echo $art->nodeValue; 
} 

This fetches the span children of the divs with class thebest whose class is title or desc.