2014-05-14 29 views
1

Can someone give me a hand? Wrapping elements in a div with PHP DOM

I'm trying to get information from some pages whose HTML looks like this.

<div class="block"> 
    <h2>Season 1</h2> 
    <div class="episode"><a href="somelink.com">Episode 1</a></div> 
    <div class="episode"><a href="somelink.com">Episode 2</a></div> 
    <h2>Season 2</h2> 
    <div class="episode"><a href="somelink.com">Episode 1</a></div> 
</div> 

What I'm stuck on is this: for each season, I want to wrap the season heading and its episode divs inside a div, for example

<div class="block"> 
    <div class="season"> 
     <h2>Season 1</h2> 
     <div class="episode"><a href="somelink.com">Episode 1</a></div> 
     <div class="episode"><a href="somelink.com">Episode 2</a></div> 
    </div> 
    <div class="season"> 
     <h2>Season 2</h2> 
     <div class="episode"><a href="somelink.com">Episode 1</a></div> 
    </div> 
</div> 

And this is the PHP code I'm working with:
$page = "someurl.com"; 

$page = $this->curl->get($page); 
$dom = new DOMDocument(); 
@$dom->loadHTML($page); 

$divs = $dom->getElementsByTagName('div'); 
for ($i = 0; $i < $divs->length; $i++) {
    if ($divs->item($i)->getAttribute("class") == "block") {
        $h2s = $divs->item($i)->getElementsByTagName('h2');
        // DOMNodeList exposes its size via ->length; count() only works on PHP 7.2+
        if ($h2s->length > 0) {
            foreach ($h2s as $h2) {
                // Stuck at this point
            }
        }
    }
}

How can I do this with PHP DOM? Could someone please give me an example? Thanks.

+2

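Regardless of who might solve this for you, we would all like you to try first and show in your question what you have already tried. That way you can learn what you did wrong. – bestprogrammerintheworld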

+0

Updated my question – user3375691

+0

What are you using to represent/parse the DOM structure? –

Answers

1

The following code wraps each <h2> and its .episode siblings in a .season container.

$page = '<div class="block">
    <h2>Season 1</h2>
    <div class="episode"><a href="s1ep1.com">Episode 1</a></div>
    <div class="episode"><a href="s1ep2.com">Episode 2</a></div>
    <h2>Season 2</h2>
    <div class="episode"><a href="s2ep1.com">Episode 1</a></div>
    <div class="episode"><a href="s2ep2.com">Episode 2</a></div>
</div>';

$dom = new DOMDocument();

// Collect libxml parse warnings internally instead of emitting them
$origVal = libxml_use_internal_errors(true);
$dom->loadHTML($page);
libxml_clear_errors();
libxml_use_internal_errors($origVal);

// Create a template 'season' div
$season = $dom->createElement('div');
$season->setAttribute('class', 'season');

// Get all '.block' elements using XPath
$xpath = new DOMXPath($dom);
$divs = $xpath->query("//*[@class='block']");

foreach ($divs as $currDiv) {

    // If the 'block' contains no <h2> elements, skip it
    if (!$currDiv->getElementsByTagName('h2')->length) {
        continue;
    }

    // Reset per block so seasons from an earlier block are not reused
    $clones = array();
    $clone = null;

    foreach ($currDiv->childNodes as $child) {

        // Ignore whitespace (and other text content) and comments in the 'block' div
        if (!($child instanceof DOMElement)) {
            continue;
        }

        if ($child->nodeName == 'h2') {
            if ($clone) {
                // Save the finished 'season' div for later use
                $clones[] = $clone;
            }

            $clone = $season->cloneNode(true);
        }

        // This is the tricky part. Appending the original child would move it into $clone,
        // which changes the HTML structure and disrupts the current loop,
        // so we append a clone of the child to the 'season' div instead.
        // Episodes that appear before the first <h2> have no season and are dropped.
        if ($clone && ($child->nodeName == 'h2' || $child->getAttribute('class') == 'episode')) {
            $clone->appendChild($child->cloneNode(true));
        }
    }
    if ($clone) {
        $clones[] = $clone;
    }

    // Remove all children of the current 'block' div
    while ($currDiv->firstChild) {
        $currDiv->removeChild($currDiv->firstChild);
    }

    // Insert all 'season' nodes into it
    foreach ($clones as $c) {
        $currDiv->appendChild($c);
    }
}

echo $dom->saveHTML();
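
The "tricky part" comment in the code above says that appending an original child moves it and disturbs the loop over the live child list. One way around that, shown here only as a rough sketch and not as part of the original answer, is to copy the child list into a plain array first and then move the nodes in place instead of cloning them. The helper name wrapSeasons is hypothetical. The sketch also assumes that every element following an <h2> belongs to that season, so it does not filter on the episode class.

```php
<?php
// Hypothetical helper (not from the original answer): wraps each <h2> and the
// sibling elements that follow it in a <div class="season">, moving nodes in place.
function wrapSeasons(DOMElement $block)
{
    $doc = $block->ownerDocument;
    $season = null;

    // Copy the live child list first; moving nodes while iterating it directly skips entries.
    foreach (iterator_to_array($block->childNodes) as $child) {
        if (!($child instanceof DOMElement)) {
            continue; // leave whitespace and comments where they are
        }
        if ($child->nodeName === 'h2') {
            // Start a new wrapper at the position of the heading
            $season = $doc->createElement('div');
            $season->setAttribute('class', 'season');
            $block->insertBefore($season, $child);
        }
        if ($season !== null) {
            $season->appendChild($child); // appendChild moves the node out of $block
        }
    }
}

$html = '<div class="block"><h2>Season 1</h2>'
      . '<div class="episode"><a href="s1ep1.com">Episode 1</a></div>'
      . '<div class="episode"><a href="s1ep2.com">Episode 2</a></div>'
      . '<h2>Season 2</h2>'
      . '<div class="episode"><a href="s2ep1.com">Episode 1</a></div></div>';

$dom = new DOMDocument();
$previous = libxml_use_internal_errors(true);
$dom->loadHTML($html);
libxml_clear_errors();
libxml_use_internal_errors($previous);

$xpath = new DOMXPath($dom);
foreach ($xpath->query("//div[@class='block']") as $block) {
    wrapSeasons($block);
}

// Count the wrappers that now sit directly inside the block
$seasons = $xpath->query("//div[@class='block']/div[@class='season']");
echo $seasons->length, "\n"; // prints 2
```

Because the nodes are moved rather than cloned, nothing has to be removed and reinserted afterwards, and any elements that appear before the first <h2> stay where they were.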
+0

I really appreciate you taking the time to write this. Thank you very much :) – user3375691