2013-01-11 36 views
2

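How do I get stylesheet URLs using PHP DOM, XPath, or regex?

I am building a custom library that merges all screen CSS stylesheets, but I don't know how to fetch only the stylesheets that apply to the media type screen. For example: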

<!-- This should be fetched --> 
<link href="http://www.domain.com/style.css" rel="stylesheet" type="text/css" /> 
<!-- This should be fetched --> 
<link href="http://www.domain.com/ie.css" rel="stylesheet" type="text/css" /> 

<style type="text/css" media="all"> 
    <!-- This should be fetched --> 
    @import url("http://static.php.net/www.php.net/styles/phpnet.css"); 
</style> 

<style type="text/css" media="screen"> 
    <!-- This should be fetched --> 
    @import url("http://static.php.net/www.php.net/styles/site.css"); 
</style> 

<style type="text/css" media="print"> 
    <!-- This should NOT be fetched since it is media type print --> 
    @import url("http://static.php.net/www.php.net/styles/print.css"); 
</style> 

Given the string above, I just want to extract the href and url values. I don't know how to approach this, although I did try:

preg_match_all("/(url\([\'\"]?)([^\"\'\)]+)([\"\']?\))/", $html, $matches); 
print_r($matches); 

But it does not return them.

Is there any solution that does this with PHP DOM, XPath, or regex?

+0

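You will want to parse the DOM. This might make it easier for you: http://code.google.com/p/ganon/ (disclaimer: I have never used it, but it looks like it supports what you need) – Levi

Answer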

4

Here is working code! I also created a codepad paste for you: http://codepad.org/WQzcO3k3

<?php 

$inputString = '<!-- This should be fetched --> 
<link href="http://www.domain.com/style.css" rel="stylesheet" type="text/css" /> 
<!-- This should be fetched --> 
<link href="http://www.domain.com/ie.css" rel="stylesheet" type="text/css" /> 

<style type="text/css" media="all"> 
    <!-- This should be fetched --> 
    @import url("http://static.php.net/www.php.net/styles/phpnet.css"); 
</style> 

<style type="text/css" media="screen"> 
    <!-- This should be fetched --> 
    @import url("http://static.php.net/www.php.net/styles/site.css"); 
</style> 

<style type="text/css" media="print"> 
    <!-- This should NOT be fetched since it is media type print --> 
    @import url("http://static.php.net/www.php.net/styles/print.css"); 
</style>'; 
$outputUrls = array(); 

libxml_use_internal_errors(true); // collect HTML parse warnings instead of silencing them with @ 
$doc = new DOMDocument(); 
$doc->loadHTML($inputString); 
$xml = simplexml_import_dom($doc); // just to make the XPath simpler 

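// Match <link rel="stylesheet"> and <style> elements whose media is unset, "all", or "screen", 
// so that a <link rel="stylesheet" media="print"> is skipped as well. 
$linksOrStyles = $xml->xpath('//link[@rel="stylesheet"][not(@media) or @media="all" or @media="screen"] | //style[not(@media) or @media="all" or @media="screen"]'); 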


//print_r($linksOrStyles); 

foreach ($linksOrStyles as $linkOrStyleSimpleXMLElementObj) 
{ 
    if ($linkOrStyleSimpleXMLElementObj->xpath('@href') != false) { 
     $outputUrls[] = $linkOrStyleSimpleXMLElementObj['href'] . ''; 
    } else { 
     // Get the url(...) values from the <style> body. 
     // NOTE: the codepad.org version used strpos because codepad.org 
     // does not support preg. preg_match_all also handles https URLs, 
     // single quotes, and several @import rules in one block. 
     $css = (string) $linkOrStyleSimpleXMLElementObj; 
     if (preg_match_all('/url\(\s*[\'"]?([^\'")]+)[\'"]?\s*\)/i', $css, $matches)) { 
      foreach ($matches[1] as $url) { 
       $outputUrls[] = trim($url); 
      } 
     } 
    } 
} 

echo 'Output Url list: '; 
print_r($outputUrls); 

?> 
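
The same idea can be sketched with DOMXPath directly, without going through SimpleXML. This is only a sketch under a few assumptions: the function names `extractScreenStylesheetUrls` and `appliesToScreen` are made up for illustration, a missing `media` attribute is treated as applying to every medium, and a `media` value counts as screen only when one of its comma-separated entries starts with `all` or `screen` (optionally preceded by `only`). Media queries without a type, such as `(min-width: 600px)`, are not handled.

```php
<?php
// Hypothetical helper: returns the stylesheet URLs that apply to screen media.
function extractScreenStylesheetUrls(string $html): array
{
    // Collect HTML parse warnings internally instead of emitting them.
    $previous = libxml_use_internal_errors(true);
    $doc = new DOMDocument();
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($doc);
    $urls = [];

    // Nodes come back in document order.
    foreach ($xpath->query('//link[@href] | //style') as $node) {
        if (!appliesToScreen($node->getAttribute('media'))) {
            continue;
        }

        if ($node->nodeName === 'link') {
            // rel may hold several space-separated tokens.
            $rel = preg_split('/\s+/', strtolower(trim($node->getAttribute('rel'))));
            if (in_array('stylesheet', $rel, true)) {
                $urls[] = $node->getAttribute('href');
            }
            continue;
        }

        // <style>: collect every @import target, with or without url(...).
        $pattern = '/@import\s+(?:url\(\s*)?[\'"]?([^\'")\s;]+)/i';
        if (preg_match_all($pattern, $node->textContent, $matches)) {
            foreach ($matches[1] as $url) {
                $urls[] = $url;
            }
        }
    }

    return $urls;
}

// Hypothetical helper: decides whether a media attribute value covers screen.
function appliesToScreen(string $media): bool
{
    $media = strtolower(trim($media));
    if ($media === '') {
        return true; // assumption: no media attribute means all media
    }
    foreach (explode(',', $media) as $query) {
        if (preg_match('/^\s*(only\s+)?(all|screen)\b/', $query)) {
            return true;
        }
    }
    return false;
}
```

Calling `extractScreenStylesheetUrls($inputString)` on the string from the question should return the two `<link>` URLs plus `phpnet.css` and `site.css`, and skip `print.css`. Restricting the pattern to `@import` keeps `url(...)` values from ordinary rules, such as background images, out of the list.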
+1

Thanks, @Dev01. – OMG