PHP: remove links to a specific website but keep the text

For example, <a href="http://msdn.microsoft.com/art029nr/">remove links to here but keep text</a> but <a href="http://herpyderp.com">leave all other links alone</a>

I have been trying to solve this with preg_replace. I searched here but could not find an answer that solves this exact problem.

The answer to "PHP: Remove all hyperlinks of specific domain from text" removes links to a specific URL, but it also removes the link text.

The page at http://php-opensource-help.blogspot.ie/2010/10/how-to-remove-hyperlink-from-string.html removes hyperlinks from a string, but I can't seem to modify its pattern so that it applies only to a specific website.

Source

2013-02-11 Danny

[Don't parse HTML with regular expressions](http://stackoverflow.com/a/1732454/344643); use an [XML parser](http://us2.php.net/manual/en/class.domdocument.php) instead. – 2013-02-11 00:13:02

$html = '...I can haz HTML?...';
$whitelist = array('herpyderp.com', 'google.com');

$dom = new DOMDocument();
$dom->loadHTML($html);
$links = $dom->getElementsByTagName('a');

// getElementsByTagName() returns a live list, so iterate over a snapshot;
// otherwise removing a link inside the loop can cause later links to be skipped
foreach (iterator_to_array($links) as $link) {
    $host = parse_url($link->getAttribute('href'), PHP_URL_HOST);

    if ($host && !in_array($host, $whitelist)) {

        // create a text node with the contents of the non-whitelisted link
        $text = $dom->createTextNode($link->nodeValue);

        // insert it before the link
        $link->parentNode->insertBefore($text, $link);

        // and remove the link
        $link->parentNode->removeChild($link);
    }
}

// drop the wrapping tags added by the parser (doctype, <html>, <body>)
// by serializing only the children of <body>
$body = $dom->getElementsByTagName('body')->item(0);
$html = '';
foreach ($body->childNodes as $child) {
    $html .= $dom->saveHTML($child);
}
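
The question asks for the inverse of a whitelist: unlink one specific site and leave every other link alone. A minimal sketch of that variant follows. The helper name `unlinkHosts()` and the `$blacklist` parameter are illustrative assumptions, not part of the answer above, and the sample markup is the example from the question wrapped in a `<p>`.

```php
<?php
// Hypothetical helper (not from the original answer): replaces every link
// whose host is listed in $blacklist with its text, leaving other links alone.
function unlinkHosts(string $html, array $blacklist): string
{
    $dom = new DOMDocument();
    $dom->loadHTML($html);

    // Snapshot the live node list before modifying the tree.
    foreach (iterator_to_array($dom->getElementsByTagName('a')) as $link) {
        $host = parse_url($link->getAttribute('href'), PHP_URL_HOST);

        if ($host && in_array(strtolower($host), $blacklist, true)) {
            // Swap the <a> element for a plain text node with the same text.
            $text = $dom->createTextNode($link->textContent);
            $link->parentNode->replaceChild($text, $link);
        }
    }

    // Serialize only the children of <body>, skipping the parser's wrappers.
    $out = '';
    $body = $dom->getElementsByTagName('body')->item(0);
    foreach ($body->childNodes as $child) {
        $out .= $dom->saveHTML($child);
    }
    return $out;
}

$html = '<p><a href="http://msdn.microsoft.com/art029nr/">remove links to here but keep text</a>'
      . ' but <a href="http://herpyderp.com">leave all other links alone</a></p>';

echo unlinkHosts($html, array('msdn.microsoft.com'));
```

Only the host is compared here, and the comparison is exact, so `microsoft.com` in the list would not match `msdn.microsoft.com`.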

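For those who are afraid to use DOMDocument instead of preg_replace for performance reasons: I ran a quick test between this code and the code linked in the question (the one that removes the link entirely), and DOMDocument was only ~4 times slower.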

Source

2013-02-11 00:31:42

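Thanks a lot. The URL being a subdomain seemed to cause a problem, but I was able to work around it by entering the first part as well. The only links that are not removed are URLs containing commas and quotes, which produce "Warning: DOMDocument::loadHTML(): htmlParseEntityRef". Do you know a way to fix this? Thanks again. – Danny 2013-02-11 02:23:42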
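
The subdomain problem presumably comes from `in_array()` comparing the full host, so a list entry of `microsoft.com` never equals `msdn.microsoft.com`. One way to handle it is to accept either an exact match or a suffix match on a dot boundary. The helper name `hostMatches()` below is a hypothetical illustration, not something from the answer.

```php
<?php
// Hypothetical helper: true if $host equals $domain or is a subdomain of it.
function hostMatches(string $host, string $domain): bool
{
    $host = strtolower($host);
    $domain = strtolower($domain);

    if ($host === $domain) {
        return true;
    }

    // Require a leading dot so "notmicrosoft.com" does not match "microsoft.com".
    $suffix = '.' . $domain;
    return substr($host, -strlen($suffix)) === $suffix;
}

var_dump(hostMatches('msdn.microsoft.com', 'microsoft.com')); // bool(true)
var_dump(hostMatches('notmicrosoft.com', 'microsoft.com'));   // bool(false)
```

In the loop from the answer, the `in_array()` test could then be replaced by a loop over the list that calls this helper for each entry.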

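If the HTML is malformed, disable the errors (see [this answer](http://stackoverflow.com/a/7082487/1058140)). I only do a host check here. If you want to check the whole URL, the path, and so on, read the documentation page for `parse_url()`. – 2013-02-11 02:39:08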
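
Both suggestions can be sketched together. `libxml_use_internal_errors(true)` makes libxml collect parse errors internally instead of raising PHP warnings, and `parse_url()` called without a component argument returns an array of parts such as `host` and `path`. The sample markup and the `/art029nr/` path check are illustrative assumptions based on the URL in the question.

```php
<?php
// Sample input (an assumption): an unescaped "&" in an href is a typical
// trigger for the htmlParseEntityRef warning.
$html = '<p><a href="http://msdn.microsoft.com/art029nr/?a=1&b=2">text</a></p>';

// Collect libxml errors internally instead of emitting PHP warnings,
// remembering the previous setting so it can be restored afterwards.
$previous = libxml_use_internal_errors(true);

$dom = new DOMDocument();
$dom->loadHTML($html);

libxml_clear_errors();
libxml_use_internal_errors($previous);

$link  = $dom->getElementsByTagName('a')->item(0);
$parts = parse_url($link->getAttribute('href'));

// Check both the host and the start of the path, not just the host.
$isTarget = ($parts['host'] ?? '') === 'msdn.microsoft.com'
    && strpos($parts['path'] ?? '', '/art029nr/') === 0;

var_dump($isTarget);
```

Suppressing the errors only hides the warnings. The parser still recovers from the malformed markup as best it can, so the resulting tree is worth spot-checking on real input.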
