Problem using preg_replace

I have this line, which works great:

$listing['biz_description'] = preg_replace('/<!--.*?--\>/','',$listing['biz_description']);

What is the correct regular expression to remove the HTML-entity version of the comment?

This is the entity form:

&lt;!-- --&gt;

2010-08-11 Joe

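If you are happy with the regular expression you already pass to preg_replace, I would just decode the HTML entities first with html_entity_decode. As @ircmaxell mentioned, parsing HTML with regular expressions can be very painful.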

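$str = "This is a <!-- test --> of the emergency &lt;!-- broadcast --&gt; system";
// Decode the entities first so the existing comment pattern matches both forms.
// The pattern needs its closing "/" delimiter, otherwise preg_replace fails.
$str = preg_replace('/<!--.*?--\>/', '', html_entity_decode($str));
echo $str;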
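
If decoding the whole string is not wanted (html_entity_decode also converts every other entity in the text, such as &amp;), the pattern can match the encoded form directly instead. This is a minimal sketch, assuming the comments appear exactly as `<!-- ... -->` or `&lt;!-- ... --&gt;`; the function name stripEncodedComments is hypothetical and not from the original posts.

```php
<?php
// Hypothetical helper: removes literal and entity-encoded HTML comments
// while leaving every other entity in the string untouched.
function stripEncodedComments(string $text): string
{
    // (?:<|&lt;) matches either form of the opening bracket and
    // (?:>|&gt;) either form of the closing one. The "s" modifier lets
    // .*? span line breaks inside a comment.
    $pattern = '/(?:<|&lt;)!--.*?--(?:>|&gt;)/s';

    // preg_replace returns null on a PCRE error; fall back to the input.
    return preg_replace($pattern, '', $text) ?? $text;
}

$str = "This is a <!-- test --> of the emergency &lt;!-- broadcast --&gt; system";
echo stripEncodedComments($str);
```

The surrounding spaces are kept, so each removed comment leaves two adjacent spaces behind.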

2010-08-11 20:42:03 sberry

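Duh, I should have thought of that sooner, lol. The regex only ever parses the description field, so the load on the server is not a big concern. Thanks! – Joe 2010-08-11 20:55:47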

NEVER use regex to parse HTML/XML ...

An implementation using DomDocument (assuming valid XML):

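$dom = new DomDocument();
$dom->loadXml($listing['biz_description']);
removeComments($dom);
$listing['biz_description'] = $dom->saveXml();

// Recursively removes every comment node below $node.
function removeComments(DomNode $node) {
    if ($node instanceof DomComment) {
        $node->parentNode->removeChild($node);
    } elseif ($node->hasChildNodes()) {
        // childNodes is a live list, so iterate over a snapshot;
        // removing a child mid-loop would otherwise skip its next sibling.
        foreach (iterator_to_array($node->childNodes) as $child) {
            removeComments($child);
        }
    }
}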
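
The same removal can also be written without recursion by selecting the comment nodes with an XPath query. This is a minimal sketch rather than the answer's own method, and it assumes the input is well-formed XML with a single root element; the function name stripXmlComments is hypothetical.

```php
<?php
// Hypothetical helper, not part of the original answer.
function stripXmlComments(string $xml): string
{
    $dom = new DOMDocument();
    $dom->loadXML($xml); // assumes well-formed XML with one root element

    $xpath = new DOMXPath($dom);
    // The XPath result is a fixed set of matches rather than a live list,
    // so nodes can be removed while looping over it.
    foreach ($xpath->query('//comment()') as $comment) {
        $comment->parentNode->removeChild($comment);
    }

    // Serialize only the root element, which omits the XML declaration.
    return $dom->saveXML($dom->documentElement);
}

echo stripXmlComments('<div>Keep <!-- drop --> this<p>and <!-- drop --> this</p></div>');
```

An HTML fragment without a single enclosing element would need to be wrapped in one before it is passed to loadXML.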

2010-08-11 20:44:51 ircmaxell
