PHP DOMDocument adds extra tags

I'm trying to parse a document, get all of its image tags, and change each image's source to something different.
$domDocument = new DOMDocument();
$domDocument->loadHTML($text);
$imageNodeList = $domDocument->getElementsByTagName('img');
foreach ($imageNodeList as $Image) {
    $Image->setAttribute('src', 'lalala');
    $domDocument->saveHTML($Image);
}
$text = $domDocument->saveHTML();
$text initially looks like this:
<p>Hi, this is a test, here is an image<img src="http://mysite.com/beer.jpg" width="60" height="95" /> Because I like Beer!</p>
and this is the resulting $text:
<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN" "http://www.w3.org/TR/REC-html40/loose.dtd">
<html><body><p>Hi, this is a test, here is an image<img src="lalala" width="68" height="95"> Because I like Beer!</p></body></html>
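I'm getting a bunch of extra markup (the html and body tags, plus the doctype declaration at the top) that I don't need. Is there any way to set up DOMDocument so that it doesn't add these extra tags?
Thanks!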
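
One possible approach, sketched here under some assumptions: since PHP 5.4 (with libxml 2.7.8 or newer), loadHTML() accepts libxml option flags, and LIBXML_HTML_NOIMPLIED together with LIBXML_HTML_NODEFDTD asks the parser not to add the implied html/body elements or a default doctype. The function name replaceImageSources is hypothetical and not part of the original code, the type hints assume PHP 7 or later, and this works most predictably when the fragment has a single top-level element, as the sample paragraph does.

```php
<?php
// Hypothetical helper (not from the original post): rewrites the src of
// every <img> in an HTML fragment without adding wrapper markup.
function replaceImageSources(string $html, string $newSrc): string
{
    $doc = new DOMDocument();

    // LIBXML_HTML_NOIMPLIED: do not add implied <html>/<body> elements.
    // LIBXML_HTML_NODEFDTD:  do not add a default doctype declaration.
    $doc->loadHTML($html, LIBXML_HTML_NOIMPLIED | LIBXML_HTML_NODEFDTD);

    foreach ($doc->getElementsByTagName('img') as $img) {
        $img->setAttribute('src', $newSrc);
    }

    // saveHTML() serializes the whole document; trim() drops surrounding
    // whitespace such as a trailing newline.
    return trim($doc->saveHTML());
}

$text = '<p>Hi, this is a test, here is an image'
      . '<img src="http://mysite.com/beer.jpg" width="60" height="95" />'
      . ' Because I like Beer!</p>';

echo replaceImageSources($text, 'lalala'), PHP_EOL;
```

If the input may contain several top-level elements, these flags can produce surprising nesting, so it is worth checking the output against real data before relying on it.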
It should be: $text = preg_replace('/^<!DOCTYPE.+?>/', '', str_replace(array('<html>', '</html>', '<body>', '</body>'), array('', '', '', ''), $domDocument->saveHTML())); – 2011-05-14 06:58:13
`preg_replace`, really? – sglessard 2017-09-15 16:52:45
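
As an alternative to string replacement like the one in the comment above, here is a minimal sketch that lets DOMDocument add its wrapper and then serializes only the children of the body element. It assumes PHP 5.3.6 or later, where saveHTML() accepts a node argument, plus PHP 7 type hints; the function name replaceImageSourcesViaBody is hypothetical and not from the original post.

```php
<?php
// Hypothetical helper (not from the original post): rewrites <img> sources,
// then returns only the markup found inside the generated <body>.
function replaceImageSourcesViaBody(string $html, string $newSrc): string
{
    $doc = new DOMDocument();
    // Without flags, loadHTML() wraps the fragment in doctype/html/body.
    $doc->loadHTML($html);

    foreach ($doc->getElementsByTagName('img') as $img) {
        $img->setAttribute('src', $newSrc);
    }

    $body = $doc->getElementsByTagName('body')->item(0);
    if ($body === null) {
        return ''; // nothing was parsed into a body element
    }

    // Serialize each child of <body> individually, skipping the wrapper.
    $out = '';
    foreach ($body->childNodes as $child) {
        $out .= $doc->saveHTML($child);
    }
    return $out;
}

$text = '<p>Hi, this is a test, here is an image'
      . '<img src="http://mysite.com/beer.jpg" width="60" height="95" />'
      . ' Because I like Beer!</p>';

echo replaceImageSourcesViaBody($text, 'lalala'), PHP_EOL;
```

This variant does not depend on the exact text of the generated doctype, and it also copes with fragments that have more than one top-level element.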