2014-07-08 89 views
1

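preg_replace expression for any style attribute

I want to use preg_replace to remove the style attribute, and everything it contains, from any tag. For example: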

<img src="image.jpg" style="float:left;" /> 

would be changed to:

<img src="image.jpg" /> 

Likewise:

<a href="link.html" style="color:#FF0000;" class="someclass">Link</a> 

would be changed to:

<a href="link.html" class="someclass">Link</a> 

How would I write this regular expression?

preg_replace('EXPRESSION', '', $string); 

Answers

1

This should work:

preg_replace("@(<[^<>]+)\sstyle\=[\"\'][^\"\']+[\"\']([^<>]+>)@i", '$1$2', $string); 
+0

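Thanks. Much appreciated! – JROB

+0

You should add \s? for the optional leading space right before style, otherwise you leave a double space :-p. Also, your second one drops the "Link" text, because there is no capture group and you are matching from the tag's opening <. – ArtisticPhoenix

+0

@ArtisiticPhoenix Agreed. Updated. –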
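
The double-space point raised in the comments can be seen with a small sketch. The two patterns below are simplified, hypothetical variants written only for this illustration (they are not the answer's exact expression): the first removes the attribute alone, the second also consumes the whitespace in front of it.

```php
<?php
$tag = '<a href="link.html" style="color:#FF0000;" class="someclass">Link</a>';

// Simplified pattern (assumption, for illustration only): removes the
// attribute but keeps the space in front of it, so two spaces remain.
$loose = preg_replace('/style="[^"]*"/i', '', $tag);

// Same pattern, but the leading whitespace is part of the match.
$tight = preg_replace('/\s+style="[^"]*"/i', '', $tag);

echo $loose, "\n";
echo $tight, "\n";
```

Requiring whitespace in front of style has a side benefit: attributes such as data-style no longer match, because the character before style is a hyphen there.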

1

Find a style="..." attribute enclosed inside <...> and replace the match with the captured groups $1$2:

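(<[^>]*?)\s+style="[^"]*"([^>]*>) 

Online Demo


Here is working sample code

Sample code:

<?php 
    // [^>]*? keeps the match inside a single tag; \s+ consumes the space before style
    $re = "/(<[^>]*?)\\s+style=\"[^\"]*\"([^>]*>)/"; 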
    $str = "<img src=\"image.jpg\" style=\"float:left;\" />\n\n<a href=\"link.html\" style=\"color:#FF0000;\" class=\"someclass\">Link</a>"; 
    $subst = '$1$2'; 

    $result = preg_replace($re, $subst, $str); 
    print $result; 
?> 

Output:

<img src="image.jpg" /> 

<a href="link.html" class="someclass">Link</a> 
0

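This is the best I could come up with:

// Single-quoted so that \1 reaches PCRE as a backreference; in a double-quoted PHP string "\1" is an octal escape (byte 0x01).
$re = '/\sstyle\=(\'|").*?(?<!\\\\)\1/i'; 
$str = "<a href=\"link.html\" style=\"color:#FF0000;\" class=\"someclass\">Link</a>"; 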
$subst = ''; 

$result = preg_replace($re, $subst, $str, 1); 

Output:

<a href="link.html" class="someclass">Link</a> 

Demo:

http://regex101.com/r/uW2kB8/8

Explanation:

\s match any white space character [\r\n\t\f ] 
style matches the characters style literally (case insensitive) 
\= matches the character = literally 
1st Capturing group ('|") 
    1st Alternative: ' 
     ' matches the character ' literally 
    2nd Alternative: " 
     " matches the character " literally 
.*? matches any character (except newline) 
    Quantifier: Between zero and unlimited times, as few times as possible, expanding as needed [lazy] 
(?<!\\) Negative Lookbehind - Assert that it is impossible to match the regex below 
    \\ matches the character \ literally 
\1 matches the same text as most recently matched by the 1st capturing group 
i modifier: insensitive. Case insensitive match (ignores case of [a-zA-Z]) 

It will even handle cases like these:

<a href="link.html" style="background-image:url(\"..\somimage.png\");" class="someclass">Link</a> 

<a href="link.html" style="background-image:url('..\somimage.png');" class="someclass">Link</a> 

and (this one it will not remove):

<a href="link.html" data-style="background-image:url('..\somimage.png');" class="someclass">Link</a> 

and even:

<a href='link.html' style='color:#FF0000;' class='someclass'>Link</a> 

http://regex101.com/r/uW2kB8/11

Unlike the other suggestions :)
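
The cases listed above can be checked with a short sketch. The pattern is the one from this answer, written as a single-quoted PHP string so that \1 is passed to PCRE as a backreference. The helper name remove_style and the shortened data-style value are assumptions made for this illustration only.

```php
<?php
// Pattern from the answer. Single quotes keep \1 as a backreference, and
// \\\\ reaches PCRE as \\ (one literal backslash inside the lookbehind).
$re = '/\sstyle\=(\'|").*?(?<!\\\\)\1/i';

// Hypothetical helper name, used only for this illustration.
function remove_style(string $re, string $html): string
{
    return (string) preg_replace($re, '', $html);
}

// Nowdoc: backslashes are taken literally, no PHP escaping applies.
$escaped = <<<'HTML'
<a href="link.html" style="background-image:url(\"..\somimage.png\");" class="someclass">Link</a>
HTML;

// data-style is not preceded by whitespace directly before "style".
$dataStyle = '<a href="link.html" data-style="color:red;" class="someclass">Link</a>';

// Attributes delimited by single quotes.
$singleQuoted = "<a href='link.html' style='color:#FF0000;' class='someclass'>Link</a>";

echo remove_style($re, $escaped), "\n";
echo remove_style($re, $dataStyle), "\n";
echo remove_style($re, $singleQuoted), "\n";
```

The negative lookbehind only skips quotes that follow a backslash. HTML itself does not treat a backslash as an escape character, so this is a convenience for input that happens to be written that way, not a general parsing guarantee.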

3

I would suggest using the right tool for the job and avoiding a regular expression.

$dom = new DOMDocument; 
$dom->loadHTML($html); 

$xpath = new DOMXPath($dom); 

foreach ($xpath->query('//*[@style]') as $node) { 
    $node->removeAttribute('style'); 
} 

echo $dom->saveHTML(); 

Working Demo
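
When the input is a fragment and not a full page, saveHTML() on the whole document also emits the doctype and the html/body wrapper that loadHTML() adds. A minimal sketch of one way around that, assuming a hypothetical helper named strip_style_attributes (the name and the fragment handling are not part of the answer), is to serialize only the children of body:

```php
<?php
// Hypothetical helper (not from the answer): remove every style attribute
// from an HTML fragment and return the fragment without the wrapper
// elements that loadHTML() adds around it.
function strip_style_attributes(string $html): string
{
    if (trim($html) === '') {
        return $html; // nothing to parse
    }

    // Collect parser warnings internally instead of printing them.
    $previous = libxml_use_internal_errors(true);
    $dom = new DOMDocument;
    $dom->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    // Same XPath query as in the answer: every element with a style attribute.
    $xpath = new DOMXPath($dom);
    foreach ($xpath->query('//*[@style]') as $node) {
        $node->removeAttribute('style');
    }

    $body = $dom->getElementsByTagName('body')->item(0);
    if ($body === null) {
        return '';
    }

    // Serialize each child of <body> separately to drop the wrapper.
    $out = '';
    foreach ($body->childNodes as $child) {
        $out .= $dom->saveHTML($child);
    }
    return $out;
}

echo strip_style_attributes(
    '<a href="link.html" style="color:#FF0000;" class="someclass">Link</a>'
), "\n";
```

The parser normalizes the markup it reads, so the returned string may differ from the input in details other than the removed attribute (for example, a self-closing img tag is written back without the slash).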

If you must use a regular expression for this task, the following will suffice.

$html = preg_replace('/<[^>]*\Kstyle="[^"]*"\s*/i', '', $html); 

Explanation

<   # '<' 
[^>]*  # any character except: '>' (0 or more times) 
\K   # resets the starting point of the reported match 
style=" # 'style="' 
    [^"]*  # any character except: '"' (0 or more times) 
    "   # '"' 
\s*   # whitespace (\n, \r, \t, \f, and " ") (0 or more times) 

Working Demo
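
To make the role of \K concrete, here is a small sketch comparing the pattern above with a capture-group version that puts the tag prefix back by hand. The variable names are chosen for this illustration only.

```php
<?php
$html = '<img src="image.jpg" style="float:left;" />';

// With \K, everything matched before it is dropped from the reported match,
// so only the style attribute and the trailing whitespace are replaced.
$withK = preg_replace('/<[^>]*\Kstyle="[^"]*"\s*/i', '', $html);

// Without \K, the prefix has to be captured and re-inserted as $1.
$withGroup = preg_replace('/(<[^>]*)style="[^"]*"\s*/i', '$1', $html);

var_dump($withK === $withGroup);
echo $withK, "\n";
```

Both calls give the same string for this input. The \K form simply avoids the capture group and the $1 in the replacement.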

+0

+1 for providing both a DOM solution and an efficient regex! :) – zx81