How do I strip a tag and its contents using a regular expression?

$str = 'some text <MY_TAG> tag <em>contents </em> </MY_TAG> more text';

My questions are: how do I retrieve the content tag <em>contents </em> that sits between <MY_TAG> .. </MY_TAG>?

And

how do I strip <MY_TAG> and its contents from $str?

I am using PHP.

Thanks.

2010-03-04 user187580

I don't know how many times a day the following answer gets linked: http://stackoverflow.com/questions/1732348/regex-match-open-tags-except-xhtml-self-contained-tags/1732454#1732454 – Nicole 2010-03-04 18:22:18

HTML parser, blah blah blah... you know the drill. – 2010-03-04 18:22:19

If MY_TAG cannot be nested, try this to get the matches:

preg_match_all('/<MY_TAG>(.*?)<\/MY_TAG>/s', $str, $matches)

To remove them, use preg_replace instead.
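
As a rough sketch of both steps (not necessarily the exact code the answer had in mind), the same pattern can be used to capture the inner content and to strip the tag. The sample string is assumed from the question, and $inner and $stripped are illustrative names:

```php
<?php
// Sample string assumed from the question.
$str = 'some text <MY_TAG> tag <em>contents </em> </MY_TAG> more text';

// Capture what sits between the tags; group 1 holds the inner content.
preg_match_all('/<MY_TAG>(.*?)<\/MY_TAG>/s', $str, $matches);
$inner = $matches[1];

// The same pattern with preg_replace removes the tag together with its content.
$stripped = preg_replace('/<MY_TAG>.*?<\/MY_TAG>/s', '', $str);

var_dump($inner, $stripped);
```

Note that the spaces on both sides of the removed tag stay in place, so $stripped contains two consecutive spaces where the tag used to be.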

2010-03-04 18:22:26 Gumbo

Hi, what's the /s for? Thanks for your answer. – user187580 2010-03-04 18:27:24

@user187580: The *s* flag makes '.' match line breaks. See http://php.net/manual/en/reference.pcre.pattern.modifiers.php – Gumbo 2010-03-04 18:33:34
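
To illustrate the *s* flag, here is a small sketch using a made-up two-line input (the string and variable names are assumptions for the example):

```php
<?php
// Hypothetical input where the tag content spans two lines.
$str = "before <MY_TAG>line one\nline two</MY_TAG> after";

// Without /s the dot does not match "\n", so the pattern finds nothing.
$withoutS = preg_match('/<MY_TAG>(.*?)<\/MY_TAG>/', $str, $m1);

// With /s the dot also matches "\n", so the whole content is captured.
$withS = preg_match('/<MY_TAG>(.*?)<\/MY_TAG>/s', $str, $m2);

var_dump($withoutS, $withS, $m2[1] ?? null);
```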

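If this tag occurs more than once in the string, you had better make the pattern ungreedy. Otherwise you will find that you turn a string like "This <MY_TAG>is a</MY_TAG> very important <MY_TAG>set </MY_TAG>line" into "This line". – Don 2016-01-18 18:44:39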
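
The greediness point can be checked with a short sketch. The input string below is made up for the example, and $greedy and $lazy are illustrative names:

```php
<?php
// Hypothetical input containing the tag twice.
$str = 'This <MY_TAG>is a</MY_TAG> very important <MY_TAG>set </MY_TAG>line';

// Greedy: .* runs from the first opening tag to the LAST closing tag,
// so the text between the two tags is lost as well.
$greedy = preg_replace('/<MY_TAG>.*<\/MY_TAG>/s', '', $str);

// Lazy: .*? stops at the nearest closing tag, so each tag is removed separately.
$lazy = preg_replace('/<MY_TAG>.*?<\/MY_TAG>/s', '', $str);

var_dump($greedy, $lazy);
```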

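Although the only fully correct way to do this is not to use regular expressions, you can get what you want if you accept that it won't handle all the special cases: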

preg_match('/<em[^>]*?>.*?<\/em>/i', $str, $match);
// Use this only if you aren't worried about nested tags.
// It will handle tags with attributes.
// The slash in the closing tag is escaped because / is the delimiter.

And

$str = preg_replace('/<MY_TAG[^>]*?>.*?<\/MY_TAG>/i', '', $str);
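
A quick sketch of what that caveat means in practice, using two made-up inputs (the strings and variable names are assumptions): attributes are handled, but a nested tag leaves debris behind.

```php
<?php
$pattern = '/<MY_TAG[^>]*?>.*?<\/MY_TAG>/i';

// Hypothetical input: a tag carrying an attribute is removed as expected.
$withAttr = 'a <MY_TAG class="x">gone</MY_TAG> b';
$r1 = preg_replace($pattern, '', $withAttr);

// Hypothetical nested input: the lazy match ends at the FIRST closing tag,
// so the rest of the outer element and its closing tag are left over.
$nested = 'a <MY_TAG>outer <MY_TAG>inner</MY_TAG> tail</MY_TAG> b';
$r2 = preg_replace($pattern, '', $nested);

var_dump($r1, $r2);
```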

2010-03-04 18:29:49 Nicole

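You don't want to use regular expressions for this. A better solution is to load the content into a DOMDocument and work on it using the DOM tree and the standard DOM methods: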

$document = new DOMDocument();
$document->loadXML('<root/>');

// Wrap the text in a fragment; appendXML() expects well-formed XML.
$fragment = $document->createDocumentFragment();
$fragment->appendXML($myTextWithTags);
$document->documentElement->appendChild($fragment);

$MY_TAGs = $document->getElementsByTagName('MY_TAG');
foreach ($MY_TAGs as $MY_TAG)
{
    $xmlContent = $document->saveXML($MY_TAG);
    /* work on $xmlContent here */

    /* as a further example: */
    $ems = $MY_TAG->getElementsByTagName('em');
    foreach ($ems as $em)
    {
        $emphasizedText = $em->nodeValue;
        /* do your operations here */
    }
}
```
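
The code above only reads the tag. For the stripping half of the question, a minimal sketch along the same DOM lines could look like this. It assumes the input is a well-formed XML fragment, and the sample string and the $nodes and $result names are illustrative:

```php
<?php
// Sample string assumed from the question; it must be well-formed XML.
$myTextWithTags = 'some text <MY_TAG> tag <em>contents </em> </MY_TAG> more text';

$document = new DOMDocument();
$document->loadXML('<root/>');
$fragment = $document->createDocumentFragment();
$fragment->appendXML($myTextWithTags);
$document->documentElement->appendChild($fragment);

// Copy the live node list into an array first: removing nodes while
// iterating a live DOMNodeList can skip elements.
$nodes = iterator_to_array($document->getElementsByTagName('MY_TAG'));
foreach ($nodes as $node) {
    if ($node->parentNode !== null) {
        $node->parentNode->removeChild($node);
    }
}

// Serialize the remaining children of <root> back into a string.
$result = '';
foreach ($document->documentElement->childNodes as $child) {
    $result .= $document->saveXML($child);
}

var_dump($result);
```

Unlike the regex versions, this removes a nested MY_TAG together with its outer element, at the cost of requiring parseable input.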

2010-03-04 23:00:02 Kris

For the stripping, I ended up just using this:

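// Under the U modifier quantifiers are lazy by default, and a trailing ?
// would make them greedy again, so the group is written without it.
$str = preg_replace('~<MY_TAG(.*)</MY_TAG>~Usi', "", $str);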

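Using ~ instead of / as the delimiter got rid of the errors caused by the slash in the closing tag, which seemed to be a problem even when escaped. Leaving the > out of the opening tag allows for attributes or other characters and still captures the tag and all of its contents.

This only works where nesting is not a concern.

The Usi modifiers mean U = ungreedy, s = the dot also matches line breaks, i = case-insensitive.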
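
One detail worth checking against the PCRE modifier documentation: U inverts greediness, so under U a plain .* is lazy while .*? becomes greedy. A small sketch with a made-up input (the string and variable names are assumptions) shows the difference:

```php
<?php
// Hypothetical input with two tags in different letter case.
$str = 'a <MY_TAG>x</MY_TAG> keep <my_tag id="2">y</my_tag> b';

// Under U, a plain .* is lazy: each tag is removed on its own.
$lazyUnderU = preg_replace('~<MY_TAG.*</MY_TAG>~Usi', '', $str);

// Under U, .*? is greedy: everything from the first opening tag
// to the last closing tag is removed, including " keep ".
$greedyUnderU = preg_replace('~<MY_TAG.*?</MY_TAG>~Usi', '', $str);

var_dump($lazyUnderU, $greedyUnderU);
```

With a single tag in the string both variants give the same result, which is probably why the difference is easy to miss.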

2013-08-20 17:39:37 squarecandy

Nice work (y), works fine. E.g. $ptitle = preg_replace('~~Usi', "", $ptitleWithSpan); – 2017-01-05 16:51:49
