PHP regex to remove the last paragraph (with attributes) and its contents

My question is similar to this question asked on Stack Overflow, but with one difference.

I have the following stored in a MySQL table:

<p align="justify">First paragraph</p> 
<p>Second paragraph</p> 
<p>Third paragraph</p> 
<div class="item"> 
<p>Some paragraph here</p> 
<p><strong><u>Specs</u>:</strong><br /><br /><strong>Weight:</strong> 10kg<br /><br /><strong>LxWxH:</strong> 5mx1mx40cm</p>
<p align="justify">second last para</p> 
<p align="justify">This is the paragraph I am trying to remove with regex.</p> 
</div>

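I am trying to remove the last paragraph tag and its contents from every row in the table. The top answer on the linked question suggests the following regex: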

preg_replace('~(.*)<p>.*?</p>~', '$1', $html)

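The difference from the linked question is that my last paragraph tag may or may not have the attribute align="justify". If the last paragraph has this attribute, the suggested solution instead removes the last paragraph that has no attributes. So I am looking for a way to remove the last paragraph regardless of whether it has attributes.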
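To make the failure concrete, here is a minimal sketch. The one-line `$html` string is made up for illustration and is not the real table data.

```php
<?php
// Illustrative input (an assumption): the last paragraph carries an
// attribute, the paragraph before it does not.
$html = '<p>keep me</p><p>plain paragraph</p><p align="justify">last paragraph</p>';

// Pattern from the linked answer: it only recognises a bare <p> tag.
$result = preg_replace('~(.*)<p>.*?</p>~', '$1', $html);

// The attribute-less "plain paragraph" is removed, while the real
// last paragraph (the one with align="justify") is left in place.
echo $result, PHP_EOL;
```

Because the pattern looks for the literal `<p>`, the greedy `(.*)` backtracks past the attributed paragraph until it finds a bare one.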

Asked by Dr. Atul Tiwari, 2016-01-02

Possible duplicate of [RegEx match open tags except XHTML self-contained tags](http://stackoverflow.com/questions/1732348/regex-match-open-tags-except-xhtml-self-contained-tags)

@LucasTrzesniewski Thanks for the link. I don't fully understand it, but I have bookmarked it.

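The link basically says you should use the right tool for the job. What you need here is an HTML parser / DOM manipulation library. Regex is brittle for this; with DOM (or XPath or CSS selectors) it can be done better and more easily.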
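For what the DOM route suggested in that comment might look like, here is a rough sketch using PHP's bundled DOMDocument and DOMXPath classes. The helper name `removeLastParagraph()` and the `fragment-root` wrapper id are made up for this example; nobody in the thread posted this code.

```php
<?php
// Hypothetical helper (name is an assumption, not from the thread).
function removeLastParagraph(string $html): string
{
    $doc = new DOMDocument();
    $previous = libxml_use_internal_errors(true); // silence warnings on sloppy markup
    // Wrap the fragment in a known <div> so it can be located again later;
    // the XML prolog is a common trick to hint that the input is UTF-8.
    $doc->loadHTML('<?xml encoding="UTF-8"><div id="fragment-root">' . $html . '</div>');
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($doc);
    $paragraphs = $xpath->query('//p'); // every <p>, in document order
    if ($paragraphs->length > 0) {
        $last = $paragraphs->item($paragraphs->length - 1);
        $last->parentNode->removeChild($last);
    }

    // Serialize only the children of the wrapper, not the wrapper itself.
    $root = $xpath->query('//div[@id="fragment-root"]')->item(0);
    $out = '';
    foreach ($root->childNodes as $child) {
        $out .= $doc->saveHTML($child);
    }
    return $out;
}

echo removeLastParagraph(
    '<p>one</p><div class="item"><p>two</p><p align="justify">three</p></div>'
), PHP_EOL;
```

Attributes on the `<p>` tag make no difference here, since the query selects elements by name. Note that the parser may normalise the markup slightly, so the output is not guaranteed to be byte-identical to the input outside the removed paragraph.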

Change the regex to:

preg_replace('~(.*)<p[^>]*>.*</p>\R?~s', '$1', $html)
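A quick way to check this, as a sketch: the `$html` value below is a shortened, made-up version of the markup in the question, not the actual row contents.

```php
<?php
// Shortened sample modelled on the question's markup (an assumption).
$html = '<p>First</p>' . "\n"
      . '<div class="item">' . "\n"
      . '<p align="justify">second last para</p>' . "\n"
      . '<p align="justify">remove me</p>' . "\n"
      . '</div>';

// [^>]* lets the opening tag carry any attributes; the 's' flag lets
// '.' cross line breaks so the greedy (.*) can span the whole string.
$result = preg_replace('~(.*)<p[^>]*>.*</p>\R?~s', '$1', $html);

echo $result, PHP_EOL;
```

The last paragraph and the newline after it are dropped, and the closing `</div>` is kept. One caveat: `<p[^>]*>` also matches tags whose names merely start with "p" (such as `<pre>`), so `<p\b[^>]*>` may be a safer variation if such tags can appear.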

Regex101 Demo

Regex breakdown

~               # Opening regex delimiter
    (.*)        # Greedily capture everything up to the last '<p ...>' tag
                # (it first matches to the end, then backtracks)
    <p[^>]*>    # Match an opening '<p>' tag with any attributes;
                # the characters after '<p' must not include '>'
    .*          # Match any characters up to the closing '</p>' tag
    </p>        # Match the literal '</p>'
    \R?         # Match (and so remove) one optional newline (\r\n, \r or \n)
~s              # Closing regex delimiter with the 's' (DOTALL) flag
                # (with 's', '.' also matches newlines)
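The breakdown above maps almost directly onto PCRE's extended (`x`) mode, where whitespace in the pattern is ignored and `#` starts a comment. A small sketch, again with a made-up `$html` sample; treat it as an illustration rather than a drop-in replacement:

```php
<?php
// Same pattern as in the answer, written with the 'x' flag so the
// explanation can live inside the regex itself.
$pattern = <<<'REGEX'
~
    (.*)        # greedy: everything up to the last <p ...> opening tag
    <p[^>]*>    # an opening <p> tag, with or without attributes
    .*          # the paragraph contents
    </p>        # the literal closing tag
    \R?         # one optional trailing newline
~sx
REGEX;

// Made-up sample input (an assumption, not the real table data).
$html = "<p>first</p>\n<p align=\"justify\">last</p>\n</div>";

$commented = preg_replace($pattern, '$1', $html);
$compact   = preg_replace('~(.*)<p[^>]*>.*</p>\R?~s', '$1', $html);

var_dump($commented === $compact); // both forms should behave the same
```

The comments inside the pattern must not contain the delimiter character (`~` here), since PHP would treat it as the end of the pattern.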

Answered 2016-01-02 14:04:03

Thanks. It works. I think you need to edit the answer and remove this text from the regex => **strong text**

@Dr.AtulTiwari: Thanks, strangely that happened when I pasted something!
