How to remove a "text" tag with Beautiful Soup

Please tell me how to remove the text tag from HTML like the following while leaving its child elements in place.

<text _ngcontent-c0="" _nghost-c2=""> 
    <p>sample text</p> 
</text> 
<image> 
    <figure> 
     <img alt="" src="xxxxx.jpg"/> 
    </figure> 
</image>

I want to convert it to the following:

<p>sample text</p> 
<image> 
    <figure> 
     <img alt="" src="xxxxx.jpg"/> 
    </figure> 
</image>

I tried the following, but got the error 'str' object has no attribute 'unwrap'.

from bs4 import BeautifulSoup

content = '''<text _ngcontent-c0="" _nghost-c2="">
      <p>sample text</p>
      </text>
      <image>
      <figure>
       <img alt="" src="xxxxx.jpg"/>
      </figure>
      </image>'''

while (content.text):
    content.text.unwrap()

2017-05-24 xKxAxKx

You can get the "unwrapped" elements like this:

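from bs4 import BeautifulSoup

content = '<text _ngcontent-c0="" _nghost-c2=""><p>sample text</p></text><image><figure><img alt="" src="xxxxx.jpg"/></figure></image>'

# Parse the string first; naming the parser avoids bs4's "no parser was explicitly specified" warning
soup = BeautifulSoup(content, 'html.parser')
for p in soup.find_all('p'):
    p.parent.unwrap()  # drop the <text> wrapper around <p>, keeping <p> itself
    print(p.parent)  # prints <p>sample text</p><image><figure><img alt="" src="xxxxx.jpg"/></figure></image>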

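From the code you provided, it looks as though you are not using BeautifulSoup at all and are instead calling the unwrap method on a plain string, hence the error you mention.
If you are using BeautifulSoup, please provide the rest of the code you use to parse the HTML.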

2017-05-24 10:09:53 errata

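Sorry, my description was lacking. I want to know how to handle the case where the content contains other elements as well. I have updated my question. – xKxAxKx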

My example also works for your updated case. It should return '<p>sample text</p>'. I have updated my answer to clarify the problem you ran into. – errata
