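Separating an HTML-coded string from a normal string
I want to split a string that contains normal text as well as HTML code into an array of strings. I searched Google but did not find any suitable suggestion.
Consider the following string: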
blahblahblahblahblahblahblahblahblahblah
blahblahfirst parablahblahblahblah
blahblahblahblahblahblahblahblahblahblah<html> <body> <p>hello</p> </body> </html>
blahblahblahblahblahblahblahblahblahblah
blahblahsecond parablahblahblahblahblah
blahblahblahblahblahblahblahblahblahblah
It should become:
s[0]=whole first para
s[1]=html code
s[2]=whole second para
Is this possible with jsoup? Or do I need some other API?
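For reference, here is a minimal sketch of the split I am after, using only java.util.regex rather than jsoup. It assumes the HTML part is one contiguous fragment that runs from the first tag-like token to the last one. The class name HtmlSplitter and the method split are hypothetical, and the tag pattern is a rough heuristic, not a real HTML parser.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: splits text into { before, html, after }.
final class HtmlSplitter {
    // Heuristic: anything shaped like <name ...> or </name> counts as a tag.
    private static final Pattern TAG = Pattern.compile("</?[A-Za-z][^<>]*>");

    static String[] split(String input) {
        Matcher m = TAG.matcher(input);
        if (!m.find()) {
            // No tag-like token at all: everything is plain text.
            return new String[] { input.trim(), "", "" };
        }
        int start = m.start(); // where the first tag begins
        int end = m.end();     // where the most recently seen tag ends
        while (m.find()) {
            end = m.end();
        }
        return new String[] {
            input.substring(0, start).trim(), // s[0]: text before the HTML
            input.substring(start, end),      // s[1]: the HTML fragment
            input.substring(end).trim()       // s[2]: text after the HTML
        };
    }
}
```

Calling HtmlSplitter.split on the example string would give the first paragraph in s[0], the markup from <html> to </html> in s[1], and the second paragraph in s[2]. Because it keys on any tag-like token, it would also cover strings that contain only a body tag or some other tag, but it would break if plain text appears between two separate HTML fragments.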
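Can't you simply search for the <html> and </html> tags? – Floris
My string does not always contain the html tag; it may contain only a body tag or any other HTML tag. –
Is there a good reason for having a string structured like the one in your example? – KarelG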