Parsing an email with Python/Jython

I have parsed an email message with Jython to get the message body. Now that I have the body, I want to extract some text from it.

These lines are found in the body:

[type]: mail 
[category]: Values 
[service]: testing 
[description]: Testing out automapping of email 
Line break Testing out automapping of email 
Line break Testing out automapping of email

Now I want to extract everything that comes after [description]:. Is that possible? I tried this:

desc = '[description]:' 
res = findall("{}.*".format(desc), body)[0]

2015-06-02 user2023042

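You said the body contains HTML **and** text. Where is the HTML? – Squall

Sorry, I have updated the question. – user2023042

OK, using this: res = findall("%s.*" % '[description]:', body) I only get one line. How do I include all lines of the text? – user2023042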
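Getting a single line back is expected here: by default `.` does not match a newline, so `.*` stops at the end of the first line. Note also that `[` and `]` are special characters in a regex, so the label has to be escaped. A minimal sketch of one way around both issues, assuming the description runs to the end of the body (the `body` string is a made-up sample modeled on the question):

```python
import re

# Hypothetical sample body, modeled on the lines shown in the question.
body = (
    "[type]: mail\n"
    "[category]: Values\n"
    "[service]: testing\n"
    "[description]: Testing out automapping of email\n"
    "Line break Testing out automapping of email\n"
    "Line break Testing out automapping of email"
)

desc = "[description]:"

# Without re.DOTALL, "." stops at the first newline, so only one line matches.
one_line = re.findall(re.escape(desc) + ".*", body)[0]

# With re.DOTALL, "." also matches newlines, so the match runs to the end of body.
all_lines = re.findall(re.escape(desc) + ".*", body, re.DOTALL)[0]

print(one_line)
print(all_lines)
```

With `re.DOTALL` the match continues to the very end of the body, so this only fits if nothing else follows the description.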

A regular expression is one possible solution, but consider @StefanNch's suggestion:

\[description\]:((?:.+\n?)*)

import re

# Capture everything after "[description]:", line by line, up to a blank line or the end of the text.
p = re.compile(r'\[description\]:((?:.+\n?)*)')
test_str = u" [type]: mail\n [category]: Values\n [service]: testing\n [description]: Testing out automapping of email\n Line break Testing out automapping of email\n Line break Testing out automapping of email"

match = p.search(test_str)
if match:
    # group(1) holds only the text after the label
    print(match.group(1).strip())
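To make the pattern easier to read, here is a sketch of the same expression written with `re.VERBOSE`, so each piece can carry a comment. The sample `text` and its `[other]` label are made up for illustration and are not from the question:

```python
import re

# Same pattern as above, spelled out with re.VERBOSE.
pattern = re.compile(r"""
    \[description\]:   # the literal label; the brackets are escaped
    (                  # group 1: the text we want to keep
        (?:            #   non-capturing group, one line per repetition
            .+         #   at least one character on the line
            \n?        #   followed by an optional newline
        )*             #   repeated for as many non-empty lines as follow
    )
""", re.VERBOSE)

# Hypothetical input: a blank line separates the description from what follows.
text = "[description]: first line\nsecond line\n\n[other]: not part of it"

match = pattern.search(text)
captured = match.group(1)
print(repr(captured))
```

Because `.+` needs at least one character, the capture ends at the first empty line. If another `[label]:` line follows the description directly, with no blank line in between, it would be captured too.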

2015-06-02 17:11:33

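Hi, thanks for the suggestion, but due to product limitations I cannot use BeautifulSoup. OK, printing it outputs a true value. How do I print the text that follows [description]:? – user2023042

print re.findall(p, text) prints the result I want, but it also prints [description]: when it should only print the value after it. Any clue? – user2023042

I updated the answer to use a capturing group, so it only captures what follows '[description]:' –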
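The difference comes from how `re.findall` treats groups: with no capturing group it returns the whole match, and with exactly one capturing group it returns only that group's text. A small sketch comparing the two, using a made-up `text` sample modeled on the question:

```python
import re

# Hypothetical sample body, modeled on the lines shown in the question.
text = (
    "[type]: mail\n"
    "[category]: Values\n"
    "[service]: testing\n"
    "[description]: Testing out automapping of email\n"
    "Line break Testing out automapping of email\n"
    "Line break Testing out automapping of email"
)

# No capturing group: findall returns the full match, label included.
with_label = re.findall(r"\[description\]:(?:.+\n?)*", text)

# One capturing group: findall returns only what the group matched.
value_only = re.findall(r"\[description\]:((?:.+\n?)*)", text)

print(with_label[0])
print(value_only[0].strip())
```

The `strip()` call just removes the space left after the colon.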
