2016-02-28

Parsing email content with the email module (Python)

Python version == 3.5

Code:

import getpass, poplib, email 
Mailbox = poplib.POP3_SSL('pop.googlemail.com', '995') 
Mailbox.user("[email protected]") 
Mailbox.pass_('password_here') 
numMessages = len(Mailbox.list()[1]) 
for i in range(numMessages): 
    info = b" ".join(Mailbox.retr(i+1)[1]) 
    msg = email.message_from_bytes(info) 
    print(msg.keys()) 

Output:

['MIME-Version'] 
['MIME-Version'] 
['MIME-Version'] 
['Delivered-To'] 
['Delivered-To'] 
['Delivered-To'] 
['Delivered-To'] 
['Delivered-To'] 
['Delivered-To'] 
['Delivered-To'] 
['Delivered-To'] 

I think the output is incorrect, because msg should have more fields than just "MIME-Version" and "Delivered-To".

email.message_from_bytes() parses a byte string. Isn't the content of msg a byte string?

The docs suggest this:

import getpass, poplib
M = poplib.POP3('localhost')
M.user(getpass.getuser())
M.pass_(getpass.getpass())
numMessages = len(M.list()[1])
for i in range(numMessages):
    for j in M.retr(i+1)[1]:
        print(j)

Is there a way to parse the returned messages with the email module, so that we can store email details such as the sender, body, headers, and so on?

Answer


The answer turned out to be quite easy:

import getpass, poplib, email 
Mailbox = poplib.POP3_SSL('pop.googlemail.com', '995') 
Mailbox.user("[email protected]") 
Mailbox.pass_('password_here') 
numMessages = len(Mailbox.list()[1]) 
for i in range(numMessages): 
    raw_email = b"\n".join(Mailbox.retr(i+1)[1]) 
    parsed_email = email.message_from_bytes(raw_email) 
    print(parsed_email.keys()) 

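Instead of joining the lines of raw_email with a space, join them with \n, and the email module parses the fields correctly. retr() returns the message as a list of lines with the line terminators stripped, so joining with a space collapses every header onto a single line and only the first one is recognized.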
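The difference can be reproduced without a mail server. The following is a minimal sketch, assuming a made-up list of lines in the shape that Mailbox.retr(i+1)[1] returns; the addresses and header values are hypothetical.

```python
import email

# Hypothetical stand-in for Mailbox.retr(i+1)[1]: a list of bytes lines
# with the line terminators already stripped.
lines = [
    b"Delivered-To: someone@example.com",
    b"MIME-Version: 1.0",
    b"From: Alice <alice@example.com>",
    b"Subject: Hello",
    b"",
    b"Message body.",
]

# Joining with a space puts the whole message on the first header line.
joined_with_space = email.message_from_bytes(b" ".join(lines))
# Joining with a newline restores one header per line.
joined_with_newline = email.message_from_bytes(b"\n".join(lines))

print(joined_with_space.keys())    # ['Delivered-To']
print(joined_with_newline.keys())  # ['Delivered-To', 'MIME-Version', 'From', 'Subject']
```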

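Another great thing about the email module is that the object returned by email.message_from_bytes() supports dict-style access to the headers (it is an email.message.Message, not an actual dict).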

So you access the fields like this:

raw_email = b"\n".join(Mailbox.retr(i+1)[1]) 
parsed_email = email.message_from_bytes(raw_email) 
print(parsed_email["header"]) 
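To store details such as the sender and the body, something along these lines may work. It is only a sketch: the raw message is a hypothetical sample, and first_text_body is a helper name made up for this example, not part of the email module.

```python
import email

# Hypothetical raw message standing in for b"\n".join(Mailbox.retr(i+1)[1]).
raw_email = b"\n".join([
    b"From: Alice <alice@example.com>",
    b"To: bob@example.com",
    b"Subject: Hello",
    b"Content-Type: text/plain; charset=utf-8",
    b"",
    b"Message body.",
])
parsed_email = email.message_from_bytes(raw_email)

def first_text_body(msg):
    """Return the first text/plain part as str, or None if there is none."""
    # walk() yields the message itself and, for multipart mail, every subpart.
    for part in msg.walk():
        if part.get_content_type() == "text/plain":
            payload = part.get_payload(decode=True)  # bytes, transfer encoding undone
            charset = part.get_content_charset() or "utf-8"
            return payload.decode(charset, errors="replace")
    return None

details = {
    "sender": parsed_email["From"],
    "subject": parsed_email["Subject"],
    "body": first_text_body(parsed_email),
}
print(details)
```

Header lookups are case-insensitive, so parsed_email["from"] returns the same value as parsed_email["From"].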

But what if the field does not exist?

raw_email = b"\n".join(Mailbox.retr(i+1)[1]) 
parsed_email = email.message_from_bytes(raw_email) 
print(parsed_email["non-existent field"]) 

The code above prints None instead of raising a KeyError.
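A short sketch of that behaviour, using a hypothetical two-line message. It also shows Message.get(), which takes a fallback value, and the in operator for checking whether a header is present.

```python
import email

# Hypothetical minimal message with a single header.
parsed_email = email.message_from_bytes(b"Subject: Hello\n\nMessage body.")

missing = parsed_email["X-Missing"]                    # no KeyError, just None
fallback = parsed_email.get("X-Missing", "(not set)")  # explicit default instead
has_subject = "Subject" in parsed_email                # membership test on headers

print(missing)      # None
print(fallback)     # (not set)
print(has_subject)  # True
```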