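How to extract the body from text files containing emails [Enron dataset]

I have the Enron email dataset set up as a folder containing emails in the form of text files, and I want to extract the "body" part of these emails.
The problem is that fields such as the sender's and recipient's addresses are introduced by labels like From: and To:, but the body does not start with any header; it simply begins after all the other fields have been specified.
Also, a single text file can contain many bodies (in the case of an email thread/conversation). I want to extract each body from these files. Can the JavaMail API be used for this, and if so, how? This is an offline dataset only: it exists as text files on my hard drive, not on the internet.
The files look like this: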
Message-ID: <[email protected]>
Date: Fri, 7 Dec 2001 10:06:42 -0800 (PST)
From: [email protected]
To: [email protected]
Subject: RE: West Position
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
X-From: Dunton, Heather </O=ENRON/OU=NA/CN=RECIPIENTS/CN=HDUNTON>
X-To: Allen, Phillip K. </O=ENRON/OU=NA/CN=RECIPIENTS/CN=Pallen>
X-cc:
X-bcc:
X-Folder: \Phillip_Allen_Jan2002_1\Allen, Phillip K.\Inbox
X-Origin: Allen-P
X-FileName: pallen (Non-Privileged).pst
Please let me know if you still need Curve Shift.
Thanks,
Heather
-----Original Message-----
From: \t Allen, Phillip K.
Sent: \t Friday, December 07, 2001 5:14 AM
To: \t Dunton, Heather
Subject: \t RE: West Position
Heather,
Did you attach the file to this email?
-----Original Message-----
From: \t Dunton, Heather
Sent: \t Wednesday, December 05, 2001 1:43 PM
To: \t Allen, Phillip K.; Belden, Tim
Subject: \t FW: West Position
Attached is the Delta position for 1/16, 1/30, 6/19, 7/13, 9/21
-----Original Message-----
From: \t Allen, Phillip K.
Sent: \t Wednesday, December 05, 2001 6:41 AM
To: \t Dunton, Heather
Subject: \t RE: West Position
Heather,
This is exactly what we need. Would it possible to add the prior day for each of the dates below to the pivot table. In order to validate the curve shift on the dates below we also need the prior days ending positions.
Thank you,
Phillip Allen
-----Original Message-----
From: \t Dunton, Heather
Sent: \t Tuesday, December 04, 2001 3:12 PM
To: \t Belden, Tim; Allen, Phillip K.
Cc: \t Driscoll, Michael M.
Subject: \t West Position
Attached is the Delta position for 1/18, 1/31, 6/20, 7/16, 9/24
<< File: west_delta_pos.xls >>
Let me know if you have any questions.
Heather
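As a rough illustration of the first step, the header block can be separated from everything after it without any mail library. This is only a sketch, and it assumes the files on disk follow the usual mail convention of an empty line between the last header and the body (the sample as pasted above does not show that empty line). The class name `HeaderBodySplitter` and its methods are made up for this example.

```java
final class HeaderBodySplitter {

    /** Returns the header block: everything before the first empty line. */
    static String headers(String raw) {
        // Normalise Windows line endings so one search string covers both cases.
        String text = raw.replace("\r\n", "\n");
        int blank = text.indexOf("\n\n");
        // Assumption: a file without an empty line consists of headers only.
        return blank < 0 ? text : text.substring(0, blank);
    }

    /** Returns everything after the first empty line, or "" if there is none. */
    static String body(String raw) {
        String text = raw.replace("\r\n", "\n");
        int blank = text.indexOf("\n\n");
        return blank < 0 ? "" : text.substring(blank + 2);
    }
}
```

The file contents could be loaded with something like `new String(Files.readAllBytes(Paths.get(path)), StandardCharsets.US_ASCII)`, where `path` is whatever location the dataset was unpacked to. The string returned by `body` still contains the whole thread, quoted messages included.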
Unable to comment on the post (requires 50 xp). – 2014-10-08 22:18:52
Sigh... sorry, I keep forgetting that rule... – ajb 2014-10-08 22:22:53
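I have provided a sample file; all I want is to get each "body" separately. (There are 5 bodies, since it is a conversation email.) I have used the JavaMail API with getContent() to extract the body, but it returns the entire body (everything from two lines after X-FileName to the end of the file). – Shady23 2014-10-09 13:33:16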
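One possible way to break that single string into the individual bodies is to split it on the `-----Original Message-----` lines seen in the sample, then drop the quoted `From:`/`Sent:`/`To:`/`Cc:`/`Subject:` lines at the top of each piece. The sketch below uses only the standard library and takes the text returned by getContent() as a plain `String`. The class name `ThreadBodySplitter` is invented for this example, and the separator and header names are assumptions taken from the one sample file; other messages in the dataset may quote replies differently.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

final class ThreadBodySplitter {

    // Separator line placed before each quoted message in the sample file.
    private static final Pattern SEPARATOR =
            Pattern.compile("(?m)^[ \\t]*-----Original Message-----[ \\t]*$");

    // Quoted header lines that follow a separator (assumed set, from the sample).
    private static final Pattern QUOTED_HEADER =
            Pattern.compile("^(From|Sent|To|Cc|Bcc|Subject):.*");

    /** Splits the full message content into one body per message in the thread. */
    static List<String> split(String content) {
        List<String> bodies = new ArrayList<>();
        for (String segment : SEPARATOR.split(content)) {
            bodies.add(stripQuotedHeaders(segment));
        }
        return bodies;
    }

    /** Removes leading blank lines and quoted header lines from one segment. */
    private static String stripQuotedHeaders(String segment) {
        String[] lines = segment.split("\\R");
        int i = 0;
        while (i < lines.length
                && (lines[i].trim().isEmpty()
                    || QUOTED_HEADER.matcher(lines[i].trim()).matches())) {
            i++;
        }
        StringBuilder body = new StringBuilder();
        for (; i < lines.length; i++) {
            body.append(lines[i]).append('\n');
        }
        return body.toString().trim();
    }
}
```

Calling `ThreadBodySplitter.split(text)` returns the newest message first, followed by the quoted ones in the order they appear in the file. A body whose own first line happens to start with one of the header names would lose that line, so this is a heuristic and not a real parser.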