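How to print a line of text from an HTML file
My goal is to use RegEx to scan an email for the word "trade" and then print the entire line of text in which it was found.
I have successfully used RegEx to capture other data from this HTML document (species, weight, price, and so on), and I can successfully identify the word "trade", but I have failed to print the whole line. I did try using BeautifulSoup for this, but I found that even more difficult.
Ideally, I would like to capture and print the two lines in which the word "trade" is found. Here is the code I am using to try to identify "trade" and print its line: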
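import re

with open(file_path, 'r') as f:
    email = f.read()
    pattern = re.search(r'\btrade\b', email).group(0)
    match = re.search(r'\btrade\b', email)
    if match:
        # email is a single str, so this loop yields one character
        # at a time rather than one line at a time
        for line in email:
            print("TRADE STUFF:", line)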
Please note that I have tried various approaches, such as print("TRADE STUFF:", line.splitlines())
as well as print("TRADE STUFF:", line.stripped_strings)
but without success.
Thank you for any help.
HTML code:
<html>
<head><meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
<title>FW: NEFS 5 Available Fish</title>
<link rel="important stylesheet" href="">
<style>div.headerdisplayname {font-weight:bold;}</style></head>
<body>
<table border=0 cellspacing=0 cellpadding=0 width="100%" class="header-part1"><tr><td><b>Subject: </b>FW: NEFS 5 Available Fish</td></tr><tr><td><b>From: </b>Claire Fitz-Gerald <[email protected]></td></tr><tr><td><b>Date: </b>9/5/2014 9:52 AM</td></tr></table><br>
<html xmlns:v="urn:schemas-microsoft-com:vml" xmlns:o="urn:schemas-microsoft-com:office:office" xmlns:w="urn:schemas-microsoft-com:office:word" xmlns:m="http://schemas.microsoft.com/office/2004/12/omml" xmlns="http://www.w3.org/TR/REC-html40"><head><META HTTP-EQUIV="Content-Type" CONTENT="text/html; "><meta name=Generator content="Microsoft Word 12 (filtered medium)"><!--[if !mso]><style>v\:* {behavior:url(#default#VML);}
o\:* {behavior:url(#default#VML);}
w\:* {behavior:url(#default#VML);}
.shape {behavior:url(#default#VML);}
</style><![endif]--><style><!--
/* Font Definitions */
@font-face
{font-family:Wingdings;
panose-1:5 0 0 0 0 0 0 0 0 0;}
@font-face
{font-family:"Cambria Math";
panose-1:2 4 5 3 5 4 6 3 2 4;}
@font-face
{font-family:Calibri;
panose-1:2 15 5 2 2 2 4 3 2 4;}
@font-face
{font-family:Tahoma;
panose-1:2 11 6 4 3 5 4 4 2 4;}
@font-face
{font-family:"Franklin Gothic Book";
panose-1:2 11 5 3 2 1 2 2 2 4;}
@font-face
{font-family:"Franklin Gothic Demi";
panose-1:2 11 7 3 2 1 2 2 2 4;}
/* Style Definitions */
p.MsoNormal, li.MsoNormal, div.MsoNormal
{margin:0in;
margin-bottom:.0001pt;
font-size:12.0pt;
font-family:"Times New Roman","serif";}
a:link, span.MsoHyperlink
{mso-style-priority:99;
color:blue;
text-decoration:underline;}
a:visited, span.MsoHyperlinkFollowed
{mso-style-priority:99;
color:purple;
text-decoration:underline;}
span.EmailStyle18
{mso-style-type:personal-reply;
font-family:"Calibri","sans-serif";
color:#1F497D;}
.MsoChpDefault
{mso-style-type:export-only;
font-size:10.0pt;}
@page WordSection1
{size:8.5in 11.0in;
margin:1.0in 1.0in 1.0in 1.0in;}
div.WordSection1
{page:WordSection1;}
/* List Definitions */
@list l0
{mso-list-id:1512259006;
mso-list-template-ids:-893643712;}
@list l0:level1
{mso-level-number-format:bullet;
mso-level-text:\F0B7;
mso-level-tab-stop:.5in;
mso-level-number-position:left;
text-indent:-.25in;
mso-ansi-font-size:10.0pt;
font-family:Symbol;}
ol
{margin-bottom:0in;}
ul
{margin-bottom:0in;}
--></style><!--[if gte mso 9]><xml>
<o:shapedefaults v:ext="edit" spidmax="1026" />
</xml><![endif]--><!--[if gte mso 9]><xml>
<o:shapelayout v:ext="edit">
<o:idmap v:ext="edit" data="1" />
</o:shapelayout></xml><![endif]--></head><body lang=EN-US link=blue vlink=purple><div class=WordSection1><p class=MsoNormal><span style='font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D'>Apologies for the delay in distributing this listing. It got lost in my inbox.<o:p></o:p></span></p><p class=MsoNormal><span style='font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D'><o:p> </o:p></span></p><p class=MsoNormal><span style='font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D'>Please see the below quota listings.<o:p></o:p></span></p><p class=MsoNormal><span style='font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D'><o:p> </o:p></span></p><p class=MsoNormal><span style='font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D'>Thanks,<o:p></o:p></span></p><p class=MsoNormal><span style='font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D'><o:p> </o:p></span></p><div><p class=MsoNormal><span style='font-family:"Franklin Gothic Book","sans-serif";color:#1F497D'>Claire Fitz-Gerald<o:p></o:p></span></p><p class=MsoNormal><i><span style='font-size:10.0pt;font-family:"Franklin Gothic Book","sans-serif";color:#1F497D'><o:p> </o:p></span></i></p><p class=MsoNormal><b><span style='font-size:11.0pt;font-family:"Franklin Gothic Demi","sans-serif";color:#002776'>Cape Cod Commercial Fishermen's Alliance<o:p></o:p></span></b></p><p class=MsoNormal><b><span style='font-size:11.0pt;font-family:"Franklin Gothic Book","sans-serif";color:#DE3500'>~ Small Boats. Big Ideas. 
~</span></b><b><span style='font-size:11.0pt;font-family:"Calibri","sans-serif";color:#DE3500'><o:p></o:p></span></b></p></div><p class=MsoNormal><span style='font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D'><o:p> </o:p></span></p><div><div style='border:none;border-top:solid #B5C4DF 1.0pt;padding:3.0pt 0in 0in 0in'><p class=MsoNormal><b><span style='font-size:10.0pt;font-family:"Tahoma","sans-serif"'>From:</span></b><span style='font-size:10.0pt;font-family:"Tahoma","sans-serif"'> NEFS V [mailto:[email protected]] <br><b>Sent:</b> Monday, September 01, 2014 8:46 PM<br><b>To:</b> mike walsh - 6; NEFS 11 & 12 - Josh Wiersma; NEFS 13 John Haran; NEFS 2 - Dave Leveille; NEFS 3 - Rob Banks; NEFS 6 & 10 Jim Reardon; NEFS 7 & 8 - Linda MaCann; NEFS 9 - Stephanie Rafael-DeMello; paula lynch - 10; Claire Fitz-Gerald; Sector - MCCS; Sector - NCCS; Sector - Sustainable Harvest; tory bramante- 6<br><b>Subject:</b> NEFS 5 Available Fish<o:p></o:p></span></p></div></div><p class=MsoNormal><o:p> </o:p></p><div><p class=MsoNormal>All,<br>NEFS 5 has the following fish available for lease/trade:<o:p></o:p></p></div><div><ul type=disc><li class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;mso-list:l0 level1 lfo1'><strong><span style='font-size:13.5pt'>GB EAST cod: 954 lbs @ $0.83</span></strong><o:p></o:p></li><li class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;mso-list:l0 level1 lfo1'><strong><span style='font-size:13.5pt'>GB EAST cod: 1,046 lbs to trade for 1,830 lbs GB WEST cod</span></strong><o:p></o:p></li><li class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;mso-list:l0 level1 lfo1'><strong><span style='font-size:13.5pt'>GB blackback: 30,000 lbs @ $0.07</span></strong><o:p></o:p></li><li class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;mso-list:l0 level1 lfo1'><strong><span style='font-size:13.5pt'>GOM blackback: 800 lbs @ $0.03</span></strong><o:p></o:p></li><li 
class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;mso-list:l0 level1 lfo1'><strong><span style='font-size:13.5pt'>white hake: 6,322 lbs @ $0.13</span></strong><o:p></o:p></li><li class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;mso-list:l0 level1 lfo1'><strong><span style='font-size:13.5pt'>pollock: 22,000 lbs @ $0.015</span></strong><o:p></o:p></li><li class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;mso-list:l0 level1 lfo1'><strong><span style='font-size:13.5pt'>redfish: 14,000 lbs @ $0.015</span></strong><o:p></o:p></li><li class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;mso-list:l0 level1 lfo1'><strong><span style='font-size:13.5pt'>GB yt: 1,873 lbs @ $1.13</span></strong><o:p></o:p></li><li class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;mso-list:l0 level1 lfo1'><strong><span style='font-size:13.5pt'>GB yt: 5,127 lbs to trade for 10,254 lbs SNE yt</span></strong><o:p></o:p></li></ul><div><p class=MsoNormal> <o:p></o:p></p></div><div><p class=MsoNormal>-- <o:p></o:p></p></div><div><p class=MsoNormal> <o:p></o:p></p></div></div><div><p class=MsoNormal>Daniel Salerno, NEFS 5<o:p></o:p></p></div><div><p class=MsoNormal>C/O NESTCo.<o:p></o:p></p></div><div><p class=MsoNormal>55 State Street<o:p></o:p></p></div><div><p class=MsoNormal>Narragansett, RI 02882<o:p></o:p></p></div><div><p class=MsoNormal>401-932-0070<o:p></o:p></p></div><div><p class=MsoNormal>401-633-6539 (fax)<o:p></o:p></p></div><div><p class=MsoNormal><a href="mailto:[email protected]" target="_blank">[email protected]</a><o:p></o:p></p></div><div class=MsoNormal align=center style='text-align:center'></body></html>
</body>
</html>
You should share the HTML file as well. –
Sorry, I always forget to add that; I'll add it now. – theprowler
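One possible way to get at the "line" containing "trade" is sketched below. It is an illustration rather than a definitive answer, and it assumes that a line means the text of one block-level element (such as a <li> or <p>), since the raw HTML packs the whole message body into a few very long source lines. The names TextLines, BLOCK_TAGS, SKIP_TAGS and lines_containing are made up for this example; only the standard-library modules re and html.parser are used.

```python
import re
from html.parser import HTMLParser

# Assumption: the start or end of any of these tags ends the current text line.
BLOCK_TAGS = {"p", "li", "div", "br", "tr", "td", "table", "ul", "ol", "title"}
# Text inside these tags is never shown to the reader, so it is ignored.
SKIP_TAGS = {"style", "script"}


class TextLines(HTMLParser):
    """Collect the visible text of an HTML document, one entry per block element."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.lines = []   # finished text lines
        self._buf = []    # text pieces of the line being built
        self._skip = 0    # > 0 while inside <style> or <script>

    def _flush(self):
        # Join the pieces and collapse runs of whitespace (including &nbsp;).
        text = " ".join("".join(self._buf).split())
        if text:
            self.lines.append(text)
        self._buf = []

    def handle_starttag(self, tag, attrs):
        if tag in SKIP_TAGS:
            self._skip += 1
        elif tag in BLOCK_TAGS:
            self._flush()

    def handle_endtag(self, tag):
        if tag in SKIP_TAGS:
            self._skip = max(0, self._skip - 1)
        elif tag in BLOCK_TAGS:
            self._flush()

    def handle_data(self, data):
        if not self._skip:
            self._buf.append(data)

    def close(self):
        super().close()
        self._flush()  # keep any text that follows the last block tag


def lines_containing(html, word="trade"):
    """Return every extracted text line that contains `word` as a whole word."""
    parser = TextLines()
    parser.feed(html)
    parser.close()
    pattern = re.compile(r"\b%s\b" % re.escape(word), re.IGNORECASE)
    return [line for line in parser.lines if pattern.search(line)]


# Small sample modeled on the list items in the email above.
sample_html = (
    "<p>NEFS 5 has the following fish available for lease/trade:</p>"
    "<ul><li><strong>GB EAST cod: 954 lbs @ $0.83</strong></li>"
    "<li><strong>GB EAST cod: 1,046 lbs to trade for 1,830 lbs GB WEST cod</strong></li>"
    "<li><strong>GB yt: 5,127 lbs to trade for 10,254 lbs SNE yt</strong></li></ul>"
)
for line in lines_containing(sample_html):
    print("TRADE STUFF:", line)
```

With the real file, the call would be lines_containing(email) after email = f.read(). Note that \btrade\b also matches the introductory sentence ending in "lease/trade:", so that sentence is returned alongside the two list items; passing a narrower pattern, or filtering for "to trade", would be one way to keep only the list items.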