2017-04-18

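Background

I currently have a console application that fetches emails from an Office 365 Outlook account, using the Outlook API 2.0.

Problem: removing inline CSS from an HTML string

I use the API to access the body of each email, but the body arrives as an HTML string. I run it through my go-to regex, which strips HTML tags, but Outlook adds a CSS block to its HTML, which basically makes my regex inadequate.

Code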

// Requires: using System.Text.RegularExpressions;
// Verbatim string (@"...") so the HTML can span lines; inner quotes are doubled.
string body = @"<html>
<head>
<meta http-equiv=""Content-Type"" content=""text/html; charset=utf-8"">
<meta content=""text/html; charset=us-ascii"">
<meta name=""Generator"" content=""Microsoft Word 15 (filtered medium)"">
<style>
<!--
@font-face
    {font-family:""Cambria Math""}
@font-face
    {font-family:Calibri}
p.MsoNormal, li.MsoNormal, div.MsoNormal
    {margin:0in;
    margin-bottom:.0001pt;
    font-size:11.0pt;
    font-family:""Calibri"",sans-serif}
a:link, span.MsoHyperlink
    {color:#0563C1;
    text-decoration:underline}
a:visited, span.MsoHyperlinkFollowed
    {color:#954F72;
    text-decoration:underline}
span.EmailStyle17
    {font-family:""Calibri"",sans-serif;
    color:windowtext}
.MsoChpDefault
    {font-family:""Calibri"",sans-serif}
@page WordSection1
    {margin:1.0in 1.0in 1.0in 1.0in}
div.WordSection1
    {}
-->
</style>
</head>
<body lang=""EN-US"" link=""#0563C1"" vlink=""#954F72"">
<div class=""WordSection1"">
<p class=""MsoNormal"">&nbsp;</p>
</div>
<hr>
<p><b>Confidentiality Notice:</b> This e-mail is intended only for the addressee named above. It contains information that is privileged, confidential or otherwise protected from use and disclosure. If you are not the intended recipient, you are hereby notified
that any review, disclosure, copying, or dissemination of this transmission, or taking of any action in reliance on its contents, or other use is strictly prohibited. If you have received this transmission in error, please reply to the sender listed above
immediately and permanently delete this message from your inbox. Thank you for your cooperation.</p>
</body>
</html>
";
// Strip tags. Without RegexOptions.Singleline, "." stops at line breaks,
// so the multi-line <!-- ... --> comment inside <style> is left behind.
string viewString1 = Regex.Replace(body, "<.*?>", string.Empty);
// Remove the non-breaking-space entities.
string viewString12 = viewString1.Replace("&nbsp;", string.Empty);
Result from my regex

<!-- 
@font-face 
    {font-family:"Cambria Math"} 
@font-face 
    {font-family:Calibri} 
p.MsoNormal, li.MsoNormal, div.MsoNormal 
    {margin:0in; 
    margin-bottom:.0001pt; 
    font-size:11.0pt; 
    font-family:"Calibri",sans-serif} 
a:link, span.MsoHyperlink 
    {color:#0563C1; 
    text-decoration:underline} 
a:visited, span.MsoHyperlinkFollowed 
    {color:#954F72; 
    text-decoration:underline} 
span.EmailStyle17 
    {font-family:"Calibri",sans-serif; 
    color:windowtext} 
.MsoChpDefault 
    {font-family:"Calibri",sans-serif} 
@page WordSection1 
    {margin:1.0in 1.0in 1.0in 1.0in} 
div.WordSection1 
    {} 
--> 







Confidentiality Notice: This e-mail is intended only for the addressee named above. It contains information that is privileged, confidential or otherwise protected from use and disclosure. If you are not the intended recipient, you are hereby notified 
that any review, disclosure, copying, or dissemination of this transmission, or taking of any action in reliance on its contents, or other use is strictly prohibited. If you have received this transmission in error, please reply to the sender listed above 
immediately and permanently delete this message from your inbox. Thank you for your cooperation. 

Desired result

I need to be able to strip the HTML tags from the string and also remove the CSS rules that are left behind in the body text.


By the way, you may want to replace &nbsp; with a blank space, since that is what it represents, rather than with an empty string. – JuanR
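The comment's suggestion could be sketched as follows. This is a minimal illustration, not code from the question: the class and method names (EntityDemo, DecodeEntities) are hypothetical, and it assumes System.Net.WebUtility is available on the target framework. WebUtility.HtmlDecode turns &nbsp; into the non-breaking space character U+00A0, which is then mapped to an ordinary space.

```csharp
using System;
using System.Net;

static class EntityDemo
{
    // Hypothetical helper: decodes HTML entities and maps
    // non-breaking spaces (U+00A0) to ordinary spaces.
    public static string DecodeEntities(string text)
    {
        // HtmlDecode turns "&nbsp;" into U+00A0 and "&amp;" into "&".
        string decoded = WebUtility.HtmlDecode(text);
        return decoded.Replace('\u00A0', ' ');
    }

    static void Main()
    {
        // Prints: Tom & Jerry
        Console.WriteLine(DecodeEntities("Tom&nbsp;&amp;&nbsp;Jerry"));
    }
}
```

Decoding all entities in one call also covers cases such as &amp; and &lt; that a single Replace("&nbsp;", ...) would miss.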

Answer


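You can replace <!--.*?--> with String.Empty using the regex option Singleline (which makes . match newlines). For this particular input, adding that option to the existing <.*?> pattern is enough: the CSS comment contains no > before its closing -->, so the lazy match that starts at <!-- now runs across the line breaks to the end of the comment.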

string viewString1 = Regex.Replace(body, "<.*?>", string.Empty, RegexOptions.Singleline);
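A slightly more defensive variant is sketched below, on the assumption that other emails might contain a > character inside their CSS (for example a child selector), which would end the <.*?> match too early. It removes whole <style> and <script> elements first, then any remaining comments, then the tags. The names HtmlText and StripHtml are hypothetical, and a regex remains only an approximation of real HTML parsing.

```csharp
using System;
using System.Text.RegularExpressions;

static class HtmlText
{
    // Hypothetical helper, not part of the Outlook API.
    public static string StripHtml(string html)
    {
        // 1. Drop <style> and <script> elements together with their contents.
        //    Singleline lets "." cross line breaks; \1 matches the same tag name.
        string s = Regex.Replace(html, @"<(style|script)\b[^>]*>.*?</\1\s*>", string.Empty,
            RegexOptions.Singleline | RegexOptions.IgnoreCase);

        // 2. Drop any comments that remain outside those elements.
        s = Regex.Replace(s, @"<!--.*?-->", string.Empty, RegexOptions.Singleline);

        // 3. Drop the remaining tags. [^>] also matches newlines,
        //    so no extra option is needed here.
        s = Regex.Replace(s, @"<[^>]+>", string.Empty);

        // 4. Collapse runs of whitespace left behind by the removed markup.
        return Regex.Replace(s, @"\s+", " ").Trim();
    }

    static void Main()
    {
        string html = "<style><!--\np.MsoNormal {margin:0in}\n--></style>"
                    + "<p class=\"MsoNormal\"><b>Notice:</b> Hello</p>";
        // Prints: Notice: Hello
        Console.WriteLine(StripHtml(html));
    }
}
```

Removing the style element as a unit means the result does not depend on what characters appear inside the CSS. For mail with unusual markup, a dedicated HTML parser would likely be more robust than any of these patterns.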