2010-09-10

Remove the line-break node in HtmlAgilityPack?

I'm trying to retrieve this text from a web page without the line break:

<span class="listingTitle">888-I-AM-JUNK. Canada's most trusted BIG LOAD junk removal<br />specialist!</span></a> 

How can I do that?

This is my current code; I'm using VB.

Dim content As String = ""
Dim doc As New HtmlAgilityPack.HtmlDocument()
doc.Load(WebBrowser1.DocumentStream)
' SelectNodes returns Nothing when no node matches the XPath
Dim hnc As HtmlAgilityPack.HtmlNodeCollection = doc.DocumentNode.SelectNodes("//span[@class='listingTitle']")
If hnc IsNot Nothing Then
    For Each link As HtmlAgilityPack.HtmlNode In hnc
        ' Decode the entities seen so far and drop the unwanted link text
        Dim replaceUnwanted As String = link.InnerText.Replace("&amp;", "&")
        replaceUnwanted = replaceUnwanted.Replace("&#39;", "'")
        replaceUnwanted = replaceUnwanted.Replace("See full business details", "")

        content &= replaceUnwanted & vbNewLine
    Next
End If
RichTextBox1.Text = content
' Drop empty lines from the output
Me.RichTextBox1.Lines = Me.RichTextBox1.Text.Split(New Char() {ControlChars.Lf}, _
    StringSplitOptions.RemoveEmptyEntries)

I need to remove the <br />.

Answers


How about doing it with the same ordinary string manipulation?

replaceUnwanted = replaceUnwanted.Replace(vbCrLf, "") 

If you are dealing with the literal tags inside the <span>...</span>:

' Note: ToLower() lowercases the whole string, not only the tags
replaceUnwanted = replaceUnwanted.ToLower().Replace("<br>", "")
replaceUnwanted = replaceUnwanted.ToLower().Replace("<br />", "")
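Since the question title asks about removing the node itself, here is a minimal sketch of that approach, working on the parsed document instead of the extracted string. It assumes a reasonably recent HtmlAgilityPack build; the module name `BrRemovalSketch` and the helper `GetTitles` are hypothetical and not part of the original code. Each `<br>` inside the span is swapped for a space, entities are decoded with `HtmlEntity.DeEntitize`, and any leftover newlines or repeated spaces are collapsed.

```vbnet
Imports System.Collections.Generic
Imports System.Linq
Imports System.Text.RegularExpressions
Imports HtmlAgilityPack

Module BrRemovalSketch
    ' Hypothetical helper: returns the text of each listingTitle span on a single line.
    Function GetTitles(html As String) As String()
        Dim doc As New HtmlDocument()
        doc.LoadHtml(html)

        Dim spans As HtmlNodeCollection = doc.DocumentNode.SelectNodes("//span[@class='listingTitle']")
        ' SelectNodes returns Nothing when no node matches
        If spans Is Nothing Then Return New String() {}

        Dim titles As New List(Of String)
        For Each span As HtmlNode In spans
            ' Replace every <br> under this span with a text node holding one space
            Dim breaks As HtmlNodeCollection = span.SelectNodes(".//br")
            If breaks IsNot Nothing Then
                For Each br As HtmlNode In breaks.ToList()
                    br.ParentNode.ReplaceChild(doc.CreateTextNode(" "), br)
                Next
            End If

            ' Decode entities such as &amp; and &#39;, then collapse runs of whitespace
            Dim text As String = HtmlEntity.DeEntitize(span.InnerText)
            text = Regex.Replace(text, "\s+", " ").Trim()
            titles.Add(text)
        Next
        Return titles.ToArray()
    End Function
End Module
```

Replacing the node with a space, instead of deleting it outright, keeps the words on either side of the break from running together. If the words should be joined with no space, calling `br.Remove()` is the simpler alternative.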

Thanks a ton, p.cambell. "replaceUnwanted = replaceUnwanted.ToLower().Replace(vbCrLf, "")" did the trick. I don't know how I didn't think of that. – Datadayne 2010-09-11 00:08:22


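@Datadayne: You bet, my pleasure. Obviously, toLower() doesn't really buy anything in the vbCrLf case; I just copy/pasted it from the BR example. I've edited it, just for fun. Here's an upvote for your question! – 2010-09-11 00:42:35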