2013-03-20 103 views

Hoping to get some answers from you. Parsing with HtmlAgilityPack (table without an id) in VB.NET

I'm using VB.NET and HtmlAgilityPack to fetch data, and it works, but not the way I want it to =)

I have an HTML page like this (excerpt):


<TABLE WITH=100% BORDER=4> 

<TR> 
<TH><A HREF="http:/cgi-bin/vplata.py?tgnr=4300&val=Visa+T%C3%A5gnummer&Bek=Visa&sort=Lok" >Lok</A></TH> 
<TH><A HREF="http:/cgi-bin/vplata.py?tgnr=4300&val=Visa+T%C3%A5gnummer&Bek=Visa&sort=Avg" >Avg&aring;r</A></TH> 
<TH><A HREF="http:/cgi-bin/vplata.py?tgnr=4300&val=Visa+T%C3%A5gnummer&Bek=Visa&sort=AvgS" >Station</A></TH> 
<TH><A HREF="http:/cgi-bin/vplata.py?tgnr=4300&val=Visa+T%C3%A5gnummer&Bek=Visa&sort=Ank" >Ankommer</A></TH> 
<TH><A HREF="http:/cgi-bin/vplata.py?tgnr=4300&val=Visa+T%C3%A5gnummer&Bek=Visa&sort=AnkS" >Station</A></TH> 
<TH>Tjänstetyp</TH> 
</TR> 
<TR> 
<TD><a HREF="/cgi-bin/vplata.py?individ=R1176&val=Visa+Lokindivid&Bek=Visa">R1176</a></TD> 
<TD>Mar-20-2013 13:04:00</TD> 
<TD><A HREF="/cgi-bin/vplata.py?stn=HBGB&val=Visa+Driftplats&Bek=Visa">HBGB</A></TD> 
<TD>Mar-20-2013 21:21:00</TD> 
<TD><A HREF="/cgi-bin/vplata.py?stn=ET3&val=Visa+Driftplats&Bek=Visa">ET3</A></TD> 
<TD>B1</TD> 
</TR> 
<TR> 
<TD><a HREF="/cgi-bin/vplata.py?individ=R1267&val=Visa+Lokindivid&Bek=Visa">R1267</a></TD> 
<TD>Mar-20-2013 13:04:00</TD> 
<TD><A HREF="/cgi-bin/vplata.py?stn=HBGB&val=Visa+Driftplats&Bek=Visa">HBGB</A></TD> 
<TD>Mar-20-2013 21:21:00</TD> 
<TD><A HREF="/cgi-bin/vplata.py?stn=ET3&val=Visa+Driftplats&Bek=Visa">ET3</A></TD> 
<TD>B2</TD> 
</TR> 
<TR> 
<TD><a HREF="/cgi-bin/vplata.py?individ=R1267&val=Visa+Lokindivid&Bek=Visa">R1267</a></TD> 
<TD>Mar-20-2013 22:05:00</TD> 
<TD><A HREF="/cgi-bin/vplata.py?stn=ET3&val=Visa+Driftplats&Bek=Visa">ET3</A></TD> 
<TD>Mar-20-2013 22:28:00</TD> 
<TD><A HREF="/cgi-bin/vplata.py?stn=KB%&val=Visa+Driftplats&Bek=Visa">KBÄ</A></TD> 
<TD>D1</TD> 
</TR> 
<TR> 
<TD><a HREF="/cgi-bin/vplata.py?individ=R1281&val=Visa+Lokindivid&Bek=Visa">R1281</a></TD> 
<TD>Mar-21-2013 13:04:00</TD> 
<TD><A HREF="/cgi-bin/vplata.py?stn=HBGB&val=Visa+Driftplats&Bek=Visa">HBGB</A></TD> 
<TD>Mar-21-2013 21:21:00</TD> 
<TD><A HREF="/cgi-bin/vplata.py?stn=ET3&val=Visa+Driftplats&Bek=Visa">ET3</A></TD> 
<TD>D1</TD> 
</TR> 
<TR> 
<TD><a HREF="/cgi-bin/vplata.py?individ=R1281&val=Visa+Lokindivid&Bek=Visa">R1281</a></TD> 
<TD>Mar-21-2013 22:05:00</TD> 
<TD><A HREF="/cgi-bin/vplata.py?stn=ET3&val=Visa+Driftplats&Bek=Visa">ET3</A></TD> 
<TD>Mar-21-2013 22:28:00</TD> 
<TD><A HREF="/cgi-bin/vplata.py?stn=KB%&val=Visa+Driftplats&Bek=Visa">KBÄ</A></TD> 
<TD>B2</TD> 
</TR> 
<TR> 
<TD><a HREF="/cgi-bin/vplata.py?individ=RXXXXX&val=Visa+Lokindivid&Bek=Visa">RXXXXX</a></TD> 
<TD>Mar-21-2013 22:05:00</TD> 
<TD><A HREF="/cgi-bin/vplata.py?stn=ET3&val=Visa+Driftplats&Bek=Visa">ET3</A></TD> 
<TD>Mar-21-2013 22:28:00</TD> 
<TD><A HREF="/cgi-bin/vplata.py?stn=KB%&val=Visa+Driftplats&Bek=Visa">KBÄ</A></TD> 
<TD>B1\B2</TD> 
</TR> 
<TR> 
<TD><a HREF="/cgi-bin/vplata.py?individ=R1281&val=Visa+Lokindivid&Bek=Visa">R1281</a></TD> 
<TD>Mar-25-2013 13:04:00</TD> 
<TD><A HREF="/cgi-bin/vplata.py?stn=HBGB&val=Visa+Driftplats&Bek=Visa">HBGB</A></TD> 
<TD>Mar-25-2013 21:21:00</TD> 
<TD><A HREF="/cgi-bin/vplata.py?stn=ET3&val=Visa+Driftplats&Bek=Visa">ET3</A></TD> 
<TD>D1</TD> 
</TR> 
<TR> 
<TD><a HREF="/cgi-bin/vplata.py?individ=R1281&val=Visa+Lokindivid&Bek=Visa">R1281</a></TD> 
<TD>Mar-25-2013 22:05:00</TD> 
<TD><A HREF="/cgi-bin/vplata.py?stn=ET3&val=Visa+Driftplats&Bek=Visa">ET3</A></TD> 
<TD>Mar-25-2013 22:28:00</TD> 
<TD><A HREF="/cgi-bin/vplata.py?stn=KB%&val=Visa+Driftplats&Bek=Visa">KBÄ</A></TD> 
<TD>D1</TD> 
</TR> 
<TR> 
<TD><a HREF="/cgi-bin/vplata.py?individ=R1254&val=Visa+Lokindivid&Bek=Visa">R1254</a></TD> 
<TD>Mar-27-2013 13:04:00</TD> 
<TD><A HREF="/cgi-bin/vplata.py?stn=HBGB&val=Visa+Driftplats&Bek=Visa">HBGB</A></TD> 
<TD>Mar-27-2013 21:21:00</TD> 
<TD><A HREF="/cgi-bin/vplata.py?stn=ET3&val=Visa+Driftplats&Bek=Visa">ET3</A></TD> 
<TD>B2</TD> 
</TR> 
<TR> 
<TD><a HREF="/cgi-bin/vplata.py?individ=RXXXXX&val=Visa+Lokindivid&Bek=Visa">RXXXXX</a></TD> 
<TD>Mar-27-2013 13:04:00</TD> 
<TD><A HREF="/cgi-bin/vplata.py?stn=HBGB&val=Visa+Driftplats&Bek=Visa">HBGB</A></TD> 
<TD>Mar-27-2013 21:21:00</TD> 
<TD><A HREF="/cgi-bin/vplata.py?stn=ET3&val=Visa+Driftplats&Bek=Visa">ET3</A></TD> 
<TD>B1\B2</TD> 
</TR> 
</TABLE> 
<A><A>Senast uppdaterad: Mar-20-2013 18:16:00</A><BR> 
<table width="100%" cellpadding="0" cellspacing="0" border="0"> 
<TR> 
<TD width="20%" bgcolor="#009900" align="left"> 
<IMG src="http://litmgc101.greencargo.com/bottenbild.jpg" alt="Green Cargo" width=800 height=25 border=0> 
</TD> 
</TR> 
<TR> 
</table> 

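What I want to do is grab (for example) the "R1176" part and the date "Mar-20-2013 13:04:00". I would prefer it without the time "13:04:00", but if I can't skip that during parsing, I can strip it later in VB.NET.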

So, simply put, what I want to do is: get every "R1234" and its date, then put the "R4321" in one text box and the date in another text box, or something similar.

Answers


In C# I would do something like this:

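// Requires: using System; using System.Collections.Generic; using System.Linq; using HtmlAgilityPack;
var result =
    doc.DocumentNode.SelectNodes("//td/a[contains(@href,'Lokindivid')]")
       // Pair each locomotive link's text with the date (time dropped) from the second cell of its row.
       .Select(node => new KeyValuePair<string, DateTime>(
           node.InnerText,
           DateTime.Parse(node.SelectSingleNode("./ancestor::tr[1]/td[2]").InnerText).Date));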

My VB.NET-fu produced the following code (a literal translation), which works with the sample HTML you provided:

' Requires: Imports System.Linq and Imports HtmlAgilityPack
Dim doc As New HtmlDocument
doc.LoadHtml(Content.Html)

' Pair each locomotive link's text with the date (time dropped) from the second cell of its row.
Dim items = doc.DocumentNode.SelectNodes("//td/a[contains(@href,'Lokindivid')]").
    Select(Function(node) New KeyValuePair(Of String, DateTime)(
        node.InnerText,
        DateTime.Parse(node.SelectSingleNode("./ancestor::tr[1]/td[2]").InnerText).Date))

For Each item As KeyValuePair(Of String, Date) In items
    Console.WriteLine(item.Key)
    Console.WriteLine(item.Value)
Next
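
If the LINQ one-liner refuses to compile, a plain loop does the same job and may be easier to debug. The following is only a sketch under a few assumptions: `LokParser` and `ExtractLokDates` are names made up for illustration, the format string "MMM-dd-yyyy HH:mm:ss" is inferred from the sample rows, and the result of `SelectNodes` is checked because it returns Nothing rather than an empty collection when no node matches.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.Globalization
Imports HtmlAgilityPack

Module LokParser
    ' Hypothetical helper: returns (locomotive id, departure date) pairs found in the page HTML.
    Function ExtractLokDates(html As String) As List(Of KeyValuePair(Of String, Date))
        Dim results As New List(Of KeyValuePair(Of String, Date))

        Dim doc As New HtmlDocument()
        doc.LoadHtml(html)

        ' SelectNodes returns Nothing when the XPath matches no nodes.
        Dim links = doc.DocumentNode.SelectNodes("//td/a[contains(@href,'Lokindivid')]")
        If links Is Nothing Then Return results

        For Each link As HtmlNode In links
            ' The second cell of the row containing the link holds the departure time.
            Dim dateCell = link.SelectSingleNode("./ancestor::tr[1]/td[2]")
            If dateCell Is Nothing Then Continue For

            Dim parsed As Date
            ' Assumed format, taken from the sample: "Mar-20-2013 13:04:00".
            ' InvariantCulture keeps English month abbreviations working on any system locale.
            If Date.TryParseExact(dateCell.InnerText.Trim(), "MMM-dd-yyyy HH:mm:ss",
                                  CultureInfo.InvariantCulture, DateTimeStyles.None, parsed) Then
                ' .Date drops the time of day, keeping only the calendar date.
                results.Add(New KeyValuePair(Of String, Date)(link.InnerText.Trim(), parsed.Date))
            End If
        Next

        Return results
    End Function
End Module
```

Each pair's Key could then be appended to one text box and its Value (for example via ToString("yyyy-MM-dd")) to another. Rows whose date cell is missing or does not match the assumed format are skipped silently in this sketch.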

It didn't work well with the VB.NET code (I think; I couldn't translate it), but I tried this: Dim specs = doc.DocumentNode.SelectNodes("//a[contains(@href,'Lokindivid')]") – 2013-03-21 10:53:38


But I can't figure out how to get the date for each "Lokindivid". =( – 2013-03-21 10:59:59


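I tried your new, updated code and it doesn't work for me; I'm getting desperate now =( I get a lot of errors... I think I'll have to find another way to extract this information from the page. Right now I'm using the following code: Dim specs = doc.DocumentNode.SelectNodes("//td/a[contains(@href,'Lokindivid')]") For Each spec2 As HtmlNode In specs2 Dim value = spec2.InnerText.Trim() If Not String.IsNullOrEmpty(value) Then RichTextBox2.Text = RichTextBox2.Text & value Else End If Next End Using – 2013-03-22 10:01:57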