2015-12-12 184 views
1

I'm taking a stab at HTML Agility Pack and can't find the right way to go about this. For example, I want to get the content of the second span tag:

htmlDoc.DocumentNode.SelectSingleNode("//div[@style='color:#000000; padding: 10px;']/table/tr[1]/td[1]/span[2]").InnerText; 

Here is the HTML file I want to parse with HTML Agility Pack:

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"> 
<html xmlns="http://www.w3.org/1999/xhtml"> 
<head> 
</head> 
<body onload="oload()" onunload="Unload()"> 

<div id="content"> 
<table width="100%"> 
<tr> 
    <td width="48%" valign="top"> 
<fieldset style="border:1px solid #ccc;color:#ccc;margin:0;padding:0;"> 
<legend style="color:#ccc;margin:0 0 0 10px;padding:0 3px;">Profile Information</legend> 
<div style="color:#000000; padding: 10px;"> 
<br /> 
Name Surname:<br /> 
<span style="font-size:18px;">John Doe</span> 
<br /><br /><br /> 
Address:<br /> 
<span style="font-size:18px;">706 test<br>NY 14013</span> 
<br /><br /><br /> 
</div> 
</fieldset> 
<br /> 
</td> 
    <td width="52%" align="right" valign="top"> 
</td> 
</tr> 
</table> 
    </div> 
</body> 
</html> 

Answers

0

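Based on the HTML snippet posted, all the span elements, including the target span[2], are direct children of the div (there is no table/tr/td between them), so the correct XPath is simply: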

//div[@style='color:#000000; padding: 10px;']/span[2] 

Online demo: https://dotnetfiddle.net/mRfLEQ

Output:

706 testNY 14013 
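
For readers who cannot open the demo, here is a minimal sketch of how that XPath might be used. It assumes the HtmlAgilityPack NuGet package is referenced; the class name `SecondSpanDemo` and the trimmed-down markup are illustrative only and are not taken from the linked fiddle.

```csharp
using System;
using HtmlAgilityPack;

class SecondSpanDemo
{
    static void Main()
    {
        // Trimmed-down copy of the markup from the question (assumption:
        // only the div and its direct children matter for this XPath).
        const string html = @"<div style=""color:#000000; padding: 10px;"">
Name Surname:<br />
<span style=""font-size:18px;"">John Doe</span>
<br />
Address:<br />
<span style=""font-size:18px;"">706 test<br>NY 14013</span>
</div>";

        var htmlDoc = new HtmlDocument();
        htmlDoc.LoadHtml(html);

        // span[2] means the second span among the div's direct children.
        var span = htmlDoc.DocumentNode.SelectSingleNode(
            "//div[@style='color:#000000; padding: 10px;']/span[2]");

        // SelectSingleNode returns null when nothing matches, so guard
        // before reading InnerText.
        Console.WriteLine(span != null ? span.InnerText : "(no match)");
    }
}
```

The null check is worth keeping: the path in the question (with /table/tr[1]/td[1] after the div) matches nothing in this markup, and calling .InnerText on the null result throws a NullReferenceException.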
0

Try this:

using System; 
using System.Collections.Generic; 
using System.Linq; 
using System.Text; 
using HtmlAgilityPack; 

namespace ConsoleApplication1 
{ 
    class Program
    {
        static void Main(string[] args)
        {
            String html = @"<!DOCTYPE html PUBLIC ""-//W3C//DTD XHTML 1.0 Transitional//EN"" ""http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd""> 
<html xmlns=""http://www.w3.org/1999/xhtml""> 
<head> 
</head> 
<body onload=""oload()"" onunload=""Unload()""> 

<div id=""content""> 
<table width=""100%""> 
<tr> 
    <td width=""48%"" valign=""top""> 
<fieldset style=""border:1px solid #ccc;color:#ccc;margin:0;padding:0;""> 
<legend style=""color:#ccc;margin:0 0 0 10px;padding:0 3px;"">Profile Information</legend> 
<div style=""color:#000000; padding: 10px;""> 
<br /> 
Name Surname:<br /> 
<span style=""font-size:18px;"">John Doe</span> 
<br /><br /><br /> 
Address:<br /> 
<span style=""font-size:18px;"">706 test<br>NY 14013</span> 
<br /><br /><br /> 
</div> 
</fieldset> 
<br /> 
</td> 
    <td width=""52%"" align=""right"" valign=""top""> 
</td> 
</tr> 
</table> 
    </div> 
</body> 
</html>"; 

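            var doc = new HtmlAgilityPack.HtmlDocument();
            doc.LoadHtml(html);

            // Select every span in the document, in document order.
            // Note: SelectNodes returns null if nothing matches.
            var spans = doc.DocumentNode.SelectNodes("//span");

            // The collection is zero-based, so index 1 is the second span.
            Console.WriteLine(spans[1].InnerText);
        }
    }
}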

Basically, this gets all the span nodes and uses the index to display the InnerText of the second one.
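
One caveat with indexing into //span: it counts every span in the whole document, so the index shifts if another span is ever added earlier on the page. A possible variation, sketched below, scopes the search to the profile div first and guards against missing nodes. The helper name `NthSpanText` and the shortened markup are hypothetical and exist only for this illustration.

```csharp
using System;
using HtmlAgilityPack;

class Program
{
    // Hypothetical helper: returns the text of the n-th span (1-based)
    // directly inside the profile div, or null if it cannot be found.
    static string NthSpanText(string html, int n)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        // Locate the container first, using the style attribute from the question.
        var div = doc.DocumentNode.SelectSingleNode(
            "//div[@style='color:#000000; padding: 10px;']");
        if (div == null) return null;

        // "./span" is relative to the div: only its direct child spans.
        var spans = div.SelectNodes("./span");
        if (spans == null || n < 1 || spans.Count < n) return null;

        return spans[n - 1].InnerText;
    }

    static void Main()
    {
        // Shortened markup (assumption): an unrelated span precedes the div.
        const string html = @"<span>unrelated</span>
<div style=""color:#000000; padding: 10px;"">
<span style=""font-size:18px;"">John Doe</span>
<span style=""font-size:18px;"">706 test<br>NY 14013</span>
</div>";

        Console.WriteLine(NthSpanText(html, 2) ?? "(not found)");
    }
}
```

With the unrelated span in front, spans[1] from a document-wide //span query would be "John Doe", while the scoped lookup still returns the address span.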