Trouble with HtmlAgilityPack

I can't figure out what the problem is. I created this project just to test HtmlAgilityPack, and this is what I have:

using System; 
using System.Collections.Generic; 
using System.Text; 
using HtmlAgilityPack; 


namespace parseHabra 
{ 
    class Program 
    { 
     static void Main(string[] args) 
     { 
      HTTP net = new HTTP(); // some HTTP wrapper
      string result = net.MakeRequest("http://stackoverflow.com/", null); 
      HtmlDocument doc = new HtmlDocument(); 
      doc.LoadHtml(result); 

      //Get all summary blocks 
      HtmlNodeCollection news = doc.DocumentNode.SelectNodes("//div[@class=\"summary\"]"); 
      foreach (HtmlNode item in news) 
      { 
       string title = String.Empty; 
       // the trouble is here: for each item I get the same value
       // every time
       title = item.SelectSingleNode("//a[@class=\"question-hyperlink\"]").InnerText.Trim(); 
       Console.WriteLine(title); 
      } 
      Console.ReadLine(); 
     } 
    } 
}

It looks as if my XPath is applied to the whole document rather than to each node I selected. Any suggestions as to why? Thanks in advance.

Source

2012-01-28 gingray

Why not use 'HtmlWeb' to download the HTML directly? – Oded 2012-01-28 17:43:24

That doesn't matter for this question – gingray 2012-01-28 17:47:33

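I haven't tried your code, but from a quick look I suspect the problem is that // searches from the root of the whole document, not from the current element as I'm guessing you expect.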

Try putting a . before the //

".//a[@class=\"question-hyperlink\"]"

Source

2012-01-28 17:52:30

But how is that possible? Isn't the content of the selected node just a part of the HtmlDocument? – gingray 2012-01-28 18:07:51

@imbriarius, did you try it? – 2012-01-28 19:09:25

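@imbriarius, when you select a node it is not isolated from the rest of the document, so '//' is still relative to the whole document. Use './/' to mean "anywhere below this point"; '.' selects the current node. – 2012-01-29 07:32:21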
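To make the difference between '//' and './/' concrete, here is a minimal sketch that runs both forms against a small inline page. The markup, the class name XPathScopeDemo, and the helper TitlesPerSummary are made up for illustration; only the HtmlAgilityPack calls already used in the question (LoadHtml, SelectNodes, SelectSingleNode, InnerText) are assumed.

```csharp
using System;
using System.Collections.Generic;
using HtmlAgilityPack;

class XPathScopeDemo
{
    // Hypothetical sample markup standing in for the real page.
    public const string SampleHtml =
        "<html><body>" +
        "<div class='summary'><a class='question-hyperlink'>First</a></div>" +
        "<div class='summary'><a class='question-hyperlink'>Second</a></div>" +
        "</body></html>";

    // Runs linkXPath once per summary div, using that div as the context node.
    public static List<string> TitlesPerSummary(string html, string linkXPath)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        var titles = new List<string>();
        HtmlNodeCollection summaries = doc.DocumentNode.SelectNodes("//div[@class='summary']");
        if (summaries == null)
            return titles; // SelectNodes returns null when nothing matches

        foreach (HtmlNode item in summaries)
        {
            HtmlNode link = item.SelectSingleNode(linkXPath);
            if (link != null)
                titles.Add(link.InnerText.Trim());
        }
        return titles;
    }

    static void Main()
    {
        // "//" starts at the document root, whatever the context node is.
        Console.WriteLine(string.Join(", ",
            TitlesPerSummary(SampleHtml, "//a[@class='question-hyperlink']")));

        // ".//" starts at the context node, so each div yields its own link.
        Console.WriteLine(string.Join(", ",
            TitlesPerSummary(SampleHtml, ".//a[@class='question-hyperlink']")));
    }
}
```

With this sample markup the first line should read "First, First" and the second "First, Second", which is the same symptom as in the question and the effect of the leading dot.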

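I have rewritten your XPath as a single query that finds all the question titles, rather than finding the summaries and then the titles. Chris's answer points out the problem, which could easily have been avoided.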

var web = new HtmlWeb(); 
var doc = web.Load("http://stackoverflow.com"); 

var xpath = "//div[starts-with(@id,'question-summary-')]//a[@class='question-hyperlink']"; 

// Select needs "using System.Linq;"; SelectNodes returns null if nothing matches
var questionTitles = doc.DocumentNode 
    .SelectNodes(xpath) 
    .Select(a => a.InnerText.Trim());

Source

2012-01-28 19:10:51

For me it's more important to understand the library's behavior – gingray 2012-01-28 22:10:25
