Getting specific elements inside another element with HtmlAgilityPack in C#

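I'm working on a project where I need to parse a lot of HTML files. I need to get every <p> from inside a <div class="story-body">.

So far I have the code below, which does what I want, but I would like to know how to do this with an XPath expression. I tried this: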

textBody.SelectNodes ("What to put here? I tried //p but it gives every p in document not inside the one div")

But it didn't work. Any ideas?

public void Parse(){
    HtmlNode title = doc.DocumentNode.SelectSingleNode ("//h1[(@class='story-header')]");
    HtmlNode textBody = doc.DocumentNode.SelectSingleNode ("//div[(@class='story-body')]");

    XmlText textT;
    XmlText textS;

    string story = "";

    if(title != null){
        textT = xmlDoc.CreateTextNode(title.InnerText);
        titleElement.AppendChild(textT);
        Console.WriteLine(title.InnerText);
    }

    foreach (HtmlNode node in textBody.ChildNodes) {
        if(node.Name == "p" || (node.Name == "span" && node.GetAttributeValue("class", "class") == "cross-head")){
            story += node.InnerText + "\n\n";
            Console.WriteLine(node.InnerText);
        }
    }

    textS = xmlDoc.CreateTextNode (story);

    storyElement.AppendChild (textS);

    try
    {
        xmlDoc.Save("test.xml");
    }
    catch (Exception e)
    {
        Console.WriteLine(e.Message);
    }
}


2013-05-05 Jan

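This is fairly simple: you just need to add a . to the start of the expression, as in .//p. That makes the path relative, so you only get the <p> elements that are descendants of the current node instead of every <p> in the document.

Another way is to call SelectNodes on the document with the full path, like this: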

doc.DocumentNode.SelectNodes("//div[(@class='story-body')]/p");
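To make the difference concrete, here is a minimal sketch comparing the absolute and relative forms when they are evaluated from textBody. The sample markup, the class name RelativeXPathDemo and the helper CountParagraphs are made up for illustration and are not from the question; it assumes the HtmlAgilityPack package is referenced.

```csharp
using System;
using HtmlAgilityPack;

public static class RelativeXPathDemo
{
    // Hypothetical sample markup: one <p> outside the div, two inside it.
    private const string Html = @"
<html><body>
  <p>outside</p>
  <div class='story-body'>
    <p>first</p>
    <p>second</p>
  </div>
</body></html>";

    // Evaluates the given XPath from the story-body div and counts the matches.
    public static int CountParagraphs(string xpath)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(Html);
        HtmlNode textBody = doc.DocumentNode.SelectSingleNode("//div[@class='story-body']");

        // SelectNodes returns null (not an empty collection) when nothing matches.
        HtmlNodeCollection nodes = textBody.SelectNodes(xpath);
        return nodes == null ? 0 : nodes.Count;
    }

    public static void Main()
    {
        Console.WriteLine(CountParagraphs("//p"));  // absolute: starts at the document root
        Console.WriteLine(CountParagraphs(".//p")); // relative: descendants of textBody
        Console.WriteLine(CountParagraphs("p"));    // relative: direct children of textBody
    }
}
```

A path starting with // is always resolved from the document root, no matter which node SelectNodes is called on, which is why the attempt in the question returned every <p>. Note that .//p also matches <p> elements nested deeper inside the div, while p (or the /p step in the expression above) only matches direct children.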


2013-05-06 00:26:51 shriek

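Thank you, you're right, it is very simple. However, I ended up keeping my original approach, because I have to check more things and I don't think that can be done with XPath – Jan 2013-05-06 16:48:27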
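For what it's worth, the two checks that appear in the loop in the question (a <p> child, or a <span> child whose class is exactly cross-head) can probably be expressed as a single XPath union. The following is only a sketch under that assumption; the class StoryTextDemo, the method ExtractStory and the sample markup are hypothetical, and any additional checks not shown in the question may still be easier to do in C#.

```csharp
using System;
using System.Text;
using HtmlAgilityPack;

public static class StoryTextDemo
{
    // Collects the text of the <p> and <span class="cross-head"> children
    // of the story-body div, separated by blank lines.
    public static string ExtractStory(string html)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        HtmlNode textBody = doc.DocumentNode.SelectSingleNode("//div[@class='story-body']");
        if (textBody == null) return "";

        // "|" is the XPath union operator; both paths are relative to textBody
        // and only look at its direct children, like the ChildNodes loop does.
        HtmlNodeCollection nodes = textBody.SelectNodes("p | span[@class='cross-head']");
        if (nodes == null) return ""; // SelectNodes returns null when nothing matches

        var story = new StringBuilder();
        foreach (HtmlNode node in nodes)
        {
            story.Append(node.InnerText).Append("\n\n");
        }
        return story.ToString();
    }

    public static void Main()
    {
        // Hypothetical input: the plain <span> should be skipped.
        string html = "<div class='story-body'><p>Intro</p>"
                    + "<span class='cross-head'>Heading</span>"
                    + "<p>More</p><span>skip</span></div>";
        Console.Write(ExtractStory(html));
    }
}
```

The predicate [@class='cross-head'] is an exact string comparison, the same as the == check in the loop, so an element with class="cross-head extra" would not match in either version.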
