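Extracting divs that contain a keyword using XPath / HTML Agility Pack
I am trying to extract all divs with a specific keyword from an HTML (-> XML) document (below) using the HTML Agility Pack. The "div" elements with "id=dealId_*****" are the relevant ones. I think I know how to proceed once I figure out how to count all "div" elements with "id=dealId_*****". I tried the XPath function "starts-with", but it did not work: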
HtmlDocument doc = new HtmlDocument();
doc.LoadHtml(Sourcecode);
int numberOfDIVs;
numberOfDIVs = doc.DocumentNode.SelectNodes("//*[@id='jLocalDeals']/*[starts-with(@id, 'dealId_']").Count;
<div id="jLocalDeals" class="dealsBlock" style="">
<h1>
<div id="dealId_5474417" class="jDeal LEISURE_OFFERS">
<div id="dealId_5476688" class="jDeal SHOPPING">
<div id="dealId_5445019" class="jDeal TICKETS1 RESTAURANT1">
<div class="wrapper3Deals"></div>
<div id="dealId_5474286" class="jDeal BEAUTY">
<div id="dealId_5476685" class="jDeal LEISURE_OFFERS">
<div id="dealId_5474466" class="jDeal SERVICES">
<div class="wrapper3Deals"></div>
<div id="dealId_5466810" class="jDeal BEAUTY">
<div id="dealId_5425417" class="jDeal SERVICES">
<div id="dealId_5474329" class="jDeal SHOPPING">
<div class="wrapper3Deals"></div>
<div id="dealId_5476703" class="jDeal SHOPPING">
<div id="dealId_5476729" class="jDeal SHOPPING">
<div id="dealId_5474702" class="jDeal HEALTHCARE">
<div class="wrapper3Deals"></div>
<div id="dealId_5444044" class="jDeal TRAVEL1" style="display: block;">
<div id="dealId_5474444" class="jDeal LEISURE_OFFERS" style="display: block;">
<div id="dealId_5473774" class="jDeal TRAVEL1" style="display: block;">
<div class="wrapper3Deals"></div>
</div>
P.S.: Unfortunately, I can only use .NET 2.0.
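One thing visible in the query above is that the starts-with(...) call is missing its closing parenthesis before the final "]". The sketch below shows the expression with that fixed, plus a null check, since SelectNodes returns null rather than an empty collection when nothing matches. It is only a sketch: the CountDeals helper and the shortened HTML sample are made up for illustration, and it assumes the deal divs are direct children of the jLocalDeals div in the HTML that is actually downloaded. It sticks to C# 2.0 syntax.

```csharp
using System;
using HtmlAgilityPack;

class DealCounter
{
    // Hypothetical helper: counts the divs directly under #jLocalDeals
    // whose id starts with "dealId_".
    static int CountDeals(string html)
    {
        HtmlDocument doc = new HtmlDocument();
        doc.LoadHtml(html);

        // Note the closing parenthesis after 'dealId_'.
        HtmlNodeCollection deals = doc.DocumentNode.SelectNodes(
            "//div[@id='jLocalDeals']/div[starts-with(@id, 'dealId_')]");

        // SelectNodes returns null when no node matches, so guard before using Count.
        return deals == null ? 0 : deals.Count;
    }

    static void Main()
    {
        // Shortened, well-formed stand-in for the page fragment in the question.
        string html =
            "<div id=\"jLocalDeals\" class=\"dealsBlock\">" +
            "<div id=\"dealId_5474417\" class=\"jDeal LEISURE_OFFERS\"></div>" +
            "<div id=\"dealId_5476688\" class=\"jDeal SHOPPING\"></div>" +
            "<div class=\"wrapper3Deals\"></div>" +
            "</div>";

        Console.WriteLine(CountDeals(html));
    }
}
```

If the count is still 0 on the real page, it may be worth checking whether the downloaded source actually contains the dealId_ divs, because markup added by scripts in the browser will not be in the string passed to LoadHtml.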
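Thanks, but that does not work either (NullReferenceException) :( This is the page I am trying to scrape: http://www.groupon.de/alle-deals/aachen What do you mean by using count() in the path? – think
There is an XPath function count() that returns the number of nodes. It is not much different from using HtmlDocument.Count(); I am just pointing it out so you know about it. What do you get when you run only //div[@id='jLocalDeals']? – JWiley
When I run only //div[@id='jLocalDeals'], I get numberOfDIVs = 1 and this (http://www.imgbox.de/users/public/images/1dMjr3WXhI.JPG). I hope you can help me. – think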
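For reference, the count() function mentioned in the comments can be evaluated through an XPathNavigator instead of calling SelectNodes. This is a rough sketch that assumes the navigator returned by HtmlDocument.CreateNavigator() evaluates count() like a standard System.Xml.XPath navigator; the dealId_1 and dealId_2 ids are placeholders, not values from the real page.

```csharp
using System;
using System.Xml.XPath;
using HtmlAgilityPack;

class XPathCountSketch
{
    static void Main()
    {
        HtmlDocument doc = new HtmlDocument();
        // Placeholder markup with two matching divs and one non-matching div.
        doc.LoadHtml(
            "<div id=\"jLocalDeals\">" +
            "<div id=\"dealId_1\"></div>" +
            "<div id=\"dealId_2\"></div>" +
            "<div class=\"wrapper3Deals\"></div>" +
            "</div>");

        XPathNavigator navigator = doc.CreateNavigator();

        // count() yields an XPath number, which Evaluate returns as a boxed double.
        object result = navigator.Evaluate(
            "count(//div[@id='jLocalDeals']/div[starts-with(@id, 'dealId_')])");

        int numberOfDivs = Convert.ToInt32(result);
        Console.WriteLine(numberOfDivs);
    }
}
```

One practical difference from SelectNodes(...).Count is that count() gives back a number even when nothing matches, so there is no null collection to guard against.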