Objective-C HTML parsing: getting all the text between tags

I'm using hpple to try to get torrent descriptions from ThePirateBay. Currently, I'm using this code:

NSString *path = @"//div[@id='content']/div[@id='main-content']/div/div[@id='detailsouterframe']/div[@id='detailsframe']/div[@id='details']/div[@class='nfo']/pre/node()"; 
NSArray *nodes = [parser searchWithXPathQuery:path]; 
for (TFHppleElement * element in nodes) { 
    NSString *postid = [element content]; 
    if (postid) { 
        [texts appendString:postid];
    } 
}

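This only returns the plain text, not any of the screenshot URLs. Is there any way to get all the links and other tags, and not only the plain text? The Pirate Bay page is formatted like this: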

<pre> 
    <a href="http://img689.imageshack.us/img689/8292/itskindofafunnystory201.jpg" rel="nofollow"> 
    http://img689.imageshack.us/img689/8292/itskindofafunnystory201.jpg</a> 
More texts about the file 
</pre>


2013-05-04 user2272641

Have you tried using @"//div[@id='content']/div[@id='main-content']/div/div[@id='detailsouterframe']/div[@id='detailsframe']/div[@id='details']/div[@class='nfo']/pre/text()"? Does it help? – HAS 2013-05-04 17:56:14

It returns the same thing – user2272641 2013-05-04 20:23:18

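Then please post more of the HTML or a link, because with the snippet you gave me I got it working... I used @"//pre/text()" on the snippet. I think something else in your path is wrong – HAS 2013-05-04 20:40:32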

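This is an easy one, and you almost had it right!

What you want is the content (or the attributes) of the a tag, so you need to tell the parser that you want it.

Just change your XPath to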

@"//div[@id='content']/div[@id='main-content']/div/div[@id='detailsouterframe']/div[@id='detailsframe']/div[@id='details']/div[@class='nfo']/pre/a"

(You were missing the a at the end, and you don't need node().)

Output:

http://www.imdb.com/title/tt1904996/
http://leetleech.org/images/65823608764828593230.png
http://leetleech.org/images/44748070481477652927.png
http://leetleech.org/images/42024611449329122742.png
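
The output above is the link text, which on this page happens to match each link's target. If the two ever differ, the href attribute is probably the safer thing to read. A minimal sketch, assuming ARC and that hpple's TFHpple.h and TFHppleElement.h are on the include path; HrefsMatchingXPath is a hypothetical helper name, not part of hpple:

```objectivec
#import <Foundation/Foundation.h>
#import "TFHpple.h"
#import "TFHppleElement.h"

// Hypothetical helper (not part of hpple): returns the href attribute of
// every node matched by xpath, skipping matches that have no href.
static NSArray *HrefsMatchingXPath(NSData *htmlData, NSString *xpath) {
    TFHpple *parser = [TFHpple hppleWithHTMLData:htmlData];
    NSMutableArray *hrefs = [NSMutableArray array];
    for (TFHppleElement *element in [parser searchWithXPathQuery:xpath]) {
        // objectForKey: looks the name up in the element's attributes.
        NSString *href = [element objectForKey:@"href"];
        if (href) {
            [hrefs addObject:href];
        }
    }
    return hrefs;
}
```

With the page data loaded into an NSData, this could be called as HrefsMatchingXPath(htmlData, path), where path is the XPath ending in /pre/a shown above.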

If you only want the screenshot URLs, you can do this:

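// Index 0 is the IMDb link in the output above, so start at 1.
NSMutableArray *screenshotURLs = [[NSMutableArray alloc] initWithCapacity:0];
for (NSUInteger i = 1; i < nodes.count; i++) {
    [screenshotURLs addObject:nodes[i]]; // nodes[i] is a TFHppleElement
}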
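
Going back to the original question of getting the text and the links together: one possible approach is to keep the node() query and branch on each node's tag name, taking the href for a elements and the content for everything else. This is only a sketch, assuming ARC and the same hpple headers; TextAndLinksMatchingXPath is a hypothetical name, and it relies on hpple reporting text nodes with a non-nil content.

```objectivec
#import <Foundation/Foundation.h>
#import "TFHpple.h"
#import "TFHppleElement.h"

// Hypothetical helper (not part of hpple): walks the matched nodes in
// document order, appending the href of each <a> and the content of
// every other node that has any.
static NSString *TextAndLinksMatchingXPath(NSData *htmlData, NSString *xpath) {
    TFHpple *parser = [TFHpple hppleWithHTMLData:htmlData];
    NSMutableString *result = [NSMutableString string];
    for (TFHppleElement *node in [parser searchWithXPathQuery:xpath]) {
        if ([[node tagName] isEqualToString:@"a"]) {
            NSString *href = [node objectForKey:@"href"];
            if (href) {
                [result appendFormat:@"%@\n", href];
            }
        } else {
            NSString *content = [node content];
            if (content) {
                [result appendString:content];
            }
        }
    }
    return result;
}
```

For the pre block in the question, a query ending in /pre/node() should yield the image URL followed by the surrounding description text.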


2013-05-04 21:09:28 HAS

It works! Thanks! – user2272641 2013-05-04 21:22:04

You're welcome :) – HAS 2013-05-04 21:24:09
