How do I get the text inside a link with jsoup?

I am using jsoup to parse an HTML page. Here is what I have done so far:

doc = Jsoup.connect("http://www.marketimyilmazlar.com/index.php?route=product/category&path=141_77").get(); 

Element page_clips = doc.getElementById("page_clips"); 

Element page_clip_content = page_clips.getElementById("content"); 
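Elements allProductPricesOnPage = page_clip_content.getElementsByClass("price");
// Presumably collected the same way; the output below shows elements with class "name".
Elements allProductNamesOnPage = page_clip_content.getElementsByClass("name");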

Now, when I write:

allProductNamesOnPage.get(0);

it returns the following:

<div class="name"> 
<a href="http://www.marketimyilmazlar.com/index.php? 
route=product/product&amp;path=141_77&amp;product_id=4309"> here is the text</a> 
</div>

What I want is to get the "here is the text" part of that object. Can anyone help me?

Thanks

Source

2014-02-07 yrazlik

You may want to iterate over the Elements you have collected and print their prices one by one:

Elements allProductPricesOnPage = page_clip_content 
       .getElementsByClass("price"); 
for (Element el : allProductPricesOnPage) { 
    System.out.println(el.text()); 
}

This gives:

19.99 TL KDV Dahil 
9.99 TL KDV Dahil 
14.99 TL KDV Dahil

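What this does: you select an Elements collection, which is a List<Element> and therefore implements Iterable (see the javadoc), so a for-each loop lets you access the individual Element objects in your collection.

Each of these Element objects corresponds to one of the repeated blocks in your HTML and holds the information you want to extract.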
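
The same loop can be applied to the product names from the question. The following is only a minimal sketch, assuming the markup looks like the `<div class="name">` snippet in the question. The sample HTML, the class name `LinkTextLoop`, and the helper `linkTexts` are made up for illustration, and `Jsoup.parse` is used on an inline string instead of fetching the live page.

```java
import java.util.ArrayList;
import java.util.List;

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;

public class LinkTextLoop {

    // Hypothetical helper: collects the text of every <a> that is a
    // direct child of a <div class="name">.
    static List<String> linkTexts(String html) {
        Document doc = Jsoup.parse(html);
        Elements links = doc.select("div.name > a");
        List<String> texts = new ArrayList<>();
        for (Element link : links) {
            // text() returns whitespace-normalized, trimmed text
            texts.add(link.text());
        }
        return texts;
    }

    public static void main(String[] args) {
        // Made-up sample markup modeled on the snippet in the question.
        String html =
            "<div class=\"name\"><a href=\"http://example.com/p?id=1\"> first product</a></div>"
          + "<div class=\"name\"><a href=\"http://example.com/p?id=2\"> second product</a></div>";
        for (String text : linkTexts(html)) {
            System.out.println(text);
        }
    }
}
```

Selecting `div.name > a` goes straight to the links, so there is no need to call `get(0)` and then dig into the children by hand.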

Source

2014-02-07 18:17:15 PopoFibo

If you only want to extract the text, you can call the text() method:

String text = allProductNamesOnPage.get(0).text();

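This method returns the combined text of the element and all of its children. So, if you want to be sure you are extracting the text of a single element only, call text() on the first child element: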

String text = allProductNamesOnPage.get(0).child(0).text();

See here: http://jsoup.org/cookbook/extracting-data/attributes-text-html
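
To make the difference concrete, here is a small sketch. It assumes markup like the question's `div.name` block, with a made-up extra text node ("New!") placed next to the link so that the calls return different things. The class name `TextVsChildText` and the helper `nameDiv` are hypothetical, and `ownText()` and `attr("href")` are not part of the answer above; they are included only for comparison.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Element;

public class TextVsChildText {

    // Made-up markup: a div that holds some text of its own plus a link.
    static final String HTML =
        "<div class=\"name\">New! <a href=\"http://example.com/p?id=1\"> here is the text</a></div>";

    // Hypothetical helper: parses the sample and returns the first element with class "name".
    static Element nameDiv() {
        return Jsoup.parse(HTML).getElementsByClass("name").get(0);
    }

    public static void main(String[] args) {
        Element div = nameDiv();
        // Combined text of the div and all of its children
        System.out.println(div.text());
        // Text of the first child element (the <a>) only
        System.out.println(div.child(0).text());
        // Text that belongs directly to the div, excluding child elements
        System.out.println(div.ownText());
        // The link target rather than the link text
        System.out.println(div.child(0).attr("href"));
    }
}
```

With the markup from the question, where the div contains nothing but the link, `text()` and `child(0).text()` would return the same string; the distinction only matters once the parent has text of its own or several children.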

Source

2014-02-07 14:40:57 ashatte
