XPath to get the price on Amazon

Here is the URL first:

http://www.amazon.in/gp/product/B00EYCBFDQ/ref=s9_pop_gw_g147_i3?pf_rd_m=A1VBAL9TL5WCBF&pf_rd_s=center-3&pf_rd_r=1YP3T548XBFHJ1RA3EH8&pf_rd_t=101&pf_rd_p=402518447&pf_rd_i=1320006031

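The above is a link to a product page on www.amazon.in. The actual price I want to get is Rs.4,094. Below is the Python code that tries to print the price. I use //span[@id="actualPriceValue"]/text() to get the price, but it returns an empty list. Can anyone suggest how to get the price?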

from lxml import html 
import requests 

page = requests.get('http://www.amazon.in/gp/product/B00EYCBFDQ/ref=s9_pop_gw_g147_i3?pf_rd_m=A1VBAL9TL5WCBF&pf_rd_s=center-3&pf_rd_r=1YP3T548XBFHJ1RA3EH8&pf_rd_t=101&pf_rd_p=402518447&pf_rd_i=1320006031') 
tree = html.fromstring(page.text) 
price = tree.xpath('//span[@id="actualPriceValue"]/text()') 

print(price)

Source

2014-03-26 user3438081

Use the following XPath:

price = tree.xpath("//*[@id='actualPriceValue']/b/span/text()")[0]

The following code checks out:

from lxml import html 
import requests 

page = requests.get('http://www.amazon.in/gp/product/B00EYCBFDQ/ref=s9_pop_gw_g147_i3?pf_rd_m=A1VBAL9TL5WCBF&pf_rd_s=center-3&pf_rd_r=1YP3T548XBFHJ1RA3EH8&pf_rd_t=101&pf_rd_p=402518447&pf_rd_i=1320006031') 
tree = html.fromstring(page.text) 
price = tree.xpath("//*[@id='actualPriceValue']/b/span/text()")[0] 

print(price)

Result:

4,094.00 
[Finished in 3.0s]
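One caveat about the [0] index: if the XPath matches nothing (for example because the served page differs from the one inspected in the browser), indexing the empty list raises IndexError. A minimal sketch of a guard is below. The helper name first_text and the two tiny test documents are hypothetical, made up only for illustration, and the markup is a simplified stand-in for the real page.

```python
from lxml import html

PRICE_XPATH = "//*[@id='actualPriceValue']/b/span/text()"


def first_text(tree, xpath):
    """Return the first non-blank text node matched by xpath, or None.

    Hypothetical helper: avoids the IndexError that [0] raises on an
    empty result list.
    """
    for node in tree.xpath(xpath):
        text = node.strip()
        if text:
            return text
    return None


# Simplified stand-in markup (assumption, not the real Amazon page).
with_price = html.document_fromstring(
    "<span id='actualPriceValue'><b><span>4,094.00</span></b></span>"
)
without_price = html.document_fromstring("<p>No price element on this page</p>")

print(first_text(with_price, PRICE_XPATH))     # 4,094.00
print(first_text(without_price, PRICE_XPATH))  # None
```

Checking for None afterwards makes it easier to tell "the XPath is wrong" apart from "the page did not contain the element".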

Let us know if this helps.

Source

2014-03-26 19:02:03 Manhattan

Yes, it works. Thank you very much. – user3438081

You're welcome, good luck. – Manhattan

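I think the problem is that the span with the ID actualPriceValue has no direct text of its own. You'll want to do something like this (I'm writing it off the top of my head, so you may have to adjust it):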

Edit: Fixed. The explanation below is still accurate.

//*[@id='actualPriceValue']/b/span/text()

You'll notice that the HTML looks like this:

<span id="actualPriceValue"> 
    <b class="priceLarge"> 
     <span style="text-decoration: inherit; white-space: nowrap;"> 
      <span class="currencyINR">&nbsp;&nbsp;</span> 
      <span class="currencyINRFallback" style="display:none">Rs. </span> 
      4,112.00 
     </span> 
    </b> 
</span>

So the path should be:

Span with an id of actualPriceValue -> first b element -> first span element -> text
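The path can be tried offline against the markup quoted above, without fetching the page. This is only a sketch: it assumes the whitespace is exactly as displayed here, and the live page may be laid out differently. Because the pretty-printed markup puts whitespace-only text nodes before the price, the text nodes are joined and stripped instead of taking [0].

```python
from lxml import html

# Markup quoted in this answer, with whitespace as displayed here.
# Assumption: the live page may be laid out differently.
SNIPPET = """
<span id="actualPriceValue">
    <b class="priceLarge">
     <span style="text-decoration: inherit; white-space: nowrap;">
      <span class="currencyINR">&nbsp;&nbsp;</span>
      <span class="currencyINRFallback" style="display:none">Rs. </span>
      4,112.00
     </span>
    </b>
</span>
"""

tree = html.document_fromstring(SNIPPET)

# Text nodes directly under the outer span: whitespace at most, no price.
direct = tree.xpath('//span[@id="actualPriceValue"]/text()')
direct_text = "".join(direct).strip()

# Text nodes directly under the inner span: this is where the price lives.
nested = tree.xpath("//*[@id='actualPriceValue']/b/span/text()")
price = "".join(nested).strip()

print(repr(direct_text))  # ''
print(price)              # 4,112.00

# Optional: turn the display string into a number.
amount = float(price.replace(",", ""))
print(amount)             # 4112.0
```

The currency text ("Rs. ") sits in its own child span, so it is not part of the inner span's direct text nodes and does not end up in price.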

Source

2014-03-26 18:55:23 Clete2

I'm still getting an empty list. – user3438081

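Yes, I was a bit off on the syntax (I was missing the * and didn't need the [0]). I see the answer above is similar, but without the syntax mistakes! :) – Clete2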
