Removing child nodes from a selector

I created a project in Scrapy in which I scrape (obviously!) specific data from a web page.

items = sel.xpath('//div[@class="productTiles cf"]/ul').extract()
for item in items:
    price = sel.xpath('//ul/li[@class="productPrice"]/span/span[@class="salePrice"]').extract()
    print(price)

This produces the following result:

u'<span class="salePrice">$20.43\xa0<span class="reducedFrom">$40.95</span></span>',
u'<span class="salePrice">$20.93\xa0<span class="reducedFrom">$40.95</span></span>'

I need to get just the salePrice, e.g. 20.43 and 20.93, while ignoring the other tags and the rest of the data. Any help here would be greatly appreciated.

2014-03-12 Edward

It looks like the solution is as follows:

//ul/li[@class="productPrice"]/span/span[@class="salePrice"]//text()

It grabs just the text of the elements I am looking for, like this:

u'$20.43\xa0', u'$20.93\xa0'

Now I can parse it to strip the unwanted junk at the end, and I'm all set. If anyone has a more elegant solution, I would love to see it.
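The remaining cleanup step could look roughly like the sketch below. It assumes every price is a dollar amount followed by a non-breaking space, as in the output above; `parse_price` is a hypothetical helper name, not part of Scrapy.

```python
from decimal import Decimal


def parse_price(raw):
    # Hypothetical helper: assumes input shaped like "$20.43" plus a trailing
    # non-breaking space (U+00A0), as seen in the scraped output.
    cleaned = raw.replace(u"\xa0", " ").strip()    # drop the non-breaking space
    cleaned = cleaned.lstrip("$").replace(",", "")  # drop currency sign and thousands separators
    return Decimal(cleaned)


raw_prices = [u"$20.43\xa0", u"$20.93\xa0"]
prices = [parse_price(p) for p in raw_prices]
print(prices)
```

Decimal is used here instead of float so that prices keep their exact two-digit value; a float would work just as well if the numbers are only displayed.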

2014-03-12 06:14:13 Edward

span[@class="salePrice"] returns the span together with its children.

This should get just the text of the top-level span:

sel.xpath('//ul/li[@class="productPrice"]/span/span[@class="salePrice"]/text()').extract()[0]
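To see why a single slash matters, here is a small sketch that uses only the standard library's xml.etree.ElementTree rather than Scrapy, so it is an analogy and not the Scrapy API. It assumes the markup looks exactly like the sample in the question. `span.text` holds only the text directly inside the outer span (the one text node that /text() would select here), while `itertext()` walks all descendant text, which is roughly what //text() would select.

```python
import xml.etree.ElementTree as ET

# Sample markup copied from the question's output.
snippet = u'<span class="salePrice">$20.43\xa0<span class="reducedFrom">$40.95</span></span>'
span = ET.fromstring(snippet)

# Text that belongs to the outer span itself, before its first child element.
own_text = span.text

# Text of the outer span and of every descendant, in document order.
all_text = list(span.itertext())

print(repr(own_text))
print(all_text)
```

With this markup the descendant walk also picks up the reduced-from price from the nested span, which is the part the question wants to ignore.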

2014-03-12 06:15:04 warvariuc
