Escaping commas in Scrapy XPath results

I have this HTML:

<div class="sliderContent"> 
<p>some content, some other content</p> 
<p>some content, some other content</p> 
<p>some content, some other content</p> 
<p>some content, some other content</p> 
</div>

My XPath:

item['Description'] = sel.xpath('div[@class="content"]/div/div[@class="sliderContent"]//p').extract()

I want to escape the commas inside the <p> elements and extract all of the content, keeping the HTML. I tried this:

def parse_dir_contents(self, response):
    for sel in response.xpath('//div[@class="container"]'):
        item = LuItem()
        item['Description'] = sel.xpath('div[@class="content"]/div/div[@class="sliderContent"]//p').extract()[0].replace(',','\,')
        yield item

This works for the first <p>, obviously, but how can I do this for all of the <p> elements?

I'm just starting out with Python, so any help is much appreciated!

Source

2016-02-02 jacquesseite

Please add the URL of the website. I think you can try something like this:
>>> a = 'some content, some other content'
>>> a.replace(',', '/')
'some content/some other content'

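Your XPath result is a list, and by indexing it with [0] you modify only the first element. You need to go through the entire list of descriptions: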

def parse_dir_contents(self, response):
    for sel in response.xpath('//div[@class="container"]'):
        item = LuItem()
        # extract() returns a list with one string per matched <p>
        item['Description'] = sel.xpath('div[@class="content"]/div/div[@class="sliderContent"]//p').extract()
        # Process every element of the list; note that this removes the commas rather than escaping them
        item['Description'] = [''.join(field.split(',')) for field in item.get('Description', [])]
        yield item
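
If the goal is to escape the commas rather than remove them, the same list-comprehension idea can be combined with str.replace. The following is only a minimal sketch: the helper name escape_commas and the sample list are illustrative assumptions, not part of Scrapy, and the list stands in for what extract() would return for the <p> elements.

```python
def escape_commas(fragments, escape="\\,"):
    """Return a new list with every comma in each fragment replaced by `escape`."""
    return [fragment.replace(",", escape) for fragment in fragments]


# Hypothetical stand-in for the list returned by .extract() in the spider.
extracted = [
    "<p>some content, some other content</p>",
    "<p>some content, some other content</p>",
]

escaped = escape_commas(extracted)
print(escaped[0])  # <p>some content\, some other content</p>
```

Inside the spider this would presumably be used as item['Description'] = escape_commas(sel.xpath(...).extract()). The escape string is written as "\\," because '\,' is not a valid escape sequence and recent Python versions warn about it.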

Source

2016-02-02 13:25:39 sergiuz
