
I want to export my data to a CSV file from the command line (output Python data to a CSV file):

scrapy crawl tunisaianet -o save.csv -t csv 

but nothing is happening. Any help?

Here is my code:

import scrapy 
import csv 
from tfaw.items import TfawItem 


class TunisianetSpider(scrapy.Spider): 
    name = "tunisianet" 
    allowed_domains = ["tunisianet.com.tn"] 
    start_urls = [ 
     'http://www.tunisianet.com.tn/466-consoles-jeux/', 
    ] 

    def parse(self, response):
        item = TfawItem()
        data = []
        out = open('out.csv', 'a')
        x = response.xpath('//*[contains(@class, "ajax_block_product")]')
        for i in range(0, len(x)):
            item['revendeur'] = response.xpath('//*[contains(@class, "center_block")]/h2/a/@href').re('tunisianet')[i]
            item['produit'] = response.xpath('//*[contains(@class, "center_block")]/h2/a/text()').extract()[i]
            item['url'] = response.xpath('//*[contains(@class, "center_block")]/h2/a/@href').extract()[i]
            item['description'] = response.xpath('//*[contains(@class, "product_desc")]/a/text()').extract()[i]
            item['prix'] = response.xpath('//*[contains(@class, "price")]/text()').extract()[i]
            data = item['revendeur'], item['produit'], item['url'], item['description'], item['prix']
            yield data
            out.write(str(data))
            out.write('\n')
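
For context, TfawItem (imported from tfaw.items) is not shown in the question; judging from the fields used in the spider, its definition would presumably look roughly like this:

import scrapy

class TfawItem(scrapy.Item):
    # field names inferred from the spider above (assumed, not shown in the question)
    revendeur = scrapy.Field()
    produit = scrapy.Field()
    url = scrapy.Field()
    description = scrapy.Field()
    prix = scrapy.Field()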

Why do you create an item and then turn it into a tuple? And if you are already exporting to CSV from the command line, why do you need 'out.csv'? – eLRuLL


I thought out.csv would overwrite the data each time I run the command, but never mind –


Doesn't [this](http://stackoverflow.com/questions/36902783/output-python-to-csv-regular/36903483#36903483) help? – eLRuLL

Answer


I assume you are getting this error:

ERROR: Spider must return Request, BaseItem, dict or None, got 'tuple' in <GET http://www.tunisianet.com.tn/466-consoles-jeux> 

which says exactly what is wrong: you are returning tuples instead of items. Change your yield code to:

... 
item['prix'] = response.xpath('//*[contains(@class, "price")]/text()').extract()[i] 
yield item
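
Putting it together, here is a minimal sketch of the corrected parse method, keeping the selectors from the question as-is (I have not re-verified them against the site). The manual out.csv writing is dropped, since -o save.csv -t csv on the command line already writes the CSV through Scrapy's feed exporter:

    def parse(self, response):
        products = response.xpath('//*[contains(@class, "ajax_block_product")]')
        for i in range(len(products)):
            # build a fresh item per product and yield the item itself, not a tuple
            item = TfawItem()
            item['revendeur'] = response.xpath('//*[contains(@class, "center_block")]/h2/a/@href').re('tunisianet')[i]
            item['produit'] = response.xpath('//*[contains(@class, "center_block")]/h2/a/text()').extract()[i]
            item['url'] = response.xpath('//*[contains(@class, "center_block")]/h2/a/@href').extract()[i]
            item['description'] = response.xpath('//*[contains(@class, "product_desc")]/a/text()').extract()[i]
            item['prix'] = response.xpath('//*[contains(@class, "price")]/text()').extract()[i]
            yield item

Note also that the spider name passed to scrapy crawl must match name = "tunisianet" in the spider class; the command shown in the question spells it tunisaianet.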