2017-07-14

Scrapy spider shows an error while crawling

I want to scrape coupons from a coupon website, but when I try to run the spider it shows an error. Please help. Thank you.

    import scrapy
    from scrapy.http import Request
    from scrapy.selector import HtmlXPathSelector
    from scrapy.spider import BaseSpider

    class CuponationSpider(scrapy.spider):
        name = "cupo"
        allowed_domains = ["cuponation.in"]
        start_urls = ["https://www.cuponation.in/firstcry-coupon#voucher"]

        def parse(self, response):
            all_items = []
            divs_action = response.xpath('//div[@class="action"]')
            for div_action in divs_action:
                item = VoucherItem()
                span0 = div_action.xpath('./span[@data-voucher-id]')[0]
                item['voucher_id'] = span0.xpath('./@data-voucher-id').extract()[0]
                item['code'] = span0.xpath('./span[@class="code-field"]/text()').extract()[0]
                all_items.append(item)





**Output** ERROR

    File "/usr/lib/python2.7/urllib2.py", line 1198, in do_open
        raise URLError(err)
    URLError: <urlopen error timed out>
    2017-07-25 16:36:59 [boto] ERROR: Unable to read instance data, giving up
Comment: The answer to your question is in the warning. Don't use scrapy.selector.HtmlXPathSelector; use scrapy.Selector. – Neil

Comment: @Neil I tried that as well; it still does not solve the problem. – abhi09sep

Comment: Then what is the warning now? And what is the error? – Neil

Answer

Comment: ... tell me where I am making the error

  1. Delete all of the import lines and use only this one:

         import scrapy

  2. The inheritance should be:

         class CuponationSpider(scrapy.Spider):

  3. You have changed name and start_urls; use:

         name = "cuponation"
         allowed_domains = ['cuponation.in']
         start_urls = ['https://www.cuponation.in/firstcry-coupon']

  4. You are using Python 2.7. Sorry, I cannot run Scrapy on Python 2.7 here; that could account for the difference.
     The error "Unable to read instance data, giving up" tells you that you did not receive any data from the given URL. Perhaps you have been blacklisted.
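If the timeout comes from throttling or blacklisting, a few Scrapy settings are worth adjusting. A sketch of what could go into settings.py; the values here are illustrative, not a guaranteed fix:

```python
# settings.py -- illustrative values, not a guaranteed fix
USER_AGENT = "Mozilla/5.0 (X11; Linux x86_64)"  # the default Scrapy UA is easy to block
DOWNLOAD_TIMEOUT = 30   # seconds to wait before giving up on a request
RETRY_TIMES = 3         # retry failed requests a few times
DOWNLOAD_DELAY = 1.0    # throttle politely; lowers the chance of being blocked
```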

Comment: URL is cuponation.in/firstcry-coupon#voucher

That is the same page; the #voucher fragment does not reload it.
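You can check with the standard library that the fragment is purely client-side; urldefrag splits it off, and the remaining URL is what actually gets requested:

```python
from urllib.parse import urldefrag

# The fragment (#voucher) is never sent to the server
url, fragment = urldefrag("https://www.cuponation.in/firstcry-coupon#voucher")
print(url)       # https://www.cuponation.in/firstcry-coupon
print(fragment)  # voucher
```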
Everything can be simplified to the following:

    all_items = []

    def parse(self, response):
        # Get all DIVs with class="action"
        divs_action = response.xpath('//div[@class="action"]')

        for div_action in divs_action:
            item = VoucherItem()

            # Get the SPAN inside the DIV that has the attribute data-voucher-id
            span0 = div_action.xpath('./span[@data-voucher-id]')[0]

            # Copy the attribute data-voucher-id
            item['voucher_id'] = span0.xpath('./@data-voucher-id').extract()[0]

            # Find SPAN class="code-field" inside span0 and copy its text
            item['code'] = span0.xpath('./span[@class="code-field"]/text()').extract()[0]

            all_items.append(item)

Output:

#CouponSpider.start_requests:https://www.cuponation.in/firstcry-coupon 
#CouponSpider.parse() 
#CouponSpider.divs_action:List[13] of <Element div at 0xf6b1c20c> 
{'voucher_id': '868600', 'code': '*******'} 
{'voucher_id': '31793', 'code': '*******'} 
{'voucher_id': '832408', 'code': '*******'} 
{'voucher_id': '819903', 'code': '*******'} 
{'voucher_id': '808774', 'code': '*******'} 
{'voucher_id': '32274', 'code': '*******'} 
{'voucher_id': '32102', 'code': '*******'} 
{'voucher_id': '844247', 'code': '*******'} 
{'voucher_id': '843513', 'code': '*******'} 
{'voucher_id': '848151', 'code': '*******'} 
{'voucher_id': '845248', 'code': '*******'} 
{'voucher_id': '869101', 'code': '*******'} 
{'voucher_id': '869328', 'code': '*******'}    
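The extraction step can also be replayed without Scrapy at all, which helps separate parsing problems from network problems. A minimal sketch using the standard-library html.parser; the HTML snippet and the VoucherParser class are invented for illustration:

```python
from html.parser import HTMLParser

class VoucherParser(HTMLParser):
    """Collects data-voucher-id attributes and the text of span.code-field."""
    def __init__(self):
        super().__init__()
        self.items = []
        self._in_code = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "span" and "data-voucher-id" in attrs:
            # Mirrors: div_action.xpath('./span[@data-voucher-id]')
            self.items.append({"voucher_id": attrs["data-voucher-id"], "code": ""})
        if tag == "span" and attrs.get("class") == "code-field":
            self._in_code = True

    def handle_endtag(self, tag):
        if tag == "span" and self._in_code:
            self._in_code = False

    def handle_data(self, data):
        # Mirrors: span0.xpath('./span[@class="code-field"]/text()')
        if self._in_code and self.items:
            self.items[-1]["code"] += data.strip()

html = '''
<div class="action">
  <span data-voucher-id="868600"><span class="code-field">SAVE20</span></span>
</div>
'''
p = VoucherParser()
p.feed(html)
print(p.items)  # [{'voucher_id': '868600', 'code': 'SAVE20'}]
```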
Comment: Still not working. – abhi09sep

Comment: @stovfl I uploaded all of my code, but I am still facing the problem. – abhi09sep

Comment: I have made all the changes, but I still cannot get the result. – abhi09sep