2013-01-20

This is the error message from my Scrapy crawler:

2013-01-20 22:45:02+0700 [scrapy] INFO: Scrapy 0.16.3 started (bot: scrapybot) 

2013-01-20 22:45:02+0700 [scrapy] DEBUG: Enabled extensions: LogStats, TelnetConsole, CloseSpider, WebService, CoreStats, SpiderState 

2013-01-20 22:45:02+0700 [scrapy] DEBUG: Enabled downloader middlewares: HttpAuthMiddleware, DownloadTimeoutMiddleware, UserAgentMiddleware, RetryMiddleware, DefaultHeadersMiddleware, RedirectMiddleware, CookiesMiddleware, HttpCompressionMiddleware, ChunkedTransferMiddleware, DownloaderStats 

2013-01-20 22:45:02+0700 [scrapy] DEBUG: Enabled spider middlewares: HttpErrorMiddleware, OffsiteMiddleware, RefererMiddleware, UrlLengthMiddleware, DepthMiddleware 

2013-01-20 22:45:02+0700 [scrapy] DEBUG: Enabled item pipelines: 

2013-01-20 22:45:02+0700 [test] INFO: Spider opened 

2013-01-20 22:45:02+0700 [test] INFO: Crawled 0 pages (at 0 pages/min), scraped 0 items (at 0 items/min) 

2013-01-20 22:45:02+0700 [scrapy] DEBUG: Telnet console listening on 0.0.0.0:6023 

2013-01-20 22:45:02+0700 [scrapy] DEBUG: Web service listening on 0.0.0.0:6080 

2013-01-20 22:45:07+0700 [test] DEBUG: Crawled (200) <GET https://api.instagram.com/v1/tags/finnishgirl/media/recent?client_id=b59fbe4563944b6c88cced13495c0f49&callback=jQuery15208793520946055651_1358691536717&_=1358691537498> (referer: None) 

2013-01-20 22:45:07+0700 [scrapy] INFO: 18 

2013-01-20 22:45:07+0700 [scrapy] INFO: 18 

2013-01-20 22:45:21+0700 [test] DEBUG: Crawled (200) <GET https://api.instagram.com/v1/tags/finnishgirl/media/recent?callback=jQuery15208793520946055651_1358691536717&_=1358691537498&client_id=b59fbe4563944b6c88cced13495c0f49&max_tag_id=1358724742769> (referer: https://api.instagram.com/v1/tags/finnishgirl/media/recent?client_id=b59fbe4563944b6c88cced13495c0f49&callback=jQuery15208793520946055651_1358691536717&_=1358691537498) 

2013-01-20 22:45:21+0700 [-] ERROR: 2013-01-20 22:45:21+0700 [-] ERROR: 2013-01-20 22:45:21+0700 [-] ERROR: 2013-01-20 22:45:21+0700 [-] ERROR: 2013-01-20 22:45:21+0700 [-] ERROR: 2013-01-20 22:45:21+0700 [-] ERROR: 2013-01-20 22:45:21+0700 [-] ERROR: 2013-01-20 22:45:21+0700 [-] ERROR: 2013-01-20 22:45:21+0700 [-] ERROR: 2013-01-20 22:45:21+0700 [-] ERROR: 2013-01-20 22:45:21+0700 [-] ERROR: 2013-01-20 22:45:21+0700 [-] ERROR: 2013-01-20 22:45:21+0700 [-] ERROR: 2013-01-20 22:45:21+0700 [-] ERROR: 2013-01-20 22:45:21+0700 [-] ERROR: 2013-01-20 22:45:21+0700 [-] ERROR: 2013-01-20 22:45:21+0700 [-] ERROR: 2013-01-20 22:45:21+0700 [-] ERROR: 2013-01-20 22:45:21+0700 [-] ERROR: 2013-01-20 22:45:21+0700 [-] ERROR: 2013-01-20 22:45:21+0700 [-] ERROR: 2013-01-20 22:45:21+0700 [-] ERROR: 2013-01-20 22:45:21+0700 [-] ERROR: 2013-01-20 22:45:21+0700 [-] ERROR: 2013-01-20 22:45:21+0700 [-] ERROR: 2013-01-20 22:45:21+0700 [-] ERROR: 2013-01-20 22:45:21+0700 [-] ERROR: 2013-01-20 22:45:21+0700 [-] ERROR: 2013-01-20 22:45:21+0700 [-] ERROR: 2013-01-20 22:45:21+0700 [-] ERROR: 2013-01-20 22:45:21+0700 [-] ERROR: 2013-01-20 22:45:21+0700 [-] ERROR: 2013-01-20 22:45:21+0700 [-] ERROR: 2013-01-20 22:45:21+0700 [-] ERROR: 2013-01-20 22:45:21+0700 [-] ERROR: 2013-01-20 22:45:21+0700 [-] ERROR: 2013-01-20 22:45:21+0700 [-] ERROR: 2013-01-20 22:45:21+0700 [-] ERROR: 2013-01-20 22:45:21+0700 [-] ERROR: 2013-01-20 22:45:21+0700 [-] ERROR: 2013-01-20 22:45:21+0700 [-] ERROR: 2013-01-20 22:45:21+0700 [-] ERROR: 2013-01-20 22:45:21+0700 [-] ERROR: 2013-01-20 22:45:21+0700 [-] ERROR: 2013-01-20 22:45:21+0700 [-] ERROR: 2013-01-20 22:45:21+0700 [-] ERROR: 2013-01-20 22:45:21+0700 [-] ERROR: 2013-01-20 22:45:21+0700 [-] ERROR: 2013-01-20 22:45:21+0700 [-] ERROR: 2013-01-20 22:45:21+0700 [-] ERROR: 2013-01-20 22:45:21+0700 [-] ERROR: 2013-01-20 22:45:21+0700 [-] ERROR: 2013-01-20 22:45:21+0700 [-] ERROR: 2013-01-20 22:45:21+0700 [-] ERROR: 2013-01-20 22:45:21+0700 [-] ERROR: 2013-01-20 
22:45:21+0700 [-] ERROR: 2013-01-20 22:45:21+0700 [-] ERROR: 2013-01-20 22:45:21+0700 [-] ERROR: 2013-01-20 22:45:21+0700 [-] ERROR: 2013-01-20 22:45:21+0700 [-] ERROR: 2013-01-20 22:45:21+0700 [-] ERROR: 2013-01-20 22:45:21+0700 [-] ERROR: 2013-01-20 22:45:21+0700 [-] ERROR: 2013-01-20 22:45:21+0700 [-] ERROR: 2013-01-20 22:45:21+0700 [-] ERROR: 2013-01-20 22:45:21+0700 [-] ERROR: 2013-01-20 22:45:21+0700 [-] ERROR: 2013-01-20 22:45:21+0700 [-] ERROR: 2013-01-20 22:45:21+0700 [-] ERROR: 2013-01-20 22:45:21+0700 [-] ERROR: 2013-01-20 22:45:21+0700 [-] ERROR: 2013-01-20 22:45:21+0700 [-] ERROR: 2013-01-20 22:45:21+0700 [-] ERROR: 2013-01-20 22:45:21+0700 [-] ERROR: 2013-01-20 22:45:21+0700 [-] ERROR: 2013-01-20 22:45:21+0700 [-] ERROR: 2013-01-20 22:45:21+0700 [-] ERROR: 2013-01-20 22:45:21+0700 [-] ERROR: 2013-01-20 22:45:21+0700 [-] ERROR: 2013-01-20 22:45:21+0700 [-] ERROR: 2013-01-20 22:45:21+0700 [-] ERROR: 2013-01-20 22:45:21+0700 [-] ERROR: 2013-01-20 22:45:21+0700 [-] ERROR: 2013-01-20 22:45:21+0700 [-] ERROR: 2013-01-20 22:45:21+0700 [-] ERROR: 2013-01-20 22:45:21+0700 [-] ERROR: 2013-01-20 22:45:21+0700 [-] ERROR: 2013-01-20 22:45:21+0700 [-] ERROR: 2013-01-20 22:45:21+0700 [-] ERROR: 2013-01-20 22:45:21+0700 [-] ERROR: 2013-01-20 22:45:21+0700 [-] ERROR: 2013-01-20 22:45:21+0700 [-] ERROR: 2013-01-20 22:45:21+0700 [-] ERROR: 2013-01-20 22:45:21+0700 [-] ERROR: 2013-01-20 22:45:21+0700 [-] ERROR: 2013-01-20 22:45:21+0700 [-] ERROR: 2013-01-20 22:45:21+0700 [-] ERROR: 2013-01-20 22:45:21+0700 [scrapy] INFO: 18 
2013-01-20 22:45:21+0700 [-] ERROR: 2013-01-20 22:45:21+0700 [-] ERROR: 2013-01-20 22:45:21+0700 [-] ERROR: 2013-01-20 22:45:21+0700 [-] ERROR: 2013-01-20 22:45:21+0700 [-] ERROR: 2013-01-20 22:45:21+0700 [-] ERROR: 2013-01-20 22:45:21+0700 [-] ERROR: 2013-01-20 22:45:21+0700 [-] ERROR: 2013-01-20 22:45:21+0700 [-] ERROR: 2013-01-20 22:45:21+0700 [-] ERROR: 2013-01-20 22:45:21+0700 [-] ERROR: 2013-01-20 22:45:21+0700 [-] ERROR: 2013-01-20 22:45:21+0700 [-] ERROR: 2013-01-20 22:45:21+0700 [-] ERROR: 2013-01-20 22:45:21+0700 [-] ERROR: 2013-01-20 22:45:21+0700 [-] ERROR: 2013-01-20 22:45:21+0700 [-] ERROR: 2013-01-20 22:45:21+0700 [-] ERROR: 2013-01-20 22:45:21+0700 [-] ERROR: 2013-01-20 22:45:21+0700 [-] ERROR: 2013-01-20 22:45:21+0700 [-] ERROR: 2013-01-20 22:45:21+0700 [-] ERROR: 2013-01-20 22:45:21+0700 [-] ERROR: 2013-01-20 22:45:21+0700 [-] ERROR: 2013-01-20 22:45:21+0700 [-] ERROR: 2013-01-20 22:45:21+0700 [-] ERROR: 2013-01-20 22:45:21+0700 [-] ERROR: 2013-01-20 22:45:21+0700 [-] ERROR: 2013-01-20 22:45:21+0700 [-] ERROR: 2013-01-20 22:45:21+0700 [-] ERROR: 2013-01-20 22:45:21+0700 [-] ERROR: 2013-01-20 22:45:21+0700 [-] ERROR: 2013-01-20 22:45:21+0700 [-] ERROR: 2013-01-20 22:45:21+0700 [-] ERROR: 2013-01-20 22:45:21+0700 [-] ERROR: 2013-01-20 22:45:21+0700 [-] ERROR: 2013-01-20 22:45:21+0700 [-] ERROR: 2013-01-20 22:45:21+0700 [-] ERROR: 2013-01-20 22:45:21+0700 [-] ERROR: 2013-01-20 22:45:21+0700 [-] ERROR: 2013-01-20 22:45:21+0700 [-] ERROR: 2013-01-20 22:45:21+0700 [-] ERROR: 2013-01-20 22:45:21+0700 [-] ERROR: 2013-01-20 22:45:21+0700 [-] ERROR: 2013-01-20 22:45:21+0700 [-] ERROR: 2013-01-20 22:45:21+0700 [-] ERROR: 2013-01-20 22:45:21+0700 [-] ERROR: 2013-01-20 22:45:21+0700 [-] ERROR: 2013-01-20 22:45:21+0700 [-] ERROR: 2013-01-20 22:45:21+0700 [-] ERROR: 2013-01-20 22:45:21+0700 [-] ERROR: 2013-01-20 22:45:21+0700 [-] ERROR: 2013-01-20 22:45:21+0700 [-] ERROR: 2013-01-20 22:45:21+0700 [-] ERROR: 2013-01-20 22:45:21+0700 [-] ERROR: 2013-01-20 
22:45:21+0700 [-] ERROR: 2013-01-20 22:45:21+0700 [-] ERROR: 2013-01-20 22:45:21+0700 [-] ERROR: 2013-01-20 22:45:21+0700 [-] ERROR: 2013-01-20 22:45:21+0700 [-] ERROR: 2013-01-20 22:45:21+0700 [-] ERROR: 2013-01-20 22:45:21+0700 [-] ERROR: 2013-01-20 22:45:21+0700 [-] ERROR: 2013-01-20 22:45:21+0700 [-] ERROR: 2013-01-20 22:45:21+0700 [-] ERROR: 2013-01-20 22:45:21+0700 [-] ERROR: 2013-01-20 22:45:21+0700 [-] ERROR: 2013-01-20 22:45:21+0700 [-] ERROR: 2013-01-20 22:45:21+0700 [-] ERROR: 2013-01-20 22:45:21+0700 [-] ERROR: 2013-01-20 22:45:21+0700 [-] ERROR: 2013-01-20 22:45:21+0700 [-] ERROR: 2013-01-20 22:45:21+0700 [-] ERROR: 2013-01-20 22:45:21+0700 [-] ERROR: 2013-01-20 22:45:21+0700 [-] ERROR: 2013-01-20 22:45:21+0700 [-] ERROR: 2013-01-20 22:45:21+0700 [-] ERROR: 2013-01-20 22:45:21+0700 [-] ERROR: 2013-01-20 22:45:21+0700 [-] ERROR: 2013-01-20 22:45:21+0700 [-] ERROR: 2013-01-20 22:45:21+0700 [-] ERROR: 2013-01-20 22:45:21+0700 [-] ERROR: 2013-01-20 22:45:21+0700 [-] ERROR: 2013-01-20 22:45:21+0700 [-] ERROR: 2013-01-20 22:45:21+0700 [-] ERROR: 2013-01-20 22:45:21+0700 [-] ERROR: 2013-01-20 22:45:21+0700 [-] ERROR: 2013-01-20 22:45:21+0700 [-] ERROR: 2013-01-20 22:45:21+0700 [-] ERROR: 2013-01-20 22:45:21+0700 [-] ERROR: 2013-01-20 22:45:21+0700 [-] ERROR: 2013-01-20 22:45:21+0700 [-] ERROR: 2013-01-20 22:45:21+0700 [-] ERROR: 2013-01-20 22:45:21+0700 [-] ERROR: 2013-01-20 22:45:21+0700 [-] ERROR: 2013-01-20 22:45:21+0700 [-] E...... 

I can't find any information about this kind of error. Here is my code:

from scrapy.contrib.spiders import CrawlSpider
from scrapy.http import Request
from scrapy import log
import json
import re

class Spider(CrawlSpider):
    name = "test"
    count = 0

    def start_requests(self):
        return [Request('https://api.instagram.com/v1/tags/finnishgirl/media/recent?client_id=b59fbe4563944b6c88cced13495c0f49&callback=jQuery15208793520946055651_1358691536717&_=1358691537498', callback=self.parse_basic)]

    def parse_basic(self, response):
        # Stop after two pages.
        if self.count == 2:
            return
        self.count = self.count + 1
        log.start()  # runs on every callback
        body = response.body
        # Strip the JSONP wrapper "jQuery...(" and the trailing ")".
        body = re.sub(r'jQuery[0-9_]+\(', '', body)
        body = body[:len(body) - 1]
        body = json.loads(body)
        next_url = body['pagination']['next_url']
        count = len(body['data'])
        log.msg(str(count), level=log.INFO)
        f = open('test.' + str(self.count), 'w')
        f.write(next_url)
        f.close()
        # Follow the next page with the same callback.
        return [Request(next_url, callback=self.parse_basic)]

Answer


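I found the cause of my error: log.start() is called inside the parse_basic method. The Request returned by the return statement is sent back to the same parse method, so log.start() is started again, and that is what causes the error.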
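
The idea can be sketched without Scrapy. The example below uses the standard library logging module as a stand-in for scrapy.log, so it is an illustration of the principle and not Scrapy's actual behaviour. The names start_logging, parse_basic, and spider_sketch are hypothetical. It assumes that starting logging attaches one more handler on each call, which is why the setup is done once, outside the callback.

```python
import io
import logging

# Stand-in logger; "spider_sketch" is a hypothetical name, not a Scrapy logger.
logger = logging.getLogger("spider_sketch")
logger.setLevel(logging.INFO)
logger.propagate = False
stream = io.StringIO()  # captures output so it can be inspected


def start_logging():
    """Stand-in for log.start(): attaches one more handler on every call."""
    logger.addHandler(logging.StreamHandler(stream))


def parse_basic(item_count):
    """Stand-in for the spider callback; it no longer starts logging itself."""
    logger.info(str(item_count))


start_logging()        # done once, before any callback runs
for _ in range(3):     # the callback runs once per response
    parse_basic(18)

lines = stream.getvalue().splitlines()
print(lines)
```

In the spider from the question, the equivalent change would be to take log.start() out of parse_basic. The log lines that appear before the first response is crawled suggest logging is already running when the spider is started from the command line, so the call may not be needed at all.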