2014-01-07 53 views

Scrapy KeyError: I'm having trouble creating the spider from the Scrapy tutorial:

http://doc.scrapy.org/en/latest/intro/tutorial.html#our-first-spider

Here is what I have in my spiders/dmoz_spider.py file:

class DmozSpider(object):
    name = "dmoz"
    allowed_domains = ["dmoz.org"]
    start_urls = [
        "http://www.dmoz.org/Computers/Programming/Languages/Python/Books/",
        "http://www.dmoz.org/Computers/Programming/Languages/Python/Resources/"
    ]

    @classmethod
    def from_crawler(cls, crawler):
        spider = crawler.spiders
        return cls(spider)

    def parse(self, response):
        filename = response.url.split("/")[-2]
        open(filename, 'wb').write(response.body)

The good news is that I'm fairly sure the spider is being created. The bad news is that I get this error:

(scrapestat)unknownc8e0eb148153:tutorial christopherspears$ scrapy crawl dmoz 
Traceback (most recent call last): 
    File "/Users/christopherspears/.virtualenvs/scrapestat/bin/scrapy", line 4, in <module> 
    execute() 
    File "/Users/christopherspears/.virtualenvs/scrapestat/lib/python2.7/site-packages/scrapy/cmdline.py", line 143, in execute 
    _run_print_help(parser, _run_command, cmd, args, opts) 
    File "/Users/christopherspears/.virtualenvs/scrapestat/lib/python2.7/site-packages/scrapy/cmdline.py", line 89, in _run_print_help 
    func(*a, **kw) 
    File "/Users/christopherspears/.virtualenvs/scrapestat/lib/python2.7/site-packages/scrapy/cmdline.py", line 150, in _run_command 
    cmd.run(args, opts) 
    File "/Users/christopherspears/.virtualenvs/scrapestat/lib/python2.7/site-packages/scrapy/commands/crawl.py", line 48, in run 
    spider = crawler.spiders.create(spname, **opts.spargs) 
    File "/Users/christopherspears/.virtualenvs/scrapestat/lib/python2.7/site-packages/scrapy/spidermanager.py", line 44, in create 
    raise KeyError("Spider not found: %s" % spider_name) 
KeyError: 'Spider not found: dmoz' 

I'm not sure what the problem is. Any hints?

Answers


DmozSpider should inherit from BaseSpider (or Spider, depending on your Scrapy version). So make the following change to your code:

from scrapy.spider import BaseSpider 

class DmozSpider(BaseSpider): 
    ... 

I was able to reproduce this myself: the KeyError is raised when the spider class inherits from object.
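To illustrate why the base class matters, here is a minimal sketch of name-based spider lookup. It is a simplified model and not Scrapy's actual code: the `Spider` class below is a stand-in, and `collect_spiders`, `create`, `PlainDmozSpider`, and `FixedDmozSpider` are hypothetical names made up for this example. The assumption, as far as I can tell from the behavior, is that only classes that subclass the spider base class and define a `name` get registered, so a class inheriting from `object` is never found.

```python
import inspect


class Spider(object):
    """Stand-in for Scrapy's spider base class (hypothetical, for illustration)."""
    name = None


class PlainDmozSpider(object):
    # Inherits from object, like the spider in the question.
    name = "dmoz"


class FixedDmozSpider(Spider):
    # Inherits from the base class, as suggested above.
    name = "dmoz"


def collect_spiders(candidates):
    """Map spider names to classes, keeping only named Spider subclasses."""
    registry = {}
    for obj in candidates:
        if (inspect.isclass(obj)
                and issubclass(obj, Spider)
                and getattr(obj, "name", None)):
            registry[obj.name] = obj
    return registry


def create(registry, spider_name):
    """Look a spider up by name, failing the same way the traceback does."""
    try:
        spider_cls = registry[spider_name]
    except KeyError:
        raise KeyError("Spider not found: %s" % spider_name)
    return spider_cls()
```

In this model, `collect_spiders([PlainDmozSpider])` returns an empty registry, so `create(..., "dmoz")` raises the same "Spider not found" KeyError, while `collect_spiders([FixedDmozSpider])` registers the spider under "dmoz".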


Awesome! Thanks! Out of curiosity, do I still need the from_crawler method?