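You can't access the spider instance when the pipeline is constructed, because pipelines are initialized at engine startup, before any spider is opened. In fact, you should write your pipeline as if it may handle multiple spiders, not just a single one.
That said, you can hook the spider_opened signal to get access to the spider instance once it has started.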

from scrapy import signals


class MyPipeline(object):

    def __init__(self, mysetting):
        # Do stuff with the arguments...
        self.mysetting = mysetting

    @classmethod
    def from_crawler(cls, crawler):
        # Scrapy calls this to build the pipeline; read settings here.
        settings = crawler.settings
        instance = cls(settings['CUSTOM_SETTINGS_VARIABLE'])
        # Call spider_opened once the spider instance exists.
        crawler.signals.connect(instance.spider_opened, signal=signals.spider_opened)
        return instance

    def spider_opened(self, spider):
        # Do stuff with the spider: initialize resources, etc.
        spider.logger.info("[MyPipeline] Initializing resources for %s", spider.name)

    def process_item(self, item, spider):
        return item

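
To see this wiring without running a crawl, here is a minimal sketch that uses stand-in classes in place of Scrapy's crawler and signal manager. FakeSignalManager, FakeCrawler, FakeSpider, PerSpiderPipeline and SPIDER_OPENED are hypothetical names invented for this illustration, not Scrapy APIs, and a plain dict stands in for the settings object. Only the from_crawler / spider_opened shape mirrors the pipeline in this answer. It also keeps resources per spider, which is one possible way to follow the advice that a pipeline may serve more than one spider.

```python
# Stand-ins for Scrapy objects; every name in this block is hypothetical.
SPIDER_OPENED = object()  # plays the role of scrapy.signals.spider_opened


class FakeSignalManager:
    """Tiny stand-in for crawler.signals: connect receivers, then send."""

    def __init__(self):
        self._receivers = {}

    def connect(self, receiver, signal):
        self._receivers.setdefault(signal, []).append(receiver)

    def send(self, signal, **kwargs):
        for receiver in self._receivers.get(signal, []):
            receiver(**kwargs)


class FakeCrawler:
    def __init__(self, settings):
        self.settings = settings          # a plain dict stands in for Settings
        self.signals = FakeSignalManager()


class FakeSpider:
    def __init__(self, name):
        self.name = name


class PerSpiderPipeline:
    def __init__(self, mysetting):
        self.mysetting = mysetting
        self.resources = {}               # one entry per opened spider

    @classmethod
    def from_crawler(cls, crawler):
        instance = cls(crawler.settings.get('CUSTOM_SETTINGS_VARIABLE'))
        crawler.signals.connect(instance.spider_opened, signal=SPIDER_OPENED)
        return instance

    def spider_opened(self, spider):
        # The spider instance is available here, not in __init__.
        self.resources[spider.name] = {'setting': self.mysetting, 'items': 0}

    def process_item(self, item, spider):
        self.resources[spider.name]['items'] += 1
        return item


crawler = FakeCrawler({'CUSTOM_SETTINGS_VARIABLE': 'some-value'})
pipeline = PerSpiderPipeline.from_crawler(crawler)

# Simulate two spiders being opened and each yielding one item.
for name in ('spider_a', 'spider_b'):
    spider = FakeSpider(name)
    crawler.signals.send(SPIDER_OPENED, spider=spider)
    pipeline.process_item({'id': 1}, spider)
```

Keying the resources by spider.name is a design choice made for this sketch; it assumes spider names are unique within the process.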
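Please update your answer: I also need to access CUSTOM_SETTINGS_VARIABLE in MyPipeline. –
@hellomyfriends Is that a setting from the settings module? You can access the settings through 'crawler.settings'. – Rolando
Yes. Thank you very much. –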