Passing arguments to a callback function

def parse(self, response):
    for sel in response.xpath('//tbody/tr'):
        item = HeroItem()
        item['hclass'] = response.request.url.split("/")[8].split('-')[-1]
        item['server'] = response.request.url.split('/')[2].split('.')[0]
        item['hardcore'] = len(response.request.url.split("/")[8].split('-')) == 3
        item['seasonal'] = response.request.url.split("/")[6] == 'season'
        item['rank'] = sel.xpath('td[@class="cell-Rank"]/text()').extract()[0].strip()
        item['battle_tag'] = sel.xpath('td[@class="cell-BattleTag"]//a/text()').extract()[1].strip()
        item['grift'] = sel.xpath('td[@class="cell-RiftLevel"]/text()').extract()[0].strip()
        item['time'] = sel.xpath('td[@class="cell-RiftTime"]/text()').extract()[0].strip()
        item['date'] = sel.xpath('td[@class="cell-RiftTime"]/text()').extract()[0].strip()
        url = 'https://' + item['server'] + '.battle.net/' + sel.xpath('td[@class="cell-BattleTag"]//a/@href').extract()[0].strip()

        yield Request(url, callback=self.parse_profile)

def parse_profile(self, response):
    sel = Selector(response)
    item = HeroItem()
    item['weapon'] = sel.xpath('//li[@class="slot-mainHand"]/a[@class="slot-link"]/@href').extract()[0].split('/')[4]
    return item

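I scrape the whole table in the main parse method and take several fields from it. One of those fields is a URL, and I want to follow it to get a whole new set of fields. How can I pass the ITEM object I have already created to the callback function, so that the final item keeps all the fields?

As the code above shows, I can save either the fields found behind the URL (the current code) or only the fields from the table (by simply writing yield item), but I cannot yield a single object that holds all the fields together.

I tried this, but obviously it doesn't work.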

yield Request(url, callback=self.parse_profile(item)) 

def parse_profile(self, response, item): 
    sel = Selector(response) 
    item['weapon'] = sel.xpath('//li[@class="slot-mainHand"]/a[@class="slot-link"]/@href').extract()[0].split('/')[4] 
    return item
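
A note on why the attempt above fails: the parentheses in self.parse_profile(item) call the method immediately, while the Request is still being built, instead of handing over something to call later. A minimal plain-Python sketch of that behaviour, assuming a hypothetical Spider class and plain dicts standing in for the Scrapy response and HeroItem:

```python
class Spider:
    """Hypothetical stand-in for the spider; not a Scrapy class."""

    def parse_profile(self, response, item):
        # `response` is a plain dict here, standing in for a Scrapy response.
        item["weapon"] = response["weapon"]
        return item


spider = Spider()
item = {"rank": "1"}

try:
    # The call runs right now: `item` lands in the `response` slot and the
    # `item` parameter is left unfilled, so Python raises a TypeError.
    callback = spider.parse_profile(item)
except TypeError as exc:
    error_message = str(exc)
else:
    error_message = None

# Passing the bound method itself (no parentheses) defers the call.
callback = spider.parse_profile
result = callback({"weapon": "sword"}, item)
```

So the callback has to be passed as a callable, and the item has to travel to it some other way.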

Source

2015-08-27 vic

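Try looking at decorators, e.g. http://thecodeship.com/patterns/guide-to-python-function-decorators/ – Sumido

So the url returns fields that don't exist in 'item', and you want to add those fields to 'item' and return it? –

Did you manage to make this work? – briankip

This is what you'd use the meta keyword for.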

def parse(self, response):
    for sel in response.xpath('//tbody/tr'):
        item = HeroItem()
        # Item assignment here
        url = 'https://' + item['server'] + '.battle.net/' + sel.xpath('td[@class="cell-BattleTag"]//a/@href').extract()[0].strip()

        yield Request(url, callback=self.parse_profile, meta={'hero_item': item})

def parse_profile(self, response):
    item = response.meta.get('hero_item')
    item['weapon'] = response.xpath('//li[@class="slot-mainHand"]/a[@class="slot-link"]/@href').extract()[0].split('/')[4]
    yield item

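Also note that doing sel = Selector(response) is a waste of resources and differs from what you did earlier in parse, so I changed it. The selector is automatically available on the response as response.selector, which also has the convenient shortcut response.xpath.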
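
To see why carrying the item on the request keeps each row's data together, here is a minimal plain-Python sketch of the flow. FakeRequest, FakeResponse, crawl, the example.invalid URLs and the sample rows are all hypothetical stand-ins made up for illustration; they are not Scrapy classes, and a real crawl is scheduled by Scrapy itself.

```python
class FakeRequest:
    """Hypothetical stand-in for a request: a URL, a callback and a meta dict."""

    def __init__(self, url, callback, meta=None):
        self.url = url
        self.callback = callback
        self.meta = dict(meta or {})


class FakeResponse:
    """Hypothetical stand-in for a response that remembers its request."""

    def __init__(self, request, body):
        self.request = request
        self.body = body

    @property
    def meta(self):
        # Same idea as the shortcut used in the answer: the response
        # exposes the meta dict of the request that produced it.
        return self.request.meta


class HeroSpider:
    def parse(self, rows):
        for rank, tag in rows:
            item = {"rank": rank, "battle_tag": tag}
            url = "https://example.invalid/profile/" + tag
            # Each request carries its own item, so nothing is shared by name.
            yield FakeRequest(url, callback=self.parse_profile,
                              meta={"hero_item": item})

    def parse_profile(self, response):
        item = response.meta["hero_item"]
        item["weapon"] = response.body
        yield item


def crawl(spider, rows, pages):
    """Toy scheduler: builds every request first, then answers them out of order."""
    requests = list(spider.parse(rows))
    items = []
    for request in reversed(requests):
        response = FakeResponse(request, pages[request.url])
        items.extend(request.callback(response))
    return items


rows = [("1", "alice"), ("2", "bob")]
pages = {
    "https://example.invalid/profile/alice": "sword",
    "https://example.invalid/profile/bob": "bow",
}
items = crawl(HeroSpider(), rows, pages)
```

Even though the responses are handled in reverse order, every callback finds the item that belongs to its own request. Newer Scrapy versions (1.7 and later) also accept a cb_kwargs dict on Request for the same purpose, which passes the values to the callback as keyword arguments.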

Source

2015-08-27 15:01:35 Rejected

I had a similar problem passing extra arguments in Tkinter, and found this solution to work (here: http://infohost.nmt.edu/tcc/help/pubs/tkinter/web/extra-args.html). Translated to your problem:

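def parse(self, response):
    item = HeroItem()
    [...]
    # Scrapy calls the callback with the new response as its only positional
    # argument, so `response` comes first; `self` and `item` are bound as defaults.
    def handler(response, self=self, item=item):
        """ passing as default argument values """
        return self.parse_profile(response, item)
    yield Request(url, callback=handler)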

Source

2015-08-27 15:26:38 rolika

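This is dangerous advice. He is iterating over all the "items" found in 'response.xpath('//tbody/tr')'. Since Request does not supply an item as an argument to the callback, the handler method will always use the default value of item. Unfortunately, by the time the callback is called, item will no longer be what it was when the Request was made. The data you collect will be unreliable and inconsistent. – Rejected

@Rejected No, by assigning the variables in the function header (self = self ...), it saves the values the variables have at the moment the 'handler' function definition is executed. As long as the definition of 'handler' is inside the loop, 'parse_profile' will get the item value of each iteration. –

This is a nice, elegant solution. –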
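
The difference the two comments above are arguing about can be checked in plain Python. The sketch below is not Scrapy code: parse_profile here is a hypothetical stand-alone function and the "row-N" strings stand in for items. It compares a plain closure, a default-argument handler, and functools.partial, all created inside a loop and called only after the loop has finished.

```python
from functools import partial


def parse_profile(response, item):
    """Hypothetical stand-in callback: pairs a response with its item."""
    return {"item": item, "weapon": response}


def make_handlers():
    late, default_bound, partial_bound = [], [], []
    for item in ["row-0", "row-1", "row-2"]:
        def late_handler(response):
            # Plain closure: `item` is looked up when the handler is called.
            return parse_profile(response, item)

        def default_handler(response, item=item):
            # Default argument: `item` is evaluated once, at definition time.
            return parse_profile(response, item)

        late.append(late_handler)
        default_bound.append(default_handler)
        # partial also freezes the current value of `item`.
        partial_bound.append(partial(parse_profile, item=item))
    return late, default_bound, partial_bound


late, default_bound, partial_bound = make_handlers()

# All handlers are called after the loop has ended, as delayed callbacks would be.
late_items = [h("sword")["item"] for h in late]
default_items = [h("sword")["item"] for h in default_bound]
partial_items = [h("sword")["item"] for h in partial_bound]
```

Here late_items comes out as three copies of "row-2", while default_items and partial_items keep "row-0", "row-1", "row-2". So the concern applies to a plain closure, and binding the value as a default argument (or with partial) inside the loop avoids it.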
