Scrapy callback function is never called

2015-11-26

I have been looking for a solution, but none of those I found worked for me. After spending two days debugging, I have to ask for your help: my Scrapy callback function is never entered.

The URL looks fine. Even if I hard-code a URL before making the request, the callback is still never called.

My code is:

    def parse_link(self, response):
        print 'lllll', response.url
        print 'bbbbb', len(response.body), response.body

    def parse(self, response):
        hxs = HtmlXPathSelector(response)
        issues = hxs.select('//a//@id').extract()
        for i in range(len(issues)):
            issue = issues[i]
            links_2d = hxs.select('//html//body//table[%d+%d]/tr/td//a[contains(text(),"full quotes")]/@href' % (9, i)).extract()
            links_2d = list(set(links_2d))

            if len(bb) < 1: continue
            if len(links_2d) < 1: continue

            full_link = links_2d[0]

            yield scrapy.Request(url=full_link, callback=self.parse_link)
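(One detail worth noting about the selector above: the `%` formatting does not add 9 and `i` in Python; it splices the numbers in as text, so the literal expression `table[9+0]`, `table[9+1]`, … reaches the XPath engine, which then evaluates the arithmetic itself. A quick check:)

```python
# The format string is the one from the question; Python only substitutes
# the two integers as text, so the "+" survives into the XPath predicate.
xpath = '//html//body//table[%d+%d]/tr' % (9, 2)
print(xpath)  # //html//body//table[9+2]/tr
```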

The allowed_domains etc. are also fine. – wggbullet

Where is the "link" object instantiated? – eLRuLL

Sorry, that was a typo I introduced while cleaning up my code before posting; url = link should be url = full_link. – wggbullet

Answer


Try this:

    def parse(self, response):
        hxs = HtmlXPathSelector(response)
        issues = hxs.select('//a//@id').extract()
        for i in range(len(issues)):
            issue = issues[i]
            links_2d = hxs.select('//html//body//table[%d+%d]/tr/td//a[contains(text(),"full quotes")]/@href' % (9, i)).extract()
            links_2d = list(set(links_2d))

            # the original `if len(bb) < 1: continue` is dropped here:
            # `bb` is never defined and raises a NameError
            if len(links_2d) < 1: continue

            full_link = links_2d[0]

            yield scrapy.Request(str(full_link), self.parse_link)

    def parse_link(self, response):
        print 'lllll', response.url
        print 'bbbbb', len(response.body), response.body
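Beyond the fixes above, a frequent reason a callback is never entered is that the extracted `@href` is relative: `scrapy.Request` rejects URLs without a scheme (`ValueError: Missing scheme in request url`), and requests to hosts outside `allowed_domains` are silently dropped by the offsite middleware. A minimal sketch of joining an href against the page URL, using Python 3's stdlib (the page URL and href below are made-up examples; inside a spider you would join against `response.url`):

```python
from urllib.parse import urljoin

# Hypothetical values standing in for response.url and an extracted @href.
page_url = 'http://example.com/issues/index.html'
href = '/quotes/full.html'

full_link = urljoin(page_url, href)
print(full_link)  # http://example.com/quotes/full.html
```

In the spider this becomes `yield scrapy.Request(urljoin(response.url, full_link), callback=self.parse_link)`; passing `dont_filter=True` to the request is also a quick way to rule out the duplicate and offsite filters while debugging.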