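Python - global name is not defined

I promise I have read the other versions of this question, but I could not find one that applies to my situation. If there is one, I apologize; I have been staring at this for hours.
I have been playing with this a lot, and one version actually returned results, so I know it is close.
The 'start_urls' variable is defined as a list before the function, but for some reason it is not registered at the global/module level.
Here is the exact error:
    for listing_url_list in start_urls:
NameError: global name 'start_urls' is not defined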
import time
import scrapy
from scrapy.http import Request
from scrapy.selector import Selector
from scrapy.spiders import CrawlSpider, Rule
from scraper1.items import scraper1Item
from scraper1 import csvmodule

absolute_pos = './/*[@id="xpath"]/td/@class'

class spider1(CrawlSpider):
    name = 'ugh'
    allowed_domains = ["ugh.com"]
    start_urls = [
        "http://www.website.link.1",
        "http://www.website.link.2",
        "http://www.website.link.3"
    ]

    def parse(self, response):
        Select = Selector(response)
        listing_url_list = Select.xpath('.//*[@id="xpath"]/li/div/a/@href').extract()
        for listing_url_list in start_urls:
            yield scrapy.Request(listing_url, callback=self.parselisting, dont_filter=True)

    def parselisting(self, response):
        ResultsDict = scraper1Item()
        Select = Selector(response)
        ResultsDict['absolute_pos'] = Select.xpath(absolute_pos).extract()
        ResultsDict['listing_url'] = response.url
        return ResultsDict
'self.start_urls'?
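
To illustrate why the comment's suggestion would likely help: a name assigned in a class body becomes a class attribute, not a module-level global, so a bare start_urls inside a method is not found. Below is a minimal sketch without Scrapy; the DemoSpider class and its two methods are hypothetical names made up for the illustration, not part of the spider in the question.

```python
class DemoSpider:
    # Assigned in the class body, so this is a class attribute.
    # It is NOT added to the module's global namespace.
    start_urls = [
        "http://www.website.link.1",
        "http://www.website.link.2",
    ]

    def bare_name(self):
        # Name lookup inside a method checks local, enclosing function,
        # global and builtin scopes. The class body is skipped, so this
        # raises NameError.
        return [url for url in start_urls]

    def via_self(self):
        # Attribute lookup on the instance falls back to the class,
        # so the list is found.
        return [url for url in self.start_urls]


spider = DemoSpider()

error_message = None
try:
    spider.bare_name()
except NameError as exc:
    error_message = str(exc)

urls = spider.via_self()
```

Separately, the loop in parse uses listing_url in the Request call, but that name is never assigned; the loop variable is listing_url_list, which also overwrites the extracted list. If the intent was to follow the extracted links, something like "for listing_url in listing_url_list:" is probably what was meant, though that is a guess at the intent.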