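Scrapy - unable to import items into my spider (No module named behance.items)
I'm new to Scrapy and am trying to run a spider that crawls Behance. This is my spider: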
import scrapy
from scrapy.selector import Selector
from behance.items import BehanceItem
from selenium import webdriver
from scrapy.http import TextResponse
from scrapy.crawler import CrawlerProcess


class DmozSpider(scrapy.Spider):
    name = "behance"
    #allowed_domains = ["behance.com"]
    start_urls = [
        "https://www.behance.net/gallery/29535305/Mind-Your-Monsters",
    ]

    def __init__(self):
        self.driver = webdriver.Firefox()

    def parse(self, response):
        self.driver.get(response.url)
        response = TextResponse(url=response.url, body=self.driver.page_source, encoding='utf-8')
        item = BehanceItem()
        hxs = Selector(response)
        item['link'] = response.xpath("//div[@class='js-project-module-image-hd project-module module image project-module-image']/@data-hd-src").extract()
        yield item


process = CrawlerProcess({
    'USER_AGENT': 'Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1)'
})
process.crawl(DmozSpider)
process.start()
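I get the following error on the command line when I run my crawler:
Traceback (most recent call last):
  File "/home/davy/behance/behance/spiders/behance_spider.py", line 3, in <module>
    from behance.items import BehanceItem
ImportError: No module named behance.items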
My directory structure:
behance/
├── behance
│   ├── __init__.py
│   ├── items.py
│   ├── pipelines.py
│   ├── settings.py
│   └── spiders
│       ├── __init__.py
│       └── behance_spider.py
└── scrapy.cfg
What are the contents of your items.py file? – narko
@narko 'import scrapy class BehanceItem(scrapy.Item): # define the fields for your item here like: # name = scrapy.Field() link = scrapy.Field()' – Davy
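One likely cause, assuming the spider file was launched directly with the Python interpreter (the traceback points straight at behance_spider.py, but the exact command is not shown): Python then puts the script's own directory, behance/behance/spiders, at the front of sys.path, so the top-level behance package is not visible. The sketch below reproduces this without Scrapy by building a throwaway copy of the layout above in a temporary directory. The helper names (build_demo_project, can_import_items, demo) and the stand-in items.py are hypothetical and only serve the illustration.

```python
import importlib
import sys
import tempfile
from pathlib import Path


def build_demo_project(root):
    """Create a stripped-down copy of the project layout (no Scrapy needed)."""
    pkg = root / "behance"
    (pkg / "spiders").mkdir(parents=True)
    (pkg / "__init__.py").write_text("")
    (pkg / "spiders" / "__init__.py").write_text("")
    # Stand-in for items.py; the real file defines a scrapy.Item subclass.
    (pkg / "items.py").write_text("class BehanceItem(dict):\n    pass\n")


def forget_behance():
    """Drop cached behance modules so each import attempt starts fresh."""
    for name in list(sys.modules):
        if name == "behance" or name.startswith("behance."):
            del sys.modules[name]
    importlib.invalidate_caches()


def can_import_items(path_entry):
    """Return True if behance.items imports with path_entry first on sys.path."""
    forget_behance()
    sys.path.insert(0, str(path_entry))
    try:
        importlib.import_module("behance.items")
        return True
    except ImportError:
        return False
    finally:
        sys.path.remove(str(path_entry))
        forget_behance()


def demo():
    """Try the import from the spiders directory, then from the project root."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp) / "behance"
        build_demo_project(root)
        from_spiders_dir = can_import_items(root / "behance" / "spiders")
        from_project_root = can_import_items(root)
    return from_spiders_dir, from_project_root


if __name__ == "__main__":
    print(demo())
```

In this sketch the import should fail when only the spiders directory is on sys.path and succeed once the project root (the directory containing scrapy.cfg) is. For the real project, that suggests either running `scrapy crawl behance` from the project root or making sure the project root is on sys.path before the import. If the spider is run through `scrapy crawl`, the module-level CrawlerProcess lines would probably need to move under an `if __name__ == "__main__":` guard, since Scrapy imports the spider module itself.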