I want to scrape all exhibitors from this dynamic page using Scrapy and Selenium:
https://greenbuildexpo.com/Attendee/Expohall/Exhibitors
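But Scrapy alone does not load the page content. What I am doing now is loading the page with Selenium and then searching it for links with Scrapy: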
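from selenium import webdriver
from scrapy.http import TextResponse

url = 'https://greenbuildexpo.com/Attendee/Expohall/Exhibitors'
driver_1 = webdriver.Firefox()
driver_1.get(url)
# Hand the browser-rendered HTML to Scrapy so its XPath selectors can be used
content = driver_1.page_source
response = TextResponse(url='', body=content, encoding='utf-8')
# Count the unique links that point to an "Attendee/" page
print len(set(response.xpath('//*[contains(@href,"Attendee/")]//@href').extract()))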
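The site does not seem to make any new requests when the "Next" button is pressed, so I expected to get all the links at once, but with this code I only get 43 links. There should be around 500.
Now I am trying to crawl the pages by clicking the "Next" button: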
for i in range(10):
    xpath = '//*[@id="pagingNormalView"]/ul/li[15]'
    driver_1.find_element_by_xpath(xpath).click()
But I get an error:
  File "/usr/local/lib/python2.7/dist-packages/selenium/webdriver/remote/errorhandler.py", line 192, in check_response
    raise exception_class(message, screen, stacktrace)
selenium.common.exceptions.NoSuchElementException: Message: Unable to locate element: {"method":"xpath","selector":"//*[@id=\"pagingNormalView\"]/ul/li[15]"}
Stacktrace:
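The positional `li[15]` selector only matches when the pager happens to contain at least 15 items, which may explain the `NoSuchElementException`. A minimal sketch of one way to page through the results is shown here, not a confirmed fix for this site. The helper name `collect_links`, its parameters, and the stopping rule are hypothetical. It relies on `find_elements` (plural), which returns an empty list rather than raising when nothing matches, and it accumulates hrefs in a set until a page yields nothing new or no "Next" element is found.

```python
import time


def collect_links(driver, link_xpath, next_xpath, max_pages=50, pause=0.0):
    """Accumulate unique hrefs across client-side paginated views.

    `driver` is expected to behave like a Selenium WebDriver: it needs
    find_elements(by, value), and the returned elements need
    get_attribute(name) and click().
    """
    links = set()
    for _ in range(max_pages):
        before = len(links)
        # Gather every matching link currently rendered on the page.
        for element in driver.find_elements("xpath", link_xpath):
            href = element.get_attribute("href")
            if href:
                links.add(href)
        # Stop when the current view contributed nothing new,
        # for example because the last page has been reached.
        if len(links) == before:
            break
        # find_elements returns [] instead of raising when nothing matches.
        next_buttons = driver.find_elements("xpath", next_xpath)
        if not next_buttons:
            break
        next_buttons[0].click()
        if pause:
            # Crude wait for the page script to redraw the list (assumption:
            # a fixed delay is enough; an explicit wait may be more robust).
            time.sleep(pause)
    return links
```

With the driver from the question, a call could look like `collect_links(driver_1, '//a[contains(@href, "Attendee/")]', '//*[@id="pagingNormalView"]//li[last()]', pause=1.0)`. Both XPath expressions are assumptions: in particular, that the "Next" control is the last `li` of the pager has not been checked against the actual page.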