2015-10-14 25 views

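I'm just getting started with Selenium. Selenium basics: opening a URL

I wrote a simple Python script that should open a URL and print the price of the product on that page.

Here it is: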

from selenium import webdriver 
import time 
driver = webdriver.PhantomJS(executable_path='/usr/bin/phantomjs') 
url = 'http://www.stance.com/shop/product/paint-trap' 
print "Driver Made" 
driver.get(url) 
print "URL got" 
price = driver.find_element_by_xpath('//*[@id="h1--title-price"]/span[2]').text 
print price 
driver.close() 

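However, it only prints "Driver Made"; it prints neither "URL got" nor the price.

It seems to get stuck at driver.get(url), but I don't know why.

I'd like to know how to print the price and how to stop driver.get(url) from running forever.

If I interrupt it with Ctrl+C, I get: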

Driver Made 
^CTraceback (most recent call last): 
    File "test.py", line 6, in <module> 
    driver.get(url) 
    File "/usr/local/lib/python2.7/site-packages/selenium/webdriver/remote/webdriver.py", line 213, in get 
    self.execute(Command.GET, {'url': url}) 
    File "/usr/local/lib/python2.7/site-packages/selenium/webdriver/remote/webdriver.py", line 199, in execute 
    response = self.command_executor.execute(driver_command, params) 
    File "/usr/local/lib/python2.7/site-packages/selenium/webdriver/remote/remote_connection.py", line 395, in execute 
    return self._request(command_info[0], url, body=data) 
    File "/usr/local/lib/python2.7/site-packages/selenium/webdriver/remote/remote_connection.py", line 463, in _request 
    resp = opener.open(request, timeout=self._timeout) 
    File "/usr/local/lib/python2.7/urllib2.py", line 404, in open 
    response = self._open(req, data) 
    File "/usr/local/lib/python2.7/urllib2.py", line 422, in _open 
    '_open', req) 
    File "/usr/local/lib/python2.7/urllib2.py", line 382, in _call_chain 
    result = func(*args) 
    File "/usr/local/lib/python2.7/urllib2.py", line 1214, in http_open 
    return self.do_open(httplib.HTTPConnection, req) 
    File "/usr/local/lib/python2.7/urllib2.py", line 1187, in do_open 
    r = h.getresponse(buffering=True) 
    File "/usr/local/lib/python2.7/httplib.py", line 1045, in getresponse 
    response.begin() 
    File "/usr/local/lib/python2.7/httplib.py", line 409, in begin 
    version, status, reason = self._read_status() 
    File "/usr/local/lib/python2.7/httplib.py", line 365, in _read_status 
    line = self.fp.readline(_MAXLINE + 1) 
    File "/usr/local/lib/python2.7/socket.py", line 476, in readline 
    data = self._sock.recv(self._rbufsize) 
KeyboardInterrupt 
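
Reading the traceback: the interrupt lands in a socket recv call underneath httplib's getresponse, which suggests the Python client had already sent the command to the driver process and was waiting for a reply that never arrived. The sketch below is not Selenium code; it is a minimal standard-library illustration (written for Python 3, unlike the Python 2 script above) of how a read deadline turns such an indefinite wait into a bounded one. The helper name wait_for_reply and the socket pair standing in for the client and the driver are assumptions made for the example.

```python
import socket


def wait_for_reply(sock, timeout_s=None):
    """Wait for data on sock; with timeout_s=None the call can block forever."""
    sock.settimeout(timeout_s)
    try:
        return sock.recv(1024)
    except socket.timeout:
        # No reply arrived before the deadline.
        return None


# Two connected sockets stand in for the Python client and the driver process.
client, silent_driver = socket.socketpair()

# The peer never writes anything, so the read gives up after half a second.
print(wait_for_reply(client, timeout_s=0.5))

client.close()
silent_driver.close()
```

Without a deadline, the same recv call would simply sit there until interrupted, which matches the behaviour described in the question.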

You are printing it -> line 7 -> print "URL got" –

"URL got" is never printed, because get(url) runs forever. I want to know why get(url) runs forever and how to stop it. @ShubhamJain – Rorschach

'$ 14.00' is the result I get. The code works fine, I guess – vks

Answers

1

The working code prints the following (on Windows):

Driver Made 
URL got 
$14.00 

The working code is as follows:

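from selenium import webdriver

# Path to the PhantomJS binary on this machine (a Windows path here).
driver = webdriver.PhantomJS(executable_path=r"C:\Users\Desktop\phantomjs.exe")

# Give the headless browser a desktop-sized viewport.
driver.set_window_size(1120, 550)
url = 'http://www.stance.com/shop/product/paint-trap'
print "Driver Made"
driver.get(url)
print "URL got"
# Let element lookups keep retrying for up to 5 seconds before giving up.
driver.implicitly_wait(5)
# find_elements_* returns a list (empty if nothing matches) instead of raising.
price = driver.find_elements_by_xpath("(//*[@id='h1--title-price']/span)[2]")
for i in price:
    print i.text
driver.close()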

N.B. Make sure the PhantomJS executable path and the Selenium library installation are correct on your system.
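
On the second part of the question (stopping driver.get(url) from running forever), one option is Selenium's page-load timeout. The following is a minimal sketch in Python 3 syntax, not a confirmed fix for this particular hang: the helper name open_with_deadline and the 30-second default are assumptions made for illustration, and whether a given driver such as PhantomJS actually honors the limit should be verified on your own setup.

```python
try:
    from selenium.common.exceptions import TimeoutException
except ImportError:
    # Stand-in so the sketch can be read and exercised without Selenium installed.
    class TimeoutException(Exception):
        pass


def open_with_deadline(driver, url, seconds=30):
    """Try to load url, giving up after `seconds`; return True if it loaded."""
    # Ask the driver to abort any navigation that takes longer than `seconds`.
    driver.set_page_load_timeout(seconds)
    try:
        driver.get(url)
    except TimeoutException:
        # The page did not finish loading within the deadline.
        return False
    return True
```

With the driver and url from the code above, open_with_deadline(driver, url, 30) would return False instead of blocking indefinitely, provided the driver enforces the timeout. Calling driver.quit() afterwards shuts the browser process down.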

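Sadly, this still runs forever for me. It must be some bug that has nothing to do with the code. I'll try reinstalling PhantomJS, and if that doesn't work, I'll try Firefox – Rorschach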

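Thanks for confirming that the code works. I'll mark your answer as correct, since the code works and that is what I asked for. I think my Ubuntu machine is just running very slowly, so I'll work on fixing that. – Rorschach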