2017-04-26 15 views
-1

After reading this tutorial, I came up with the code below. Selecting the first form with mechanize fails with "ImportError: No module named html5lib".

import requests
from bs4 import BeautifulSoup
import re
import mechanize
import cookielib

# Browser
br = mechanize.Browser()

# Cookie Jar
cj = cookielib.LWPCookieJar()
br.set_cookiejar(cj)

# Browser options
br.set_handle_equiv(True)
br.set_handle_gzip(True)
br.set_handle_redirect(True)
br.set_handle_referer(True)
br.set_handle_robots(False)

# Follows refresh 0 but does not hang on refresh > 0
br.set_handle_refresh(mechanize._http.HTTPRefreshProcessor(), max_time=1)

# User-Agent (this is cheating, ok?)
br.addheaders = [('User-agent', 'Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.9.0.1) Gecko/2008071615 Fedora/3.0.1-1.fc9 Firefox/3.0.1')]

# The site we will navigate into, handling its session
br.open('http://www.cleanmetrics.net/foodcarbonscope')

# Select the first form on the page and fill in the login fields
br.select_form(nr=0)
br.form['ctl00$ContentPlaceHolder1$userName'] = "XXXXX"
br.form['ctl00$ContentPlaceHolder1$passWord'] = "XXXXXX"

# Login
br.submit()

I keep getting this error:

  File "scrapeRecipe.py", line 30, in <module>
    br.select_form(nr=0)
  File "build/bdist.macosx-10.11-intel/egg/mechanize/_mechanize.py", line 619, in select_form
  File "build/bdist.macosx-10.11-intel/egg/mechanize/_html.py", line 260, in global_form
  File "build/bdist.macosx-10.11-intel/egg/mechanize/_html.py", line 267, in forms
  File "build/bdist.macosx-10.11-intel/egg/mechanize/_html.py", line 282, in _get_forms
  File "build/bdist.macosx-10.11-intel/egg/mechanize/_html.py", line 247, in root
  File "build/bdist.macosx-10.11-intel/egg/mechanize/_html.py", line 145, in content_parser
ImportError: No module named html5lib

However, I know html5lib was installed successfully, because when I run pip3 freeze I see:

html5lib==0.999999999 
six==1.10.0 
webencodings==0.5.1 
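One thing that may be worth checking here: pip3 freeze reports what is installed for the Python 3 interpreter that pip3 belongs to, while the traceback paths and the py2.7 eggs suggest the script runs under Python 2.7. The sketch below is only a diagnostic, not a fix. It prints which interpreter is running and whether that interpreter can import each module. The helper name module_status is made up for illustration.

```python
from __future__ import print_function

import importlib
import sys


def module_status(name):
    """Return (importable, location) for a module under this interpreter."""
    try:
        module = importlib.import_module(name)
    except ImportError:
        return False, None
    # Built-in modules have no __file__, so fall back to None.
    return True, getattr(module, "__file__", None)


if __name__ == "__main__":
    # Run this with the same command used to run scrapeRecipe.py.
    print("interpreter:", sys.executable)
    print("version:", sys.version.split()[0])
    for name in ("html5lib", "six", "webencodings"):
        print(name, module_status(name))
```

If html5lib shows up as not importable here while pip3 freeze lists it, the package was most likely installed for a different interpreter than the one running the script.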

Update: I think my problem may be related to my easy-install.pth file. In my site-packages directory I don't actually see html5lib. This is all I have:

BeautifulSoup-3.2.1-py2.7.egg 
appdirs-1.4.3.dist-info 
appdirs.py 
appdirs.pyc 
beautifulsoup4-4.5.3.dist-info 
bs4 
easy-install.pth 
html2text-2016.9.19-py2.7.egg 
mechanize-0.3.1-py2.7.egg 
packaging 
packaging-16.8.dist-info 
pip-9.0.1-py2.7.egg 
requests-2.13.0-py2.7.egg 

When I run easy_install html5lib, I get "Adding html5lib 0.999999999 to easy-install.pth file". However, after it finished processing dependencies for html5lib, I opened the easy-install.pth file and html5lib is nowhere in it:

import sys; sys.__plen = len(sys.path)
./BeautifulSoup-3.2.1-py2.7.egg
./html2text-2016.9.19-py2.7.egg
./mechanize-0.3.1-py2.7.egg
./requests-2.13.0-py2.7.egg
./pip-9.0.1-py2.7.egg
import sys; new=sys.path[sys.__plen:]; del sys.path[sys.__plen:]; p=getattr(sys,'__egginsert',0); sys.path[p:p]=new; sys.__egginsert = p+len(new)

Unless html5lib lives inside one of the packages above? I'm wondering whether I need to import html5lib in my Python code and list its root path?
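To see where (if anywhere) html5lib ended up, one option is to scan every directory on sys.path for entries whose name starts with html5lib. This is a rough sketch; find_on_path is a hypothetical helper, and it only matches by file or directory name, so it would not detect a package bundled inside another egg.

```python
from __future__ import print_function

import os
import sys


def find_on_path(prefix, paths=None):
    """List entries in the given directories whose name starts with prefix."""
    hits = []
    for directory in (sys.path if paths is None else paths):
        # sys.path can contain '' or zipped eggs; only real directories are listed.
        if not os.path.isdir(directory):
            continue
        try:
            entries = sorted(os.listdir(directory))
        except OSError:
            continue  # unreadable directory, skip it
        for entry in entries:
            if entry.lower().startswith(prefix.lower()):
                hits.append(os.path.join(directory, entry))
    return hits


if __name__ == "__main__":
    for hit in find_on_path("html5lib"):
        print(hit)
```

An empty result under the interpreter that runs the script would be consistent with the ImportError in the traceback.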

I'm really not sure why this got downvoted? :/

+0

Have you tried `pip install --ignore-installed six` before those commands? Then try running your command again.

+0

When I try to run your command, I get "IOError: [Errno 13] Permission denied: '/Library/Python/2.7/site-packages/six.py'" – Matt

+0

Try running `pip install --ignore-installed six --user`

Answer

-1

I've now run into a different problem, but this is what solved the html5lib error.

pip install --ignore-installed six --user 
sudo -H pip install html5lib --ignore-installed 

For more background, this is a good thread: https://github.com/pypa/pip/issues/3165
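A related pattern that avoids guessing which pip belongs to which Python is to invoke pip through the interpreter itself with -m pip. The sketch below only builds and prints such a command. The helper pip_install_command is hypothetical; --user and --ignore-installed are the same pip flags used in the commands above.

```python
from __future__ import print_function

import sys


def pip_install_command(package, user=True, ignore_installed=False):
    """Build a pip command bound to the interpreter running this script."""
    command = [sys.executable, "-m", "pip", "install"]
    if ignore_installed:
        command.append("--ignore-installed")
    if user:
        command.append("--user")
    command.append(package)
    return command


if __name__ == "__main__":
    # Print rather than execute, so nothing is installed by accident.
    print(" ".join(pip_install_command("six", ignore_installed=True)))
    print(" ".join(pip_install_command("html5lib")))
```

Passing the returned list to subprocess.check_call would actually run the install. Note that --user is generally not meant to be used inside a virtual environment.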

+0

You shouldn't run pip with sudo, because it will mess up your global packages. Next time you could also set up a virtual environment.
