2013-07-20 145 views
14

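How to reinstall lxml?

I'm using Python 2.7.5 and BeautifulSoup 4.2.1 on Mac OS X 10.7.5. I want to use the lxml library to parse an XML page, as taught in the BeautifulSoup tutorial. However, when I run my code, it shows: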

bs4.FeatureNotFound: Couldn't find a tree builder with the features you requested: 
lxml,xml. Do you need to install a parser library? 

I believe I have installed lxml by every method available: easy_install, pip, port, etc. I tried adding a line to my code to check whether lxml is installed:

import lxml 

Python gets past this line without complaint, and then the same error message appears again, raised at the same line as before.

So I'm fairly sure lxml is installed, just not installed correctly. I decided to uninstall lxml and then reinstall it the 'correct' way. But when I type

easy_install -m lxml 

it shows:

Searching for lxml 
Best match: lxml 3.2.1 
Processing lxml-3.2.1-py2.7-macosx-10.6-intel.egg 

Using /Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/lxml-3.2.1-py2.7-macosx-10.6-intel.egg

Because this distribution was installed --multi-version, before you can 
import modules from this package in an application, you will need to 
'import pkg_resources' and then use a 'require()' call similar to one of 
these examples, in order to select the desired version: 

pkg_resources.require("lxml") # latest installed version 
pkg_resources.require("lxml==3.2.1") # this exact version 
pkg_resources.require("lxml>=3.2.1") # this version or higher 

Processing dependencies for lxml 
Finished processing dependencies for lxml 

So I don't know how to go on with the uninstall...

I have looked through many posts about this problem on Google but still can't find any useful information.

Here is my code:

import sys  # needed for sys.argv below

import mechanize
from bs4 import BeautifulSoup
import lxml  # only checks that the top-level lxml package is importable


class count:
    def __init__(self, protein):
        self.proteinCode = protein
        self.br = mechanize.Browser()

    def first_search(self):
        # Test 0: fetch the GenBank page and parse it with the lxml XML builder
        soup = BeautifulSoup(self.br.open("http://www.ncbi.nlm.nih.gov/protein/21225921?report=genbank&log$=prottop&blast_rank=1&RID=YGJHMSET015"), ['lxml', 'xml'])
        return


if __name__ == '__main__':
    proteinCode = sys.argv[1]
    gogogo = count(proteinCode)

I'd like to know:

  1. How do I uninstall lxml?
  2. How do I install lxml correctly? How can I tell that it has been installed correctly?

Answers

12

I'm using BeautifulSoup 4.3.2 on OS X 10.6.8. I also had a problem with an improperly installed lxml. Here are some things I found out:

First, check this related question: Removed MacPorts, now Python is broken

Now, to check which builders are installed for BeautifulSoup 4, try

>>> import bs4 
>>> bs4.builder.builder_registry.builders 

If you don't see the builder you want, it is not installed, and you will get the error above ("Couldn't find a tree builder...").
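The registry check above can be wrapped in a small script that also prints the feature names each builder answers to. This is a minimal sketch, not part of BeautifulSoup: the helper name `list_bs4_builders` is made up for illustration, and it assumes each registered builder class exposes a `features` list, which is what the bs4 builders I have seen do.

```python
def list_bs4_builders():
    """Return [(builder class name, sorted feature names)] for bs4.

    Returns None when bs4 itself cannot be imported from this interpreter.
    `list_bs4_builders` is a hypothetical helper name, not a bs4 API.
    """
    try:
        from bs4.builder import builder_registry
    except ImportError:
        return None
    return [(b.__name__, sorted(b.features)) for b in builder_registry.builders]


if __name__ == "__main__":
    builders = list_bs4_builders()
    if builders is None:
        print("bs4 is not importable from this interpreter")
    else:
        for name, features in builders:
            print(name, features)
        # The question asked for the features 'lxml' and 'xml'.
        if not any({"lxml", "xml"} <= set(f) for _, f in builders):
            print("no builder offers both 'lxml' and 'xml'")
```

If the last message is printed, the lxml-backed XML builder was never registered, which matches the FeatureNotFound error in the question.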

Also, just because you can import lxml doesn't mean everything is fine.

Try

>>> import lxml 
>>> import lxml.etree 
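The reason the second import matters is that `lxml.etree` is a compiled extension module, so it can fail to load even when the plain `lxml` package imports cleanly. The sketch below runs both imports and keeps the full traceback, which usually names the missing or mismatched library. The helper name `try_import` is an assumption for illustration; only the standard library is used.

```python
import importlib
import traceback


def try_import(module_name):
    """Import module_name; return (True, None) or (False, traceback text)."""
    try:
        importlib.import_module(module_name)
    except Exception:
        # Keep the whole traceback: for a broken compiled extension it
        # typically shows which shared library or symbol could not be loaded.
        return False, traceback.format_exc()
    return True, None


if __name__ == "__main__":
    for name in ("lxml", "lxml.etree"):
        ok, error = try_import(name)
        print(name, "OK" if ok else "FAILED")
        if error:
            print(error)
```

The same helper can be pointed at any other module name you suspect, for example the bs4 builder modules.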

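To understand what is going on, go to the bs4 installation and open the egg (tar -xvzf). Note the module bs4.builder. Inside it you should see files such as _lxml.py and _html5lib.py. So you can also try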

>>> import bs4.builder._htmlparser 
>>> import bs4.builder._lxml 
>>> import bs4.builder._html5lib 

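If something is wrong, you will see why that particular module cannot be loaded. You can see how all of these modules are loaded, and how any that fail to load are ignored, at the end of builder/__init__.py: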

# Builders are registered in reverse order of priority, so that custom 
# builder registrations will take precedence. In general, we want lxml 
# to take precedence over html5lib, because it's faster. And we only 
# want to use HTMLParser as a last result. 
from . import _htmlparser 
register_treebuilders_from(_htmlparser) 
try: 
    from . import _html5lib 
    register_treebuilders_from(_html5lib) 
except ImportError: 
    # They don't have html5lib installed. 
    pass 
try: 
    from . import _lxml 
    register_treebuilders_from(_lxml) 
except ImportError: 
    # They don't have lxml installed. 
    pass 
+1

Uninstalling and reinstalling, as suggested in the related question (http://stackoverflow.com/questions/14153221/removed-macports-now-python-is-broken), solved my problem.

+2

Since 'lxml' was missing on my machine, running 'sudo pip install lxml' solved my problem.

+1

Also, this step may be necessary when installing lxml: http://stackoverflow.com/questions/19548011/cannot-install-lxml-on-mac-os-x-10-9 – taylorc93

0

On Debian/Ubuntu it is easy to get: sudo apt-get install python3-lxml
For Mac OS X, a MacPorts port of lxml is available. Try something like: sudo port install py27-lxml

http://lxml.de/installation.html may be helpful to you.

+0

This doesn't show how to uninstall it properly.

3

If you are using Python 2.7 on Ubuntu/Debian, this worked for me:

$ sudo apt-get build-dep python-lxml 
$ sudo pip install lxml 

Test it like this:

~/computer_vision/image_retrieval$ python 
Python 2.7.6 (default, Jun 22 2015, 17:58:13) 
[GCC 4.8.2] on linux2 
Type "help", "copyright", "credits" or "license" for more information. 
>>> import lxml 
1

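FWIW, I ran into a similar problem (Python 3.6, OS X 10.12.6) and was able to fix it simply by doing the following (the first command is only there to show that I was working in a conda virtualenv):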

$ source activate ml-general 
$ pip uninstall lxml 
$ pip install lxml 

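I first tried more complicated things, because BeautifulSoup worked correctly with the same commands via Jupyter + iPython, but not via PyCharm's terminal in the same virtualenv. Reinstalling lxml as above fixed the problem.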
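When the same code behaves differently in two front ends, it can help to confirm that both really run the same interpreter and pick up the same copy of lxml. The sketch below is one way to check, assuming Python 3.4 or later (for `importlib.util.find_spec`); the helper name `locate` is made up for illustration. Run it once from each environment and compare the output.

```python
import importlib.util
import sys


def locate(module_name):
    """Return the path a top-level module would be imported from, or None.

    Assumes Python 3.4+; `locate` is a hypothetical helper name.
    """
    spec = importlib.util.find_spec(module_name)
    if spec is None:
        return None  # not installed for this interpreter
    return spec.origin


if __name__ == "__main__":
    print("interpreter:", sys.executable)
    for name in ("bs4", "lxml"):
        print(name, "->", locate(name))
```

If the interpreter paths or module locations differ, reinstalling with that interpreter's own pip (python -m pip install lxml) targets the right site-packages directory.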