bs4.FeatureNotFound: Couldn't find a tree builder with the features you requested: lxml

Can you suggest a fix? The script downloads almost all images from imgur pages that contain a single image, so I am not sure why it fails in this case or how to fix it.

elif 'imgur.com' in submission.url and not (submission.url.endswith('gif')
        or submission.url.endswith('webm')
        or submission.url.endswith('mp4')
        or 'all' in submission.url
        or '#' in submission.url
        or '/a/' in submission.url):
    html_source = requests.get(submission.url).text  # download the image's page
    soup = BeautifulSoup(html_source, "lxml")
    image_url = soup.select('img')[0]['src']
    if image_url.startswith('//'):
        image_url = 'http:' + image_url
    image_id = image_url[image_url.rfind('/') + 1:image_url.rfind('.')]
    try:
        image_file = urllib2.urlopen(image_url, timeout=5)
        with open('/home/mona/computer_vision/image_retrieval/images/' + category + '/' + 'imgur_' + datetime.datetime.now().strftime('%y-%m-%d-%s') + image_url[-9:], 'wb') as output_image:
            output_image.write(image_file.read())
    except urllib2.URLError as e:
        print(e)
        continue

The error is:

[LOG] Done Getting http://i.imgur.com/FoCjtI7.jpg 
submission id is: 1alffm 
[LOG] Getting url: http://sphotos-a.ak.fbcdn.net/hphotos-ak-ash4/217834_10151246341237704_484810759_n.jpg 
HTTP Error 403: Forbidden 
[LOG] Getting url: http://imgur.com/xp386 
Traceback (most recent call last): 
    File "download_images.py", line 67, in <module> 
    soup = BeautifulSoup(html_source, "lxml") 
    File "/usr/lib/python2.7/dist-packages/bs4/__init__.py", line 155, in __init__ 
    % ",".join(features)) 
bs4.FeatureNotFound: Couldn't find a tree builder with the features you requested: lxml. Do you need to install a parser library?

Source

2016-10-11 Mona Jalal

https://www.crummy.com/software/BeautifulSoup/bs4/doc/#installing-a-parser – Muposat

Open a Python shell and try the following:

from bs4 import BeautifulSoup 
myHTML = "<html><head></head><body><strong>Hi</strong></body></html>"
soup = BeautifulSoup(myHTML, "lxml")

Does this work, or do you get the same error? If you get the same error, lxml is missing. Install it:

pip install lxml

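I am walking through these steps because you indicate that the script runs for a while before it crashes, in which case the parser cannot be missing?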
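If installing lxml is not an option right away, one possible workaround is to fall back to "html.parser", the parser from the Python standard library that Beautiful Soup also accepts. The sketch below is only an illustration: pick_parser is a hypothetical helper name, not part of bs4, and html.parser may handle malformed HTML differently from lxml.

```python
import importlib


def pick_parser(preferred="lxml", fallback="html.parser"):
    """Return `preferred` if that package can be imported, else `fallback`.

    Hypothetical helper for illustration; not part of Beautiful Soup.
    """
    try:
        importlib.import_module(preferred)
    except ImportError:
        return fallback
    return preferred


parser = pick_parser()
print(parser)
# The result can then be passed where the script hard-codes "lxml", e.g.:
# soup = BeautifulSoup(html_source, parser)
```

With this approach the script keeps running on a machine without lxml, at the cost of possibly different parse results on broken markup.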

Added by OP:

If you are using Python2.7 in Ubuntu/Debian, this worked for me: 

$ sudo apt-get build-dep python-lxml 
$ sudo pip install lxml 

Test it like: 

~/computer_vision/image_retrieval$ python
Python 2.7.6 (default, Jun 22 2015, 17:58:13) 
[GCC 4.8.2] on linux2 
Type "help", "copyright", "credits" or "license" for more information. 
>>> import lxml
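Since the traceback points at /usr/lib/python2.7/dist-packages, it can also help to confirm that lxml is visible to the same interpreter that runs the script, because pip may have installed it for a different Python. A minimal sketch, assuming nothing beyond the standard library (parser_report is a hypothetical name used only here):

```python
import importlib
import sys


def parser_report(module_name="lxml"):
    """Describe the running interpreter and whether `module_name` imports in it."""
    try:
        module = importlib.import_module(module_name)
        available = True
        location = getattr(module, "__file__", None)
    except ImportError:
        available = False
        location = None
    return {
        "executable": sys.executable,       # interpreter actually in use
        "python": sys.version.split()[0],   # its version string
        "module": module_name,
        "available": available,             # True if the import succeeded
        "location": location,               # where the module was loaded from
    }


print(parser_report())
```

If "available" is False here while pip reports lxml as installed, the package most likely went to a different interpreter or environment.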

Source

2016-10-11 21:19:13

Thanks. The script was working on another machine. I had forgotten to install lxml on this new machine.
