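Could you suggest a fix? It downloads the image from almost every single-image imgur page, and I'm not sure why it fails in this case or how to fix it. bs4.FeatureNotFound: Couldn't find a tree builder with the features you requested: lxml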
elif 'imgur.com' in submission.url and not (submission.url.endswith('gif')
                                            or submission.url.endswith('webm')
                                            or submission.url.endswith('mp4')
                                            or 'all' in submission.url
                                            or '#' in submission.url
                                            or '/a/' in submission.url):
    html_source = requests.get(submission.url).text # download the image's page
    soup = BeautifulSoup(html_source, "lxml")
    image_url = soup.select('img')[0]['src']
    if image_url.startswith('//'):
        image_url = 'http:' + image_url
    image_id = image_url[image_url.rfind('/') + 1:image_url.rfind('.')]
    try:
        image_file = urllib2.urlopen(image_url, timeout = 5)
        with open('/home/mona/computer_vision/image_retrieval/images/'+ category+ '/'+ 'imgur_'+ datetime.datetime.now().strftime('%y-%m-%d-%s') + image_url[-9:], 'wb') as output_image:
            output_image.write(image_file.read())
    except urllib2.URLError as e:
        print(e)
        continue
The error is:
[LOG] Done Getting http://i.imgur.com/FoCjtI7.jpg
submission id is: 1alffm
[LOG] Getting url: http://sphotos-a.ak.fbcdn.net/hphotos-ak-ash4/217834_10151246341237704_484810759_n.jpg
HTTP Error 403: Forbidden
[LOG] Getting url: http://imgur.com/xp386
Traceback (most recent call last):
  File "download_images.py", line 67, in <module>
    soup = BeautifulSoup(html_source, "lxml")
  File "/usr/lib/python2.7/dist-packages/bs4/__init__.py", line 155, in __init__
    % ",".join(features))
bs4.FeatureNotFound: Couldn't find a tree builder with the features you requested: lxml. Do you need to install a parser library?
https://www.crummy.com/software/BeautifulSoup/bs4/doc/#installing-a-parser – Muposat
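As the linked page explains, this exception means Beautiful Soup could not find the lxml parser in the Python environment running the script, so installing it (for example with `pip install lxml`) should make the original line work. If installing is not an option, here is a minimal sketch that falls back to Python's built-in `html.parser`, assuming that parser copes with the imgur pages involved. `make_soup` is a hypothetical helper name, not something from the question's script.

```python
from bs4 import BeautifulSoup, FeatureNotFound


def make_soup(html_source):
    """Parse HTML with lxml when available, else with the built-in parser.

    Hypothetical helper: the name and the fallback are assumptions for
    illustration, not part of the original script.
    """
    try:
        return BeautifulSoup(html_source, "lxml")
    except FeatureNotFound:
        # lxml is not installed; html.parser ships with Python itself.
        return BeautifulSoup(html_source, "html.parser")
```

In the script, `soup = BeautifulSoup(html_source, "lxml")` would become `soup = make_soup(html_source)`. The two parsers can build slightly different trees from malformed HTML, so results may not be identical on every page.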