Error when web scraping with Python

I have tried many times.

But all I see is a traceback.

Please help me.

Here is the code I wrote:

import re 
import urllib.request 
import urllib 
import requests 
from bs4 import BeautifulSoup 

url='http://news.naver.com/main/ranking/read.nhn?mid=etc&sid1=111&rankingType=popular_week&oid=277&aid=0003773756&date=20160622&type=1&rankingSectionId=102&rankingSeq=1&m_view=1' 
html=request.get(url) 
#print(html.text) 
a=html.text 
bs=BeautifulSoup(a,'html.parser') 
print(bs.prettify()) 
bs.find('span',class="u_cbox_contents")

When I run this line: bs.find('span',class="u_cbox_contents")

I only see a lot of errors.

The error looks like this:

SyntaxError: invalid syntax

How can I fix the code so that it runs properly?

Please help me.

I am running Python 3.4.4 on Windows 8.1 (64-bit).

Thank you for reading.


2016-06-30 L.kyunam

Never, ever, ever use 'urllib' when you can use 'requests' instead.

@AkshatMahajan Do you mean I should try this code? import re; import urllib.request; from bs4 import BeautifulSoup; url='http://news.naver.com/main/ranking/read.nhn?mid=etc&sid1=111&rankingType=popular_week&oid=277&aid=0003773756&date=20160622&type=1&rankingSectionId=102&rankingSeq=1&m_view=1'; html=urllib.request.urlopen(url). It does not work either. I see the same error.

No, I mean that you are using the 'urllib' library rather than the 'requests' library to make the request. 'requests' is simply easier to work with. Do 'html = requests.get(url)'.

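Following @AkshatMahajan's suggestion, you can use the requests module as shown in the code that follows. You can also change the last line to find the element you want.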

##import re 
##import urllib.request 
##import urllib 
import requests 
from bs4 import BeautifulSoup 

url='http://news.naver.com/main/ranking/read.nhn?mid=etc&sid1=111&rankingType=popular_week&oid=277&aid=0003773756&date=20160622&type=1&rankingSectionId=102&rankingSeq=1&m_view=1' 
html=requests.get(url) 
#print(html.text) 
a=html.text 
bs=BeautifulSoup(a,'html.parser') 
print(bs.prettify()) 
print(bs.find('span',attrs={"class" : "u_cbox_contents"}))

Thanks also to @DiogoMartins for pointing out the correct Python version.
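The SyntaxError in the question comes from the fact that class is a reserved word in Python, so it cannot be written as a keyword argument. Here is a minimal sketch of three ways around that. The sample_html string is a made-up stand-in for the downloaded page (it is not the real Naver markup), so the snippet runs without network access.

```python
from bs4 import BeautifulSoup

# Hypothetical stand-in for the downloaded page, so the sketch runs offline.
sample_html = """
<div>
  <span class="u_cbox_contents">first comment</span>
  <span class="other">unrelated</span>
  <span class="u_cbox_contents">second comment</span>
</div>
"""

soup = BeautifulSoup(sample_html, "html.parser")

# `class` is a reserved word, so find('span', class="...") cannot be parsed.
# Beautiful Soup accepts `class_` (with a trailing underscore) instead.
by_keyword = soup.find("span", class_="u_cbox_contents")

# Passing the attribute in a dict, as in the answer, finds the same element.
by_attrs = soup.find("span", attrs={"class": "u_cbox_contents"})

# A CSS selector is a third option.
by_selector = soup.select_one("span.u_cbox_contents")

# find_all returns every match rather than only the first one.
all_texts = [
    tag.get_text()
    for tag in soup.find_all("span", class_="u_cbox_contents")
]

print(by_keyword.get_text())
print(all_texts)
```

If find returns None on the real page, one possible cause is that the element is not present in the HTML that requests receives, for example because the page fills it in later with JavaScript. That is only a guess about this particular page.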


2016-06-30 04:26:23 shaojl7

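Did you just copy the answer that @akshat gave in the comments?

@DiogoMartins Yes, I switched to requests as @akshat suggested in the comments. I also changed the last line, because the original line 'bs.find('span',class="u_cbox_contents")' has an invalid syntax error. Hope this helps as well – shaojl7

The right thing to do is to give @akshat credit in your answer. Also, the last line of your answer will cause a SyntaxError, because the question says he is running Python 3.4.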
