NameError: name 'htmltext' is not defined

I get an error when I run this script: NameError: name 'htmltext' is not defined

import urllib.request
import urllib.parse
from bs4 import BeautifulSoup

url = "http://nytimes.com,http://nytimes.com"

urls = [url]  # stack of urls to scrape
visited = [url]  # historic record of urls

while len(urls) > 0:
    try:
        htmltext = urllib.request.urlopen(urls[0]).read()
    except:
        print(htmltext)

Original text:

import urllib.request
import urllib.parse
from bs4 import BeautifulSoup

url = "http://nytimes.com,http://nytimes.com"

urls = [url]  # stack of urls to scrape
visited = [url]  # historic record of urls

while len(urls) > 0:
    try:
        htmltext = urllib.request.urlopen(urls[0]).read()
    except:
        print(urls[0])
    soup = BeautifulSoup(htmltext)

    urls.pop(0)

    print(soup.findAll('a', href=True))

Error:

socket.gaierror: [Errno -2] Name or service not known

urllib.error.URLError: urlopen error [Errno -2] Name or service not known

Traceback (most recent call last):

NameError: name 'htmltext' is not defined

Source

2014-10-26 gaia

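So what happens if you put 'http://nytimes.com,http://nytimes.com' into your browser's address bar? Also, your title doesn't match your description (but *of course* 'htmltext' isn't defined in the 'except' case - you are there because the assignment *failed*). – jonrsharpe 2014-10-26 18:53:40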

I don't know how that's possible, but it works now, sorry – gaia 2014-10-26 19:06:52

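I understand why it works now: I removed the second address from the "url" value. Maybe there was a conflict during the connection request because the address was doubled? – gaia 2014-10-26 20:13:36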

If urllib.request.urlopen() raises an exception, htmltext is never assigned a value (so printing that value in the except block won't work).
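One way to avoid touching an unassigned name is to keep the fetch in a small helper that either returns the page or returns None, and to skip the URL when it fails. This is only a sketch: fetch_html, crawl and the opener parameter are hypothetical names, and opener exists only so the code can be tried without network access.

```python
import urllib.request


def fetch_html(url, opener=urllib.request.urlopen):
    """Return the page body as bytes, or None if the request fails.

    The `opener` parameter is an assumption made for this sketch; by
    default the real urllib.request.urlopen is used.
    """
    try:
        with opener(url) as response:
            return response.read()
    except (OSError, ValueError) as exc:
        # URLError is a subclass of OSError; ValueError covers strings
        # that urlopen rejects outright (for example, a missing scheme).
        # Nothing was read, so report the URL and the error instead.
        print("could not fetch", url, "->", exc)
        return None


def crawl(urls, opener=urllib.request.urlopen):
    """Fetch each URL in turn and collect the pages that loaded."""
    pages = {}
    queue = list(urls)
    while queue:
        current = queue.pop(0)  # always remove the URL so the loop ends
        html = fetch_html(current, opener)
        if html is None:
            continue  # skip failed URLs instead of using an unset name
        pages[current] = html
    return pages
```

Because the URL is popped before the request is made, a failing address cannot keep the while loop spinning, and the parsing step only ever sees pages that were actually downloaded.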

As for why urlopen() isn't working, make sure you are passing it a valid URL.
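In the question, two addresses are joined by a comma inside one string, so urlopen() receives a single URL whose host part contains a comma, which presumably explains the "Name or service not known" error. A rough sketch of splitting and checking such input before fetching follows; split_urls, looks_like_http_url and clean_urls are hypothetical helper names, and the check is deliberately simple rather than a full URL validator.

```python
from urllib.parse import urlparse


def split_urls(raw):
    """Split a comma-separated string into individual, stripped URLs."""
    return [part.strip() for part in raw.split(",") if part.strip()]


def looks_like_http_url(url):
    """Rough check: http(s) scheme, a host, and no comma in the host part."""
    parts = urlparse(url)
    return (
        parts.scheme in ("http", "https")
        and bool(parts.netloc)
        and "," not in parts.netloc
    )


def clean_urls(raw):
    """Return the plausible URLs from `raw`, without duplicates, in order."""
    candidates = (u for u in split_urls(raw) if looks_like_http_url(u))
    return list(dict.fromkeys(candidates))


urls = clean_urls("http://nytimes.com,http://nytimes.com")
print(urls)
```

Starting the crawl from a list built this way gives each address its own entry, and the duplicate is dropped before any request is made.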

Source

2014-10-26 19:03:48 NPE

Thank you very much! Only now do I understand what "try" and "except" mean :D – gaia 2014-10-26 19:15:17
