2014-09-23 23 views
2

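Using urllib2 in Google App Engine raises "Deadline exceeded while waiting for HTTP response from URL: ..."

I am using urllib2 in Python for Google App Engine (GAE). The application frequently crashes with the following error: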

Deadline exceeded while waiting for HTTP response from URL: ....

The source looks like this:

import logging
import urllib2
import webapp2
from bs4 import BeautifulSoup

def functionRunning2To5Seconds_1():
    # Check whether the URL can be fetched and parsed
    try:
        url = "http://...someUrl..."
        req = urllib2.Request(url, headers={'User-Agent': 'Mozilla/5.0'})
        page = urllib2.urlopen(req)
        htmlSource = BeautifulSoup(page)
    except Exception as e:
        logging.info("Error : {er}".format(er=str(e)))

    # do some calculation with the data of htmlSource, which takes 2 to 5 seconds

# and the handler looks like:
class xyHandler(webapp2.RequestHandler):
    def post(self, uurl=None):
        r_data1 = functionRunning2To5Seconds_1()
        r_data2 = functionRunning2To5Seconds_2()
        r_data3 = functionRunning2To5Seconds_3()
        ...
        # show the results in a web page

I found this doc, which states:

You can use the Python standard libraries urllib, urllib2 or httplib to make HTTP requests. When running in App Engine, these libraries perform HTTP requests using App Engine's URL fetch service

and this:

You can set a deadline for a request, the most amount of time the service will wait for a response. By default, the deadline for a fetch is 5 seconds. The maximum deadline is 60 seconds for HTTP requests and 10 minutes for task queue and cron job requests.

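So how do I do this? How do I set a timeout on urllib2?

Or do I have to rewrite the whole application to use App Engine's URL Fetch service?

(PS: Does anyone know a safe way to run the "r_data1 = functionRunning2To5Seconds_...()" calls in parallel?)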

Answers

5

https://docs.python.org/2/library/urllib2.html

urllib2.urlopen(url[, data][, timeout]) 

The optional timeout parameter specifies a timeout in seconds for blocking operations like the connection attempt (if not specified, the global default timeout setting will be used).
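A minimal sketch of passing that parameter, reusing the Request and User-Agent header from the question. The helper name open_with_timeout, its opener argument (there only so the call can be exercised without a network), and the 30-second default are illustrative assumptions, as is the import fallback that lets the snippet run outside Python 2:

```python
try:
    import urllib2  # Python 2, as in the question
except ImportError:
    import urllib.request as urllib2  # assumption: fallback so the sketch runs on Python 3


def open_with_timeout(url, timeout=30, opener=None):
    """Open url with an explicit timeout in seconds and return the response.

    open_with_timeout and opener are hypothetical names used for illustration.
    """
    opener = opener or urllib2.urlopen
    req = urllib2.Request(url, headers={'User-Agent': 'Mozilla/5.0'})
    # timeout is the third, optional argument of urlopen
    return opener(req, timeout=timeout)
```

In the question's code this would replace the bare urllib2.urlopen(req) call, for example page = open_with_timeout(url, timeout=45).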

2

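As Paul suggested, you can pass the timeout argument. On App Engine it is tied to URL Fetch and adjusts its deadline, up to a maximum of 60 seconds. Keep in mind that if urlopen takes longer than the time given in the timeout parameter, you will get a DeadlineExceededError from google.appengine.api.urlfetch_errors.DeadlineExceededError rather than the usual socket.timeout. Catching this error and retrying or logging as needed is good practice. For more on handling DeadlineExceededError, see [1].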

[1] - https://developers.google.com/appengine/articles/deadlineexceedederrors
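One way the catch-and-retry advice might look, as a rough sketch rather than a recommended implementation. The helper fetch_with_retry, its fetch callable, and the attempt count are hypothetical; the stand-in exception class exists only so the snippet can run where the App Engine SDK is not installed:

```python
import logging

try:
    from google.appengine.api.urlfetch_errors import DeadlineExceededError
except ImportError:
    # Assumption: stand-in for running outside App Engine, not the SDK class.
    class DeadlineExceededError(Exception):
        pass


def fetch_with_retry(fetch, url, timeout=30, attempts=3):
    """Call fetch(url, timeout=...) and retry when the URL Fetch deadline is exceeded.

    fetch is any callable with that signature (hypothetical, for illustration).
    Returns the fetch result, or None if every attempt timed out.
    """
    for attempt in range(1, attempts + 1):
        try:
            return fetch(url, timeout=timeout)
        except DeadlineExceededError as e:
            logging.warning("Attempt %d/%d for %s timed out: %s",
                            attempt, attempts, url, e)
    return None
```

On App Engine, fetch could be a thin wrapper around urllib2.urlopen. Retries add to the total time spent inside the handler, so the attempt count should stay small.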

+0

Where did you get the 10-second limit from? All I can find is 60 seconds. Is that specific to something, or just a typo? – user1911091 2014-09-24 06:57:02

+0

Typo, sorry. The limit is 60 seconds / 10 minutes (task queue). – 2014-09-24 07:20:33
