2011-06-28 66 views
3

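EOF error when making an HTTP request with Ruby

I'm currently writing a script that iterates over a list of URLs and does some processing on each of them. However, one URL in my list is giving me a problem. The code is as follows: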

require 'net/http'
require 'uri'

url = "https://secure.www.alumniconnections.com/olc/pub/CDB/events/attendance.cgi?tmpl=attendance&event=2309515&sort=4"
uri = URI.parse(url)
response = Net::HTTP.get_response(uri)

The last line raises the following error:

EOFError: end of file reached 
    from /usr/lib/ruby/1.8/net/protocol.rb:135:in `sysread' 
    from /usr/lib/ruby/1.8/net/protocol.rb:135:in `rbuf_fill' 
    from /usr/lib/ruby/1.8/timeout.rb:67:in `timeout' 
    from /usr/lib/ruby/1.8/timeout.rb:101:in `timeout' 
    from /usr/lib/ruby/1.8/net/protocol.rb:134:in `rbuf_fill' 
    from /usr/lib/ruby/1.8/net/protocol.rb:116:in `readuntil' 
    from /usr/lib/ruby/1.8/net/protocol.rb:126:in `readline' 
    from /usr/lib/ruby/1.8/net/http.rb:2028:in `read_status_line' 
    from /usr/lib/ruby/1.8/net/http.rb:2017:in `read_new' 
    from /usr/lib/ruby/1.8/net/http.rb:1051:in `request' 
    from /usr/lib/ruby/1.8/net/http.rb:948:in `request_get' 
    from /usr/lib/ruby/1.8/net/http.rb:380:in `get_response' 
    from /usr/lib/ruby/1.8/net/http.rb:543:in `start' 
    from /usr/lib/ruby/1.8/net/http.rb:379:in `get_response' 
    from (irb):5 
    from /usr/lib/ruby/1.8/uri/ftp.rb:190 

None of the other URLs in my list seem to give me any trouble. Can anyone explain why I'm getting this error?

Answers

6

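When I enter https://secure.www.alumniconnections.com/ it seems to redirect me to http://www.harrisconnect.com/. My guess is that your code can't handle the redirect. Try using Mechanize (http://mechanize.rubyforge.org/) to deal with this. I would also suggest wrapping your code in some error handling, such as: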

# Prevent infinite loops by counting the retries
counter = 0

begin
  # Your code here

rescue EOFError
  puts "encountered EOFError"

  # Give up after 3 retries
  if counter < 3
    counter += 1
    puts "retry: #{counter}"
    # retry re-runs the begin block (redo is only valid inside a loop or block)
    retry
  else
    puts "FAILED CONNECTION #{counter} TIMES"
    counter = 0
  end
end

This tries to reconnect to the URL. It has helped me with flaky connections many times in the past.
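The same retry idea can be wrapped in a small reusable helper. This is only a sketch: `with_retries` and its `max_retries` parameter are hypothetical names chosen for illustration, not part of Net::HTTP or any library.

```ruby
# Hypothetical helper (an assumption, not a library method):
# runs the given block and retries it when EOFError is raised,
# re-raising the error once max_retries retries have been used up.
def with_retries(max_retries = 3)
  attempts = 0
  begin
    yield
  rescue EOFError
    attempts += 1
    raise if attempts > max_retries  # out of retries: let the caller see the error
    retry                            # re-run the begin block, i.e. call the block again
  end
end
```

You could then write something like `response = with_retries { Net::HTTP.get_response(uri) }`. Because the error is re-raised after the last attempt, the caller still learns that the request ultimately failed.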

Edit:

require 'rubygems' 
require 'mechanize' 

agent = Mechanize.new 
html_text = agent.get("https://secure.www.alumniconnections.com/olc/pub/CDB/events/attendance.cgi?tmpl=attendance&event=2309515&sort=4").body 

html_file = File.open("html_file.html", "w") 
html_file.write(html_text) 
html_file.close 

This worked for me, so give it a try. It writes your page to a file just fine.
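If redirects really are the cause, they can also be followed with plain Net::HTTP instead of Mechanize. A minimal sketch, assuming a modern Ruby where net/http handles HTTPS itself (on 1.8 you would also need require 'net/https'); `fetch_following_redirects` and its `limit` parameter are hypothetical names for illustration, not part of Net::HTTP:

```ruby
require 'net/http'
require 'uri'

# Hypothetical helper (an assumption, not a Net::HTTP method):
# issues a GET and follows redirects, making at most `limit` requests.
def fetch_following_redirects(url, limit = 5)
  raise ArgumentError, 'too many HTTP redirects' if limit <= 0

  uri = URI.parse(url)
  http = Net::HTTP.new(uri.host, uri.port)
  http.use_ssl = (uri.scheme == 'https')  # speak TLS only for https URLs
  response = http.request(Net::HTTP::Get.new(uri.request_uri))

  case response
  when Net::HTTPRedirection
    # Location may be relative, so resolve it against the current URL
    next_url = URI.join(url, response['location']).to_s
    fetch_following_redirects(next_url, limit - 1)
  else
    response
  end
end
```

Calling `fetch_following_redirects(url).body` with the URL from the question should then give the body of the final page, assuming the server answers at all. The limit guards against redirect loops.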

+1

Could your first snippet cause an infinite loop? –

+1

Yes, I believe it could. Somehow I never ran into that. I'll make a quick edit to fix it, though I can't promise it will be the best solution. – scradge

0

If this is HTTPS rather than plain HTTP, you could try this (works on Ruby 1.8.6):

require 'rubygems' 
require "net/https" 
require "uri" 


address = "https://www.your-secure-domain-here.com" 
uri = URI.parse(address) 
http = Net::HTTP.new(uri.host, uri.port) 
http.use_ssl = true 
http.verify_mode = OpenSSL::SSL::VERIFY_NONE 
request = Net::HTTP::Get.new(uri.request_uri) 
request.basic_auth("username", "password") 
response = http.request(request) 

In my case, instead of "username" and "password" I had to use "SECRET-API-KEY" and "api_token".

Give it a try and see if it helps.
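Note that VERIFY_NONE turns off certificate checking entirely. Below is a sketch of the same client setup with verification left on, assuming the server's certificate chains to a CA your system trusts. `build_https_client` and the timeout values are illustrative assumptions, not something from the answer above:

```ruby
require 'net/http'
require 'uri'
require 'openssl'

# Hypothetical helper (an assumption): builds a Net::HTTP client for the
# given address with TLS certificate verification enabled. No connection
# is opened until a request is actually made.
def build_https_client(address)
  uri = URI.parse(address)
  http = Net::HTTP.new(uri.host, uri.port)
  http.use_ssl = (uri.scheme == 'https')
  http.verify_mode = OpenSSL::SSL::VERIFY_PEER  # reject untrusted certificates
  http.open_timeout = 10  # seconds to wait for the connection (illustrative)
  http.read_timeout = 30  # seconds to wait for each read (illustrative)
  http
end
```

A request would then look like `build_https_client(address).request(Net::HTTP::Get.new(URI.parse(address).request_uri))`. If verification fails, fixing the CA configuration is usually preferable to falling back to VERIFY_NONE.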
