NoMethodError from Mechanize

Running this code with Mechanize 2.7.3 and Ruby 2.3.0dev:

require 'mechanize' 
agent = Mechanize.new 

agent.keep_alive = false 
agent.open_timeout = 2 
agent.read_timeout = 2 
agent.ignore_bad_chunking = true 
agent.gzip_enabled = false 

url = 'http:%5C%5Cwww.scouts.org.uk' 

agent.head(url)

gives me this NoMethodError:

~/.rvm/gems/ruby-head/gems/mechanize-2.7.3/lib/mechanize/http/agent.rb:648:in `resolve': undefined method `length' for nil:NilClass (NoMethodError)
from ~/.rvm/gems/ruby-head/gems/mechanize-2.7.3/lib/mechanize/http/agent.rb:223:in `fetch'
from ~/.rvm/gems/ruby-head/gems/mechanize-2.7.3/lib/mechanize.rb:459:in `head'

Is this a bug in Mechanize, or am I doing something wrong? If it is my mistake, how do I fix it?

Edit: The URL is obviously wrong, but I'm reading a lot of URLs from a file, and some of them may be malformed.

EDIT2: Say I have a list like this: http://pastie.org/9934756. I need to get the headers for all the URLs that are correct and ignore the others.

Source

2015-02-10 user1759796

No change with a timeout of 10 or 20. – user1759796 2015-02-10 11:34:44

You wrote the URL incorrectly. Try this: url = 'http://scouts.org.uk'

Source

2015-02-10 10:31:45

I know. But there are a lot of URLs, and some of them may be wrong. Shouldn't the error be something like 404 Not Found? – user1759796 2015-02-10 10:42:46

@user1759796 Your mistake is the "%5C%5C"; that makes the URL invalid. A valid URL looks like "http:// google.com/", "http:// scouts.org.uk", etc. (without the spaces). – 2015-02-10 10:48:05

See my edit. I know the URL is wrong; I just need to handle it properly. – user1759796 2015-02-10 11:38:06

Your target site redirects and uses a meta refresh. Update your code to include these settings:

require 'mechanize' 

agent = Mechanize.new 
agent.keep_alive = false 
agent.follow_meta_refresh = true 
agent.redirect_ok = true 
agent.open_timeout = 10 
agent.read_timeout = 10 
agent.ignore_bad_chunking = true 
agent.gzip_enabled = false 

url = 'http:%5C%5Cwww.scouts.org.uk' 

begin 
    page_head = agent.head(url) 
rescue StandardError => exception  # rescuing Exception would also trap Interrupt and SystemExit
    puts "Caught exception: #{exception.message}" 
end

Result:

=> #Caught exception: undefined method `length' for nil:NilClass

Source

2015-02-10 11:28:03 JonB

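This doesn't change anything. You used the correct URL (without the %5c). When a URL is malformed, I need to get an error that I can catch, not a NoMethodError. The problem is that I don't know whether all the URLs are correctly formatted. – user1759796 2015-02-10 11:36:33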

Updated the code to catch the exception. How you handle it is up to you; I've only given a basic example. More on [Ruby Exceptions](http://ruby-doc.org/core-1.9.3/Exception.html) and [exception handling](http://rubylearning.com/satishtalim/ruby_exceptions.html). – JonB 2015-02-10 12:36:27

You may also want to look at [this post](http://stackoverflow.com/questions/1805761/check-if-url-is-valid-ruby). – JonB 2015-02-10 12:47:28
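Since the question is about a whole file of URLs, the rescue approach above can be applied per URL so that one bad entry does not stop the run. The following is only a minimal sketch: `fetch_heads` and `fake_fetcher` are hypothetical names that do not come from Mechanize, and the stand-in fetcher merely imitates the error from the question so the example runs without network access.

```ruby
# Hypothetical helper (not part of Mechanize): call `fetcher` for every URL
# and separate the URLs that worked from the ones that raised.
def fetch_heads(urls, fetcher)
  heads = {}   # url => whatever the fetcher returned
  errors = {}  # url => "ErrorClass: message"

  urls.each do |url|
    begin
      heads[url] = fetcher.call(url)
    rescue StandardError => e
      # NoMethodError is a StandardError, so it is caught here as well.
      errors[url] = "#{e.class}: #{e.message}"
    end
  end

  [heads, errors]
end

# Stand-in fetcher (an assumption for illustration only): raises for URLs
# containing an encoded backslash, returns a placeholder string otherwise.
fake_fetcher = lambda do |url|
  raise NoMethodError, "undefined method `length' for nil:NilClass" if url.include?('%5C')
  "HEAD ok for #{url}"
end

urls = ['http://scouts.org.uk', 'http:%5C%5Cwww.scouts.org.uk']
heads, errors = fetch_heads(urls, fake_fetcher)
puts heads.keys.inspect
puts errors.inspect
```

With the `agent` configured as in the answer, the fetcher could be `->(url) { agent.head(url) }`. The failed URLs end up in `errors` and can be logged or skipped.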

You can add this method to check whether a URL is valid:

require 'uri' 
# Prints '+' for HTTP(S) URIs, '-' for other schemes, 'false' if parsing fails.
def valid?(url) 
    uri = URI.parse(url) 
    if uri.kind_of?(URI::HTTP) 
        puts '+' 
    else 
        puts '-' 
    end 
rescue URI::InvalidURIError 
    puts 'false' 
end 

['http://web.de', 
'http://web.de/', 
'http:%5c%5cweb.de', 
'http:web.de', 
'foo://web.de', 
'http://we b.de', 
'http://|web.de'].each { |i| 
    valid?(i) 
}

+
+
+
+
-
false
false

Source

2015-02-10 12:46:51

For the URL the OP provided, this returns "true", but the URL isn't valid. – JonB 2015-02-10 12:54:36

@JonB Check it now. – 2015-02-10 12:59:33

Yes, they return true, and some of them will work in a browser, but Mechanize still won't load them. – JonB 2015-02-10 13:03:49
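One possible way to tighten the check, as the comments above suggest is needed, is to require a host in addition to an HTTP scheme. This is a sketch under an assumption the thread does not confirm, namely that the URLs Mechanize chokes on are the ones that parse without a host. `fetchable_http_url?` is a hypothetical name; only the standard `uri` library is used.

```ruby
require 'uri'

# Hypothetical helper: true only for http/https URLs that parse cleanly
# and actually contain a host.
def fetchable_http_url?(url)
  uri = URI.parse(url)
  # URI::HTTPS is a subclass of URI::HTTP, so both schemes pass this test.
  uri.is_a?(URI::HTTP) && !uri.host.to_s.empty?
rescue URI::InvalidURIError
  false
end

urls = ['http://web.de',
        'http:%5c%5cweb.de',
        'http:web.de',
        'foo://web.de',
        'http://we b.de']

# Keep only the URLs worth handing to the HTTP client.
good = urls.select { |u| fetchable_http_url?(u) }
puts good.inspect
```

Unlike `valid?` above, this returns a boolean instead of printing, so it can be used directly in `select` or `reject` when filtering the list read from the file.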
