Ruby Mechanize scraping: ResponseCodeError

I want to scrape all the search result pages of a website. It works, but sometimes the script stops and shows the following error:

502 => Net::HTTPBadGateway for https://website.com/id/12/ -- unhandled response (Mechanize::ResponseCodeError)

I want the script to keep going even when it hits an error.

My script:

require 'mechanize' 
require 'csv' 

a = Mechanize.new 
CSV.open('datas.csv', "wb") do |csv| 
    page = a.get("https://website.com/?page=1-200") #498 
    number = 0 
    page.links_with(:class => "btn btn-default").each do |link| 
     post_link = link.href 
     inside_page = a.get("https://website.com#{post_link}") 
     title = inside_page.at("h1.serviceTitle").text.strip 
     author = inside_page.at(".name").text.strip 
     number+=1 
     csv << [title, author] 
    end 
end

Any ideas?

2017-10-19 Rubyx

This is easy to solve with proper exception handling. You can check this page for a better explanation.

For your code, you can handle the exception like this:

require 'mechanize'
require 'csv'

a = Mechanize.new
CSV.open('datas.csv', "wb") do |csv|
    page = a.get("https://website.com/?page=1-200") #498
    number = 0
    page.links_with(:class => "btn btn-default").each do |link|
     begin
      post_link = link.href
      inside_page = a.get("https://website.com#{post_link}")
      title = inside_page.at("h1.serviceTitle").text.strip
      author = inside_page.at(".name").text.strip
      number+=1
      csv << [title, author]
     rescue Mechanize::ResponseCodeError => e
      # Rescue inside the loop so one failed request (e.g. a 502)
      # skips only that link instead of ending the whole run.
      warn "Skipping #{post_link}: HTTP #{e.response_code}"
     end
    end
end
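
A 502 is often temporary, so it may be worth retrying a request a few times before skipping it. The following is only a sketch: `with_retries` is a hypothetical helper (not part of Mechanize), and the attempt count and delay are assumed values to tune for the site.

```ruby
# Hypothetical helper: run the block, retrying when one of the listed
# exception classes is raised. Returns the block's value, or nil if
# every attempt failed.
def with_retries(attempts: 3, delay: 1, on: [StandardError])
  tries = 0
  begin
    tries += 1
    yield
  rescue *on => e
    if tries < attempts
      sleep(delay * tries) # simple linear backoff between attempts
      retry
    end
    warn "Giving up after #{tries} attempts: #{e.message}"
    nil
  end
end

# Possible use inside the links loop (assumes `a` and `post_link` from the script):
#   inside_page = with_retries(on: [Mechanize::ResponseCodeError]) do
#     a.get("https://website.com#{post_link}")
#   end
#   next if inside_page.nil?
```

Returning nil on final failure keeps the loop simple: the caller just skips that link and moves on.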

2017-10-19 16:15:30 Maru

Thanks, it works! I'll check your link. – Rubyx
