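Ruby Mechanize scraping: ResponseCodeError
I want to scrape all of a website's search result pages. It works, but sometimes the script stops and shows the following error: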
502 => Net::HTTPBadGateway for https://website.com/id/12/ -- unhandled response (Mechanize::ResponseCodeError)
I want the script to keep going even when it hits an error like this.
My script:
require 'mechanize'
require 'csv'

a = Mechanize.new

CSV.open('datas.csv', "wb") do |csv|
  page = a.get("https://website.com/?page=1-200") #498
  number = 0
  page.links_with(:class => "btn btn-default").each do |link|
    post_link = link.href
    inside_page = a.get("https://website.com#{post_link}")
    title = inside_page.at("h1.serviceTitle").text.strip
    author = inside_page.at(".name").text.strip
    number+=1
    csv << [title, author]
  end
end
Any ideas?
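One possible approach, as a rough sketch rather than a definitive fix: wrap each `get` call in a `begin`/`rescue` for `Mechanize::ResponseCodeError`, retry a few times, and skip the page if it still fails. The names `with_retries` and `scrape` below are hypothetical helpers, not part of Mechanize, and the retry count and delay are arbitrary assumptions. The sketch assumes the mechanize gem is installed and `require 'mechanize'` has been called before `scrape` runs.

```ruby
require 'csv'

# Hypothetical helper (not part of Mechanize): runs the block and retries
# when one of the given error classes is raised. Returns the block's value,
# or nil once all attempts have failed.
def with_retries(error_classes, attempts: 3, delay: 1)
  tries = 0
  begin
    yield
  rescue *error_classes => e
    tries += 1
    if tries < attempts
      sleep(delay) # assumed pause before trying again
      retry
    end
    warn "Giving up after #{tries} attempts: #{e.message}"
    nil
  end
end

# Hypothetical wrapper around the loop from the question. `agent` is expected
# to be a Mechanize instance; `require 'mechanize'` must be called beforehand.
def scrape(agent, csv_path, list_url, base_url)
  errors = [Mechanize::ResponseCodeError]
  CSV.open(csv_path, 'wb') do |csv|
    page = with_retries(errors) { agent.get(list_url) }
    return if page.nil? # the listing page itself could not be fetched

    page.links_with(:class => 'btn btn-default').each do |link|
      inside_page = with_retries(errors) { agent.get("#{base_url}#{link.href}") }
      next if inside_page.nil? # skip this result and continue with the next

      title = inside_page.at('h1.serviceTitle').text.strip
      author = inside_page.at('.name').text.strip
      csv << [title, author]
    end
  end
end
```

With the values from the question, this would be called as `scrape(Mechanize.new, 'datas.csv', 'https://website.com/?page=1-200', 'https://website.com')`. A 502 is often temporary, which is why the sketch retries before skipping; if you would rather skip immediately, pass `attempts: 1`.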
Thanks, it works! I'll check out your link. – Rubyx