2013-11-15 13 views
0

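How do I access a certain value in an array of Nokogiri "hashes" from a scrape?

This is the specific bit of code I am using for the scrape: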

require 'singleton' 
require 'open-uri' 

class ProgramHighlights < ActiveRecord::Base 

    self.table_name = 'program_highlights' 
    include ActiveRecord::Singleton 

    def fetch 
        url = "http://kboo.fm/" 
        doc = Nokogiri::HTML(open(url)) 
        titles = [] 
        program_title = doc.css(".title a").each do |title| 
            titles.push(title) 
        end 
    end 
end 

When I access the titles array and iterate over it with each, my output is:

(Element:0x5b40910 { 
    name = "a", 
    attributes = [ 
    #(Attr:0x5b8c310 { 
     name = "href", 
     value = "/content/thedeathsofothersthefateofciviliansinamericaswars" 
     }), 
    #(Attr:0x5b8c306 { 
     name = "title", 
     value = "The Deaths of Others: The Fate of Civilians in America's Wars" 
    })], 
    children = [ 
    #(Text "The Deaths of Others: The Fate of Civilians in America's Wars")] 
    }) 

I specifically want to get the "value", but none of the following retrieve it:

titles[0].value 
titles[0]["value"] 
titles[0][value] 

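I don't know why I can't access it, since it looks like a hash. Any pointers in the right direction? I can't get the data in a simple JSON format, which is why I need to scrape.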

+0

What do you get if you take a 'title' and call 'title.element' or 'title.elements' on it? – CDub

+0

'element' gives an undefined method error for the Element; 'elements' returns an empty array. – yburyug

+0

What is 'title.class'? – CDub

Answers

1

To get the value of a node's attribute, you can use ['attribute_name']. For example:

require 'nokogiri' 
html = %Q{ 
    <html> 
     <a href="/content/thedeathsofothersthefateofciviliansinamericaswars" title="The Deaths of Others: The Fate of Civilians in America's Wars"></a> 
    </html> 
} 
doc = Nokogiri::HTML(html) 
node = doc.at_css('a') 
puts node['href'] 
#=> /content/thedeathsofothersthefateofciviliansinamericaswars 
puts node['title'] 
#=> The Deaths of Others: The Fate of Civilians in America's Wars 

Assuming you want the title attribute value of each link, you can do:

program_title = doc.css(".title a").each do |link| 
    titles.push(link['title']) 
end 
+0

Yes, that is how it should work! –