2011-05-02 38 views

Ruby Nokogiri: parsing HTML table III

I'm using Mechanize/Nokogiri and need to parse a large number of tables like this one out of an HTML page:

<table width="100%" onclick="javascript:abredown('c7a8e8041a5031f127d5d27f3f071cbb');" class="buscaDestaque" bgcolor="#F7D36A"> 
    <tr> 
    <td rowspan="2" scope="col" style="width:5%"><img src="images/gold.gif" border="0"></td> 
    <td scope="col" style="width:45%" class="mais"><b>Community - 2nd Season</b><br />Community - 2&ordf; Temporada<br/><b>Downloads: </b> 2496 <b>Comentários: </b>17<br><b>Avaliação: </b> 10/10</td> 
    <td scope="col" style="width:20%">28/03/2011 - 21:07</td> 
    <td scope="col" style="width:20%"><a href="javascript:abreinfousuario(1083150)">SubsOTF</a></td> 
    <td scope="col" style="width:10%"><img src='images/flag_br.gif' border='0'></td> 
    </tr> 
    <tr> 
    <td colspan="4">Release: <span class="brls">Community.S02E19.HDTV.XviD-LOL/DIMENSION</span></td> 
    </tr> 
</table> 

I want this output:

Community.S02E19.HDTV.XviD-LOL/DIMENSION, ('c7a8e8041a5031f127d5d27f3f071cbb') 

Can anyone help me?

Answer

require 'nokogiri'

# html_with_many_tables is assumed to hold the page source as a String
html = Nokogiri::HTML(html_with_many_tables)
results = html.css('table.buscaDestaque').map do |table|
    jsid = table['onclick'][/'(\w+)'/, 1]  # quoted id inside the onclick handler
    brls = table.at_css('.brls').text      # release name from <span class="brls">
    "#{brls}, #{jsid}"
end
p results
#=> ["Community.S02E19.HDTV.XviD-LOL/DIMENSION, c7a8e8041a5031f127d5d27f3f071cbb",
#=>  "AnotherBRLS, anotherJSID"]
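The answer's output drops the parentheses and quotes that the question asked for, and it would raise a NoMethodError on any matching table that has no onclick attribute. As a rough sketch of how those two points might be handled, using only the standard library: the helper names extract_js_id and format_result below are hypothetical and not part of the original answer.

```ruby
# Hypothetical helpers, not part of the original answer.

# Pull the first single-quoted word out of an onclick attribute value.
# Returns nil when the attribute is missing or holds no quoted id.
def extract_js_id(onclick)
  return nil if onclick.nil?
  onclick[/'(\w+)'/, 1] # String#[] with a capture index returns group 1
end

# Format one result the way the question asked: release, ('id')
def format_result(brls, jsid)
  "#{brls}, ('#{jsid}')"
end

onclick = "javascript:abredown('c7a8e8041a5031f127d5d27f3f071cbb');"
jsid = extract_js_id(onclick)
puts format_result("Community.S02E19.HDTV.XviD-LOL/DIMENSION", jsid)
# prints: Community.S02E19.HDTV.XviD-LOL/DIMENSION, ('c7a8e8041a5031f127d5d27f3f071cbb')
```

Inside the map block from the answer, this would be called as extract_js_id(table['onclick']), and tables for which it returns nil could be skipped.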

Thanks. Exactly what I needed. – Hodes 2011-05-02 04:45:07