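How do I handle blank array elements in a loop?

I have a Ruby script that iterates over a list of projects. For each project, it walks an HTML table, collects the td text of each row, and adds it to an array.

The problem is that when the table is empty for a particular project, an empty array gets added to my two-dimensional array, which then causes errors when I try to use that array to insert data into the SQL database. How can I prevent empty arrays from being appended to my array in the first place?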

projects.each do |project_id| 
    url = "http://myurl.com/InventoryMaster.aspx?Qtr=%s&Client=%s" % [qtr,project_id[1]] 

    page = Nokogiri::HTML(open(url)) 
    table = page.at('my_table') 

    rows = Array.new 
    table.search('tr').each do |tr|
        cells = Array.new

        tr.search('td').each do |cell|
            cells.push(cell.text.gsub(/\r\n?/, "").strip)
        end
        # add the project id to the cells array, and get rid of other array elements I don't need.
        cells.insert(1, project_id[0])
        cells.slice!(11, 6)
        cells.delete_at(8)
        cells.delete_at(2)
        cells.delete_at(0)
        rows.push(cells)
    end

    # first row in the array in the html table is headers. get rid of those. 
    rows.shift 
    # last row in the html table is the footers. get rid of those too. 
    rows.pop 

    p rows 

end

Here is the HTML I'm parsing, as requested:

<table id="ctl00_MainContent_gvSearchResults" cellspacing="1" cellpadding="1" 
border="1" style="color:Black;background-color:LightGoldenrodYellow;border-color:Tan; 
border-width:1px;border-style:solid;" rules="cols"> 

<caption></caption> 
<tbody> 
    <tr style="background-color:Tan;font-weight:bold;"> 
#I don't need the headers. 
     <th scope="col"></th> 
     <th scope="col"></th> 
     <th scope="col"></th> 
     <th scope="col"></th> 
     <th scope="col"></th> 
     <th scope="col"></th> 
     <th scope="col"></th> 
     <th scope="col"></th> 
     <th scope="col"></th> 
     <th scope="col"></th> 
     <th scope="col"></th> 
     <th scope="col"></th> 
     <th scope="col"></th> 
     <th scope="col"></th> 
     <th scope="col"></th> 
     <th scope="col"></th> 
    </tr> 
    <tr style="font-family:arial,tahoma;font-size:Smaller;"> 
     <td>not needed</td> 
     <td>not needed</td> 
     <td>needed</td> 
     <td align="right">needed</td> 
     <td>needed</td> 
     <td>needed</td> 
     <td>needed</td> 
     <td>needed</td> 
     <td>not needed</td> 
     <td>needed</td> 

#I don't need any of the remaining td's in this row either. 
     <td align="right"></td> 
     <td align="right"></td> 
     <td align="right"></td> 
     <td align="right"></td> 
     <td align="right"></td> 
     <td></td> 
    </tr> 
#this row is the footer, and it isn't needed either. 
    <tr style="background-color:Tan;"> 
     <td></td> 
     <td></td> 
     <td></td> 
     <td></td> 
     <td></td> 
     <td></td> 
     <td></td> 
     <td></td> 
     <td></td> 
     <td></td> 
     <td></td> 
     <td></td> 
     <td></td> 
     <td></td> 
     <td></td> 
     <td></td> 
    </tr> 
</tbody>
</table>

Once I've parsed the table, I need to add in the project ID, which is part of the key-value pair contained in the projects array.

Source

2014-01-10 hyphen

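Show some of the HTML to make your question complete. With that, we can easily show you how to parse it correctly, rather than trying to clean up after the fact. –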

After table = page.at('my_table'), a check such as 'if table.children.size <= 1' (something that tests whether my_table is blank) should let you skip the empty tables – bjhaid
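To illustrate the idea of skipping a project whose table has no data, here is a minimal sketch in plain Ruby. It does not use Nokogiri: the tables hash is a hypothetical stand-in for the scraped pages (project id => rows of cell text, header first and footer last), and all_rows is an invented name for the final two-dimensional array.

```ruby
# Hypothetical stand-in for the scraped data (not from the original post):
# project id => table rows, each row an array of cell strings.
tables = {
  "P1" => [["h1", "h2"], ["a", "b"], ["", ""]], # header, one data row, footer
  "P2" => [["h1", "h2"], ["", ""]],             # header and footer only
  "P3" => []                                    # no rows at all
}

all_rows = []
tables.each do |project_id, table_rows|
  # Drop the header and footer; slicing an empty array returns nil, hence || [].
  data_rows = table_rows[1...-1] || []

  # Skip this project entirely when nothing is left to insert.
  next if data_rows.empty?

  data_rows.each { |cells| all_rows << [project_id, *cells] }
end

p all_rows
```

With this guard in place, only projects that actually have data rows contribute to the array that is later handed to the database.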

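@Tian Man - I added my HTML table. I should mention that the last 3 td's I need to parse are dates, and they need to be parsed as mm-dd-yyyy. I just realized that this script also has a problem when the day part of the date is a single digit. – hyphen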
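For the single-digit day problem, one possible approach is to parse the text into a Date and format it again, instead of manipulating the string. The sketch below assumes the cells contain month/day/year separated by slashes (for example 1/5/2014); the thread does not show the actual format, so adjust the pattern to match. normalize_date is a hypothetical helper name.

```ruby
require "date"

# Hypothetical helper. Assumes the input looks like "m/d/yyyy"; the real
# format of the date cells is not shown in the question.
def normalize_date(text)
  # %m and %d accept one or two digits when parsing; strftime zero-pads.
  Date.strptime(text.strip, "%m/%d/%Y").strftime("%m-%d-%Y")
rescue ArgumentError
  nil # unparseable or blank cell
end

p normalize_date("1/5/2014")
p normalize_date("12/25/2014")
p normalize_date("not a date")
```

Returning nil for cells that cannot be parsed makes it easy to decide later whether to skip the row or insert a NULL.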

Try filtering the projects array before iterating:

projects.reject(&:empty?).each do |project_id|

Now you iterate over only the non-empty arrays.

Example time:

array = [ [1], [], [2, 3] ] 
array.reject &:empty? # => [ [1], [2, 3] ]

Neat.

Source

2014-01-10 21:39:52 DiegoSalazar

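This returns: 'empty?': wrong number of arguments (1 for 0) (ArgumentError) – hyphen

FYI, the projects array is populated with key => value pairs. I'm not sure what that has to do with the error I'm getting from your solution. – hyphen

Is the array full of hashes, or is projects itself a hash? The former would work – DiegoSalazar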
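If projects is itself a hash, that would explain the error: Hash#reject yields the key and the value as two separate arguments, so &:empty? ends up calling empty? on the key with the value as an argument. A minimal sketch, using made-up project ids and client names, with an explicit block instead:

```ruby
# Hypothetical data: project id => client. The names are invented for
# illustration and are not from the original script.
projects = { "1001" => "ClientA", "" => "ClientB", "1003" => "" }

begin
  # Effectively key.empty?(value), which raises ArgumentError.
  projects.reject(&:empty?)
rescue ArgumentError => e
  error_message = e.message
end

# An explicit two-argument block works for a hash.
valid = projects.reject { |project_id, client| project_id.empty? || client.empty? }

p error_message
p valid
```

Note that this only filters out blank entries in projects; it does not help when the HTML table for a valid project turns out to be empty.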

You can also use the delete_if method:

array = [ [1], [], [2, 3] ] 
array.size # => 3 
array.delete_if &:empty? # => [ [1], [2, 3] ]  
array.size # => 2
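One difference worth keeping in mind: reject returns a new array and leaves the receiver untouched, while delete_if removes the elements in place. A small sketch, where rows is just an illustrative variable name:

```ruby
rows = [[1], [], [2, 3]]

# reject builds a new array; rows itself still has three elements.
filtered = rows.reject(&:empty?)
size_after_reject = rows.size

# delete_if mutates rows and returns the same array object.
rows.delete_if(&:empty?)
size_after_delete_if = rows.size

p filtered
p size_after_reject
p size_after_delete_if
```

Use reject if other code still needs the unfiltered data, and delete_if if you simply want to clean up the array before inserting.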

Source

2014-01-10 21:52:07 orde
