2015-10-06 56 views

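I have a large data set with a fixed number of categories. I was initially storing everything in an array of hashes. That worked, but given the size of the data and the redundancy of the categories, it was not efficient. What is the Ruby way to add a hash to an existing array of hashes?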

I am now using a hash keyed by type/category, and storing an array of hashes under each category.

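My current way of adding data is to delete the :type key from each hash before adding it to that type's array. Everything works. However, I am sure there is a more streamlined "Ruby way" of doing this: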

# Very large data set with redundant types. 
gigantic_array = [ 
    { type: 'a', organization: 'acme inc', president: 'bugs bunny' }, 
    { type: 'a', organization: 'looney toons', president: 'donald' }, 
    { type: 'b', organization: 'facebook', president: 'mark' }, 
    { type: 'b', organization: 'myspace', president: 'whoknows' }, 
    { type: 'c', organization: 'walmart', president: 'wall' } 
    # multiply length by ~1000 
] 

# Still gigantic, but more efficient. 
# Stores each type as symbol. 
# Each type is an array of hashes. 
more_efficient_hash = {
    type: {
        a: [
            { organization: 'acme inc', president: 'bugs bunny' },
            { organization: 'looney toons', president: 'donald' }
        ],

        b: [
            { organization: 'facebook', president: 'mark' },
            { organization: 'myspace', president: 'whoknows' }
        ],

        c: [
            { organization: 'walmart', president: 'wall' }
            # etc....
        ]
    }
}

hash_to_add = { type: 'c', organization: 'target', president: 'sharp' } 

# Adds hash to array of types inside the gigantic more_efficient_hash. 
# Is there a better way? 
more_efficient_hash[:type][hash_to_add[:type].to_sym].push(hash_to_add.delete(:type)) 

How is the second hash any more efficient? –


@TheCha͢mp Is it more normalized? – binarymason


I don't know what you are asking –

Answers


I agree with undur_gongor that some small data classes would be beneficial, and that the top-level :type key in your result does not add any value.
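As a rough illustration of the "small data classes" idea, here is a minimal sketch using Struct. The name Organization and its two fields are assumptions chosen to match the sample data in the question; nothing in the question prescribes them.

```ruby
# Hypothetical data class: the name and fields are assumptions based on the sample data.
Organization = Struct.new(:name, :president)

# Same grouping idea as before, but each entry is a small object instead of a hash.
records_by_type = {
  a: [
    Organization.new('acme inc', 'bugs bunny'),
    Organization.new('looney toons', 'donald')
  ]
}

# Fields are read through accessor methods rather than hash keys.
first_president = records_by_type[:a].first.president
```

A Struct fixes the set of fields in one place, so the field names are not repeated as keys in every record.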

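For the initial conversion from gigantic_array, you can do this easily with group_by. Note that Hash#delete returns the value of the deleted key, not the hash itself, so I am not sure the last line of your code works the way you want it to.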

> more_efficient_hash = gigantic_array.group_by {|item| item.delete(:type).to_sym} 
{ 
    a: [ 
    {:organization=>"acme inc", :president=>"bugs bunny"}, 
    {:organization=>"looney toons", :president=>"donald"} 
    ], 
    b: [ 
    {:organization=>"facebook", :president=>"mark"}, 
    {:organization=>"myspace", :president=>"whoknows"} 
    ], 
    c: [ 
    {:organization=>"walmart", :president=>"wall"} 
    ] 
} 
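To make the point about Hash#delete concrete, the sketch below shows what it returns and what a push of that return value would actually append. The variable names sample, removed, and bucket are made up for illustration.

```ruby
sample = { type: 'c', organization: 'target', president: 'sharp' }
bucket = []

# Hash#delete returns the value stored under the key (nil if the key is absent)
# and removes the key from the receiver.
removed = sample.delete(:type)

# Pushing the return value appends the string 'c', not the hash.
bucket.push(removed)
pushed_by_mistake = bucket.dup

# Pushing the hash itself, after the delete, gives the intended entry.
bucket.clear
bucket << sample
```

After this runs, pushed_by_mistake holds only the string 'c', while bucket holds the hash without its :type key.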

From this point, your last line cleans up nicely. Since delete is destructive, we can shorten it a bit.

> more_efficient_hash[hash_to_add.delete(:type).to_sym] << hash_to_add 
# ... 
    c: [ 
    {:organization=>"walmart", :president=>"wall"}, 
    {:organization=>"target", :president=>"sharp"} 
    ]
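One caveat with this append: on a plain hash, a type that has no array yet looks up as nil, and calling << on nil raises NoMethodError. Below is a minimal sketch of one way to handle that, assuming you are free to build the store with a default block. The helper name add_record and the variable records_by_type are hypothetical. The helper also works on a copy, so the caller's hash keeps its :type key.

```ruby
# Hash with a default block: a missing type gets a fresh array on first access.
records_by_type = Hash.new { |hash, key| hash[key] = [] }

# Hypothetical helper: files a record under its type without mutating the input.
def add_record(store, record)
  attributes = record.dup                  # shallow copy; the caller's hash is untouched
  type = attributes.delete(:type).to_sym   # delete returns the type value, e.g. 'c'
  store[type] << attributes
  store
end

hash_to_add = { type: 'c', organization: 'target', president: 'sharp' }

add_record(records_by_type, { type: 'c', organization: 'walmart', president: 'wall' })
add_record(records_by_type, hash_to_add)
add_record(records_by_type, { type: 'b', organization: 'facebook', president: 'mark' })
```

Whether to copy or mutate is a design choice: the copy costs one extra hash per record, which may or may not matter at the data sizes described in the question.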