2012-09-12 39 views

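Ruby parallel CSV import

I am importing a huge CSV file and want to split it so that the import runs faster (I am not importing straight into the database; I run some calculations on each row first). The code looks like this: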

def import_shatem
  require 'csv'

  # Look up the EUR rate once instead of once per row.
  eur_cur = Currency.find_by_currency_name("EUR")

  CSV.foreach("/#{Rails.public_path}/uploads/hshatem2.csv", :encoding => 'ISO-8859-15:UTF-8', :col_sep => ';', :row_sep => :auto, :headers => :first_row) do |row|
    # Column 0 holds "<article number>_<supplier brand>".
    ename, esupp = row[0].to_s.split(/_/)
    next unless ename.present? && ename.size > 3 && esupp.present?

    eprice = row[6].to_f / eur_cur.currency_value
    # First run of digits in the quantity column; 0.0 when there is none
    # (the original raised NoMethodError on such rows).
    eqnt = row[1].to_s[/\d+/].to_f

    supplier = Supplier.where("SUP_BRAND like ?", "%#{esupp}%").first
    next unless supplier.present?

    # Only letters and digits reach the full-text query.
    search_term = ename.upcase.gsub(/[^0-9A-Za-z]/, '')
    search = ArtLookup.find(:all, :conditions => ['MATCH (ARL_SEARCH_NUMBER) AGAINST(? IN BOOLEAN MODE) and ARL_KIND = 1', search_term])
    articles = Article.find(:all, :conditions => { :ART_ID => search.map(&:ARL_ART_ID) })

    # A local variable, so a match left over from an earlier row is never reused.
    art = articles.find { |item| item['ART_SUP_ID'] == supplier.SUP_ID }
    next unless art.present?

    # Update only when the new price is lower or no price is stored yet.
    if art.PRICEM.blank? || art.PRICEM.to_f >= eprice
      art.PRICEM = eprice
      art.QUANTITYM = eqnt
      art.datetime_of_update = DateTime.now
      art.save
    end
  end
end

How can I parallelize this and get a faster import?


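When I had something similar (millions of rows), I simply split the CSV into multiple files (with the Unix `split` command) and started several importers in parallel...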
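The approach in this comment can be sketched in plain Ruby. This is only a minimal illustration, not the commenter's actual setup: `split_csv` and `import_in_parallel` are hypothetical helper names, the split is done in Ruby rather than with the Unix `split` command, and it assumes that no quoted field contains a line break and that `fork` is available (Unix-like systems only).

```ruby
# Hypothetical helper: distributes the data lines of +path+ round-robin over
# +parts+ chunk files in +dir+, repeating the header line in each chunk.
# Assumes one CSV record per physical line (no line breaks inside quoted fields).
def split_csv(path, parts, dir)
  outputs = Array.new(parts) { |i| File.open(File.join(dir, "chunk_#{i}.csv"), 'w') }
  File.foreach(path).with_index do |line, i|
    if i.zero?
      outputs.each { |out| out.write(line) }   # copy the header into every chunk
    else
      outputs[(i - 1) % parts].write(line)     # spread data lines evenly
    end
  end
  outputs.map(&:path)
ensure
  outputs.each(&:close) if outputs
end

# Hypothetical helper: runs the given block once per chunk, each in its own
# child process, and returns one true/false per chunk (true = child exited
# with status 0).
def import_in_parallel(chunks)
  pids = chunks.map { |chunk| fork { yield chunk } }
  pids.map { |pid| Process.wait2(pid).last.success? }
end
```

A call might look like `import_in_parallel(split_csv(path, 4, dir)) { |chunk| import_chunk(chunk) }`, where `import_chunk` is a hypothetical variant of `import_shatem` that takes the file path as a parameter. In a Rails application each child process would most likely need to open its own database connection before touching any model.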


Your comment should be an answer to this question. When I ran into the same problem, I did exactly the same thing. – DNNX


Possible duplicate of [Speed up csv import](http://stackoverflow.com/questions/12166389/speed-up-csv-import)

Answers