2016-02-24

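Splitting rows of a MySQL query with Ruby and writing to a CSV file

I am automating MySQL queries against a remote database with Ruby, and I want to split rows depending on the value of the month query found below.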

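This is for a weekly report (Wednesday through the following Tuesday) generated for June 2014 for all clients, based on their start date. Nothing else in the report changes, but the number of times a row is repeated depends on that start date (illustrated in the case statement below).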

Please note the gems used here: mysql2, watir, and csv.

Simplified code:

#!/usr/local/bin/ruby 
require "mysql2" 
require "watir" 
require "csv" 

puts "Initializing Report" 

Mysql2::Client.default_query_options.merge!(:as => :array) 

mysql = Mysql2::Client.new(:host => "1.2.3.4", :username => "user", :pass => "password", :database => "db") 

puts "Successfully accessed db" 

month = mysql.query("SELECT DATE_FORMAT(db.table.start, '%m') FROM db.table WHERE db.start.group = 1;") 

day = mysql.query("SELECT DATE_FORMAT(db.table.start, '%d') FROM db.table WHERE db.start.group = 1;") 

report = mysql.query("SELECT db.table.client, SELECT DATE_FORMAT(db.table.start, '%m/%d/%Y'), SELECT DATE_FORMAT(db.table.end, '%m/%d/%Y') FROM db.table WHERE db.start.group = 1;") 

case month 
when 5 
    # code splitting one row into four 
when 6 
    if day <= 4 
    # code splitting one row into four using weekOf 
    elsif day >= 11 and day <= 17 
    # code splitting one row into three using weekOf 
    elsif day >= 18 and day <= 24 
    # code splitting one row into two using weekOf 
    else 
    # no splitting; only one row using weekOf 
    end 
end 

CSV.open("Report.csv", "wb") do |csv| 
    csv << ["Week of", "Client", "Start Date", "End Date"] 
    weekOf.zip(report).each {|row| csv << row.flatten} 
end 

puts "Results can be found in Report.csv" 

Current output (if I comment out the case statement, remove "Week of" from the CSV header, and write only the report query to the CSV):

Client, Start Date, End Date 
companyrecordlabel, 05/20/2014, 07/09/2015 
beeUrself, 05/27/2014, 02/01/2016 
overflowStack, 06/04/2014, 12/11/2015 
chapoChaps, 06/11/2014, 01/16/2016 
Meds4U, 06/18/2014, NULL 
    . 
    . 
    . 

I would like the following output:

Week of, Client, Start Date, End Date 
06/04/2014, companyrecordlabel, 05/20/2014, 07/09/2015 
06/11/2014, companyrecordlabel, 05/20/2014, 07/09/2015 
06/18/2014, companyrecordlabel, 05/20/2014, 07/09/2015 
06/25/2014, companyrecordlabel, 05/20/2014, 07/09/2015 
06/04/2014, beeUrself, 05/27/2014, 02/01/2016 
06/11/2014, beeUrself, 05/27/2014, 02/01/2016 
06/18/2014, beeUrself, 05/27/2014, 02/01/2016 
06/25/2014, beeUrself, 05/27/2014, 02/01/2016 
06/04/2014, overflowStack, 06/04/2014, 12/11/2015 
06/11/2014, overflowStack, 06/04/2014, 12/11/2015 
06/18/2014, overflowStack, 06/04/2014, 12/11/2015 
06/25/2014, overflowStack, 06/04/2014, 12/11/2015 
06/11/2014, chapoChaps, 06/11/2014, 01/16/2016 
06/18/2014, chapoChaps, 06/11/2014, 01/16/2016 
06/25/2014, chapoChaps, 06/11/2014, 01/16/2016 
06/18/2014, Meds4U, 06/18/2014, NULL 
06/25/2014, Meds4U, 06/18/2014, NULL 
    . 
    . 
    . 

For clarity: the "Client" companyrecordlabel has four rows because its "Start Date" is in May, while the "Client" Meds4U is split into only two rows because its "Start Date" is June 18.

Answer

I built the FULL code answer below based on a few assumptions:

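  • There are no rows where DATE_FORMAT(db.table.end, '%m') = 6
  • You want the listing to show all companies in the order in which they are found (i.e. db.table.id)
  • Query time is not a huge issue for you
  • You intended to include an array named weekOf but forgot to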

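You also seem to have the word SELECT too many times in your query. Even for queries as small as the samples you provided, you may want to break them up rather than putting everything on one line: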

month = mysql.query("SELECT DATE_FORMAT(db.table.start, '%m') 
    FROM db.table 
    WHERE db.start.group = 1;") 

Instead of:

month = mysql.query("SELECT DATE_FORMAT(db.table.start, '%m') FROM db.table WHERE db.start.group = 1;")
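One way to keep a multi-line query readable is a heredoc. The following is only a sketch, assuming Ruby 2.3 or later for the squiggly heredoc; the variable name month_sql is my own and is not part of your script. The resulting string could then be passed to mysql.query.

```ruby
# Hypothetical sketch: hold the month query in a squiggly heredoc (Ruby 2.3+).
# <<~ strips the common leading indentation, so the SQL can be indented freely.
month_sql = <<~SQL
  SELECT DATE_FORMAT(db.table.start, '%m')
  FROM db.table
  WHERE db.start.group = 1;
SQL

# Print the query text; in the real script this string would go to mysql.query
puts month_sql
```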

And now for the code itself:

#!/usr/local/bin/ruby 
require "mysql2" 
require "watir" 
require "csv" 

puts "Initializing Report" 

Mysql2::Client.default_query_options.merge!(:as => :array) 

mysql = Mysql2::Client.new(:host => "1.2.3.4", :username => "user", :pass => "password", :database => "db") 

puts "Successfully accessed db" 

date = mysql.query("SELECT DATE_FORMAT(db.table.start, '%m'), 
    DATE_FORMAT(db.table.start, '%d') 
    FROM db.table 
    WHERE db.start.group = 1;") 

report = mysql.query("SELECT c, s, e FROM (SELECT * FROM (SELECT db.table.id, 
    db.table.client AS c, 
    DATE_FORMAT(db.table.start, '%m/%d/%Y') AS s, 
    DATE_FORMAT(db.table.end, '%m/%d/%Y') AS e 
    FROM db.table 
    WHERE db.start.group = 1 
    UNION ALL 
    SELECT db.table.id, 
    db.table.client AS c, 
    DATE_FORMAT(db.table.start, '%m/%d/%Y') AS s, 
    DATE_FORMAT(db.table.end, '%m/%d/%Y') AS e 
    FROM db.table 
    WHERE db.start.group = 1 
    HAVING ((DATE_FORMAT(db.table.start, '%m') = 5) OR (DATE_FORMAT(db.table.start, '%d') <= 4)) 
    UNION ALL 
    SELECT db.table.id, 
    db.table.client AS c, 
    DATE_FORMAT(db.table.start, '%m/%d/%Y') AS s, 
    DATE_FORMAT(db.table.end, '%m/%d/%Y') AS e 
    FROM db.table 
    WHERE db.start.group = 1 
    HAVING ((DATE_FORMAT(db.table.start, '%m') = 5) OR (DATE_FORMAT(db.table.start, '%d') <= 11)) 
    UNION ALL 
    SELECT db.table.id, 
    db.table.client AS c, 
    DATE_FORMAT(db.table.start, '%m/%d/%Y') AS s, 
    DATE_FORMAT(db.table.end, '%m/%d/%Y') AS e 
    FROM db.table 
    WHERE db.start.group = 1 
    HAVING ((DATE_FORMAT(db.table.start, '%m') = 5) OR (DATE_FORMAT(db.table.start, '%d') <= 18))) AS alias 
    ORDER BY db.table.id) AS alias2;") 

weekOf = [] 

date.each do |mon, day| 
    # DATE_FORMAT returns strings such as "05", so convert before comparing 
    if mon.to_i == 5 
        # Started in May: present in every week of June 
        weekOf << "06/04/2014" 
        weekOf << "06/11/2014" 
        weekOf << "06/18/2014" 
        weekOf << "06/25/2014" 
    elsif mon.to_i == 6 
        if (day.to_i <= 4) 
            weekOf << "06/04/2014" 
            weekOf << "06/11/2014" 
            weekOf << "06/18/2014" 
            weekOf << "06/25/2014" 
        elsif ((day.to_i >= 11) && (day.to_i <= 17)) 
            weekOf << "06/11/2014" 
            weekOf << "06/18/2014" 
            weekOf << "06/25/2014" 
        elsif ((day.to_i >= 18) && (day.to_i <= 24)) 
            weekOf << "06/18/2014" 
            weekOf << "06/25/2014" 
        else 
            weekOf << "06/25/2014" 
        end 
    else 
        puts "Error: #{mon} is before May" 
    end 
end 

CSV.open("Report.csv", "wb") do |csv| 
    csv << ["Week of", "Client", "Start Date", "End Date"] 
    weekOf.zip(report).each {|row| csv << row.flatten} 
end 

puts "Results can be found in Report.csv" 

Explanation:

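I assumed that query time is not a big issue for you, seeing as your sample queries are small and contain no JOINs. If you find your query growing beyond ten or so INNER JOINs (with, say, thousands of entries per table), this may no longer be your best solution.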

This solution has two parts.

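The first is to use UNION ALL to duplicate the rows in the database itself. This means repeating the whole query and adding the criteria below to specify when this duplication occurs.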

This is where the HAVING clause comes in. When using UNION ALL, HAVING must be used in this way instead of WHERE, because the latter would result in a MySQL error.
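To make the effect of UNION ALL concrete, here is a rough Ruby analogy rather than anything the database actually runs. The rows array and the may_rows filter are made-up stand-ins for the query results. Array#+ keeps repeated rows the way UNION ALL does, while Array#| drops them the way a plain UNION would.

```ruby
# Hypothetical rows standing in for the query result: [client, start date]
rows = [
  ["companyrecordlabel", "05/20/2014"],
  ["Meds4U", "06/18/2014"],
]

# Rows that qualify for an extra copy (here: start date in May)
may_rows = rows.select { |_client, start| start.start_with?("05/") }

union_all = rows + may_rows  # like UNION ALL: duplicates are kept
union     = rows | may_rows  # like UNION: duplicates are removed

puts union_all.length  # 3
puts union.length      # 2
```

This is why the query needs UNION ALL: a plain UNION would collapse the repeated rows back into one.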

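Also note that every MySQL table created as the result of a subquery must have an alias: alias and alias2. In order to ORDER BY db.table.id (which settles one of my assumptions), I used not one but two nested queries, and then selected only the columns we need for the next part.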

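Finally, instead of keeping two separate queries, month and day, I combined them into a single one, date: it yields a two-dimensional array when iterated.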
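As a small illustration of what iterating that result looks like, here is a sketch with hard-coded rows in place of the real query result. It assumes the :as => :array option, so each row is an array, and that DATE_FORMAT hands back strings; the names rows and parsed are mine.

```ruby
# Stand-in for the rows returned by the combined month/day query
rows = [["05", "20"], ["06", "18"]]

# Block parameters destructure each inner array into month and day;
# to_i turns the zero-padded strings into integers for comparison
parsed = rows.map { |mon, day| [mon.to_i, day.to_i] }

parsed.each { |mon, day| puts "month=#{mon} day=#{day}" }
```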

The second: I created the weekOf array, which you probably intended to include but forgot.

I then iterated over date in order to push the right "06/#{day}/2014" strings into the weekOf array.
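If hard-coding the date strings gets unwieldy, the weeks could also be computed with Ruby's Date class. The sketch below is an alternative under my own assumptions, not what the script above does: the helpers report_weeks and weeks_for are hypothetical, the sample rows are hard-coded instead of coming from MySQL, and a client is assumed to count from the week (Wednesday through Tuesday) that contains its start date.

```ruby
require "date"
require "csv"

# Wednesdays that open each reporting week of June 2014
def report_weeks
  [4, 11, 18, 25].map { |day| Date.new(2014, 6, day) }
end

# Weeks whose Wednesday-to-Tuesday span ends on or after the start date
def weeks_for(start_date)
  report_weeks.select { |wednesday| wednesday + 6 >= start_date }
end

# Hypothetical sample rows: [client, start date, end date]; nil stands for NULL
rows = [
  ["companyrecordlabel", "05/20/2014", "07/09/2015"],
  ["Meds4U", "06/18/2014", nil],
]

csv_text = CSV.generate do |csv|
  csv << ["Week of", "Client", "Start Date", "End Date"]
  rows.each do |client, start_s, end_s|
    start_date = Date.strptime(start_s, "%m/%d/%Y")
    # Emit one CSV row per reporting week the client belongs to
    weeks_for(start_date).each do |week|
      csv << [week.strftime("%m/%d/%Y"), client, start_s, end_s]
    end
  end
end

puts csv_text
```

Writing each week next to its own row means the output does not depend on two separately queried arrays lining up by index, which is what weekOf.zip(report) relies on.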

That's it! I hope this helps.