2012-09-28 40 views
0

How to split strings from an array and repeat fields?

I have an input file with the following syntax:

"ID","Company Name","AccountManager","Product","Support Type","Country" 

Example:

"1","Company one","Surname Name/Phone/ Cell Phone ","Product► (d2XXXXXX) ► Version","29.10.2012 ► Type of support","Singapore" 

"2","Company two","Surname Name/Phone/ Cell Phone ","Product► (d2XXXXXX) ► Version\nProduct► (d2XXXXXX) ► Version\nProduct► (d2XXXXXX) ► Version","31.10.2012 ► Type of support\n28.10.2012 ► Type of support\nn/a ► Type of support","Indonesia" 

"3","Company three","Surname Name/Phone/ Cell Phone ","Product► (d2XXXXXX) ► Version\nProduct► (d2XXXXXX) ► Version\nProduct► (d2XXXXXX) ► Version\nProduct► (d2XXXXXX) ► Version\nProduct► (d2XXXXXX) ► Version\nProduct► (d2XXXXXX) ► Version","31.12.2012 ► Type of support\nType of support\nn\\a ► Type of support\n31.03.2013 ► Type of support\n25.10.2012 ► Type of support\nn\\a ► Type of support","USA" 

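The first company has only one product. The second company has 3 products, separated by \n (in both the Product and Support Type columns). The third company has 6 products.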

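In the output, these strings must be split into separate rows, with the values of the following columns repeated on each row: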

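"ID","Company Name","AccountManager","Country". However, "AccountManager" should contain only the surname and name, and the Support Type column should be compared with today's date: a row must appear in the output file only if the date in Support Type differs from today's date by 27 to 32 days. Entries with n/a in Support Type should be skipped.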

The output should look like this:

"1","Company one","Surname Name","Product► (d2XXXXXX) ► Version","29.10.2012","Singapore" 
"2","Company two","Surname Name","Product► (d2XXXXXX) ► Version","28.10.2012","Indonesia" 
"2","Company two","Surname Name","Product► (d2XXXXXX) ► Version","31.10.2012","Indonesia" 
"3","Company three","Surname Name","Product► (d2XXXXXX) ► Version","25.10.2012","USA" 

How can I do this in bash?


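Welcome to Stack Overflow! We encourage you to [research your questions](http://stackoverflow.com/questions/how-to-ask). If you've [tried something already](http://whathaveyoutried.com/), please add it to the question; if not, research and attempt your question first, and then come back. – 2012-09-28 06:22:51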

Answers

2

You can do it with the following AWK script, named "products.awk":

#!/usr/bin/awk -f 

BEGIN { 
    FS = ","; 
    # current time as a Unix timestamp (via the date command) 
    cmd = "date +%s"; 
    cmd | getline curr_timestamp; 
    close(cmd); 
} 

{ 
    # keep only "Surname Name" from the AccountManager field 
    split($3, account, "/"); 
    gsub(/ +$/, "", account[1]); 
    # products and support entries are separated by a literal \n 
    n = split($4, products, "\\\\n"); 
    split($5, supports, "\\\\n"); 
    for (i = 1; i <= n; i++) { 
        gsub("\"", "", products[i]); 
        gsub("\"", "", supports[i]); 
        split(supports[i], timesupport, " "); 
        # ignore not available and support without date 
        if (supports[i] !~ /n(\/|\\+)a/ && $2 !~ /NULL/ && timesupport[1] ~ /^[0-9][0-9]\.[0-9][0-9]\.[0-9][0-9][0-9][0-9]$/) { 
            # formatting date: DD.MM.YYYY -> YYYY/MM/DD 
            split(timesupport[1], date, "[.]"); 
            mydate = "date -d \"" date[3] "/" date[2] "/" date[1] "\" +%s"; 
            # date to timestamp (using GNU date); close the pipe so it can be reused 
            mydate | getline timestamp; 
            close(mydate); 
            # timestamp is >= 27 days and <= 32 days 
            if ((timestamp - curr_timestamp) >= 2332800 && (timestamp - curr_timestamp) <= 2764800) 
                print $1 "," $2 "," account[1] "\",\"" products[i] "\",\"" supports[i] "\"," $6; 
        } 
    } 
} 

Assuming your data is in a file named data.txt, you can call this script from bash with this line:

awk -f products.awk data.txt 

Running the script against your sample file, I get this output:

"1","Company one","Surname Name","Product► (d2XXXXXX) ► Version","29.10.2012 ► Type of support","Singapore" 
"2","Company two","Surname Name","Product► (d2XXXXXX) ► Version","31.10.2012 ► Type of support","Indonesia" 
"2","Company two","Surname Name","Product► (d2XXXXXX) ► Version","28.10.2012 ► Type of support","Indonesia" 
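
To see the row-expansion step of products.awk in isolation, here is a minimal sketch. The function name `expand_pairs` and the `|` output separator are made up for illustration and are not part of the script above. It assumes, as in the sample data, that entries are separated by a literal backslash followed by `n`, and that the two columns hold the same number of entries.

```shell
#!/bin/sh
# expand_pairs PRODUCTS SUPPORTS (hypothetical helper)
# Splits both arguments on a literal "\n" (backslash + n) and prints
# one "product|support" line per entry, pairing them by position.
expand_pairs() {
    # printf '%s\n' passes the backslashes through untouched
    printf '%s\n' "$1" "$2" | awk '
        NR == 1 { n = split($0, products, /\\n/) }
        NR == 2 { split($0, supports, /\\n/) }
        END {
            for (i = 1; i <= n; i++)
                print products[i] "|" supports[i]
        }'
}

expand_pairs 'Product A\nProduct B' '31.10.2012 x\n28.10.2012 y'
```

The values are fed through standard input rather than `awk -v`, because `-v` assignments interpret escape sequences and would turn the literal `\n` into a real newline before `split` ever sees it.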

Edit:

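I only get 3 lines because the last line of your expected output (25.10.2012) does not satisfy the >= 27 && <= 32 condition (today is September 29th, and your question was asked on September 28th).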
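
The arithmetic behind this can be checked by hand with a small sketch. The helper names `dmy_to_iso`, `days_between` and `in_window` are hypothetical, and GNU `date` (for the `-d` option) is assumed. Unlike the script, which compares against the current time of day, this version counts whole calendar days from a fixed reference date, so the result is reproducible.

```shell
#!/bin/sh
# dmy_to_iso DD.MM.YYYY -> YYYY-MM-DD (hypothetical helper)
dmy_to_iso() {
    printf '%s\n' "$1" | awk -F. '{ print $3 "-" $2 "-" $1 }'
}

# days_between FROM TO: whole days from FROM to TO, both YYYY-MM-DD.
# Assumes GNU date; -u avoids daylight-saving offsets.
days_between() {
    from=$(date -u -d "$1" +%s)
    to=$(date -u -d "$2" +%s)
    echo $(( (to - from) / 86400 ))
}

# in_window TODAY DD.MM.YYYY: succeeds if the date is 27 to 32 days after TODAY.
in_window() {
    days=$(days_between "$1" "$(dmy_to_iso "$2")")
    [ "$days" -ge 27 ] && [ "$days" -le 32 ]
}

days_between 2012-09-29 2012-10-25   # prints 26, just outside the window
days_between 2012-09-29 2012-10-29   # prints 30, inside the window
```

Passing the reference date as an argument, instead of reading the clock inside the function, makes the window check easy to test against the dates in the question.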

We finally got it!


Oh, thank you! That's great! But could you explain how to filter out the "n/a" values and limit "AccountManager" to just the surname and name? –


Just one more question: how do I filter by date? If the date in Support Type differs from today's date by 27 to 32 days, the row must be in the output file. –


I added the lines to filter out n/a values. I'll make an edit to handle the Surname part shortly – arutaku
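
The two filters discussed in these comments can also be tried separately from the main script. The sketch below is only an illustration: `manager_name` and `is_na` are hypothetical helper names, and it assumes that the name is everything before the first "/" and that an unavailable entry starts with `n/a` or `n\a` (with one or more backslashes, as in the sample data).

```shell
#!/bin/sh
# manager_name FIELD: print the part before the first "/", without
# surrounding spaces or a leading double quote (hypothetical helper).
manager_name() {
    printf '%s\n' "$1" | awk -F/ '{
        sub(/^[ "]+/, "", $1)   # drop leading quote and spaces
        sub(/ +$/, "", $1)      # drop trailing spaces
        print $1
    }'
}

# is_na ENTRY: succeed if the support entry starts with n/a or n\a.
is_na() {
    printf '%s\n' "$1" | awk '/^n(\/|\\+)a/ { found = 1 } END { exit !found }'
}

manager_name 'Surname Name/Phone/ Cell Phone '
is_na 'n/a ► Type of support' && echo "skipped"
```

Keeping each rule in its own function makes it easier to adjust one of them, for example if the phone separator changes, without touching the date logic.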