2013-08-28 63 views
0

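Pad CSV values with 0

I have this CSV file that I want to sort on the 20th and 21st fields. The data in those fields looks like P1 and PK5, for example. My problem is that when I sort on those fields, the rows do not come out in the order I want. It seems I have to pad each of those fields to the length of the longest value found in that field.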

OrderNum,MerrillRecipientID,CustomerClass,MerrillItemNum,PODTemplateID,GridCode,AetnaDocID,MemberID,FirstName,MI,LastName,Address1,Address2,Address3,City,State,Zip,Country,OEL,PalletNum,PckgNum,IMBCode,ProcDate 
"M394993","M39499300010000001","0GH","3GH000503","PDP","BO","1011250","MEBB04CB","Name","","Name","address","","","City","SC","29170-2043","","*******AUTO**SCH 5-DIGIT 29033","P1","PK5","2031100094470495539729170204309","3GH000503","August 26, 2013" 
"M394993","M39499300010000002","0GH","3GH000503","PDP","BO","1011572","MEBB07GB","Name","G","Name","address","","","City","SC","29020-2912","","*********AUTO**SCH 3-DIGIT 290","P1","PK1","3031100094470495580529020291210","3GH000503","August 26, 2013" 
"M394993","M39499300010000003","0GH","3GH000503","PDP","BO","1011693","MEBB08MP","Name","B","Name","address","","","City","SC","29061-9447","","*********AUTO**SCH 3-DIGIT 290","P1","PK2","3031100094470495583729061944757","3GH000503","August 26, 2013" 
"M394993","M39499300010000004","0GH","3GH000503","PDP","BO","1011751","MEBB097M","Name","A","Name","address","","","City","SC","29645-0433","","*************AUTO**3-DIGIT 296","P1","PK31","3031100094470495629629645043333","3GH000503","August 26, 2013" 
"M394993","M39499300010000005","0GH","3GH000503","PDP","BO","1012075","MEBB0K4L","Name","E","Name","address","","","City","SC","29682-9634","","*************AUTO**3-DIGIT 296","P1","PK33","3031100094470495637929682963428","3GH000503","August 26, 2013" 
"M394993","M39499300010000006","0GH","3GH000503","PDP","BO","1012437","MEBB0TWQ","Name","R","Name","address","","","City","SC","29505-3030","","*******AUTO**SCH 5-DIGIT 29501","P1","PK24","2031100094470495556429505303050","3GH000503","August 26, 2013" 
"M394993","M39499300010000007","0GH","3GH000503","PDP","BO","1012750","MEBB0YJY","Name","L","Name","address","","","City","SC","29642-3006","","***********AUTO**5-DIGIT 29642","P1","PK38","2031100094470495567529642300601","3GH000503","August 26, 2013" 

So, from the data above, I need the file to look like this:

OrderNum,MerrillRecipientID,CustomerClass,MerrillItemNum,PODTemplateID,GridCode,AetnaDocID,MemberID,FirstName,MI,LastName,Address1,Address2,Address3,City,State,Zip,Country,OEL,PalletNum,PckgNum,IMBCode,ProcDate 
"M394993","M39499300010000001","0GH","3GH000503","PDP","BO","1011250","MEBB04CB","Name","","Name","address","","","City","SC","29170-2043","","*******AUTO**SCH 5-DIGIT 29033","P1","PK05","2031100094470495539729170204309","3GH000503","August 26, 2013" 
"M394993","M39499300010000002","0GH","3GH000503","PDP","BO","1011572","MEBB07GB","Name","G","Name","address","","","City","SC","29020-2912","","*********AUTO**SCH 3-DIGIT 290","P1","PK01","3031100094470495580529020291210","3GH000503","August 26, 2013" 
"M394993","M39499300010000003","0GH","3GH000503","PDP","BO","1011693","MEBB08MP","Name","B","Name","address","","","City","SC","29061-9447","","*********AUTO**SCH 3-DIGIT 290","P1","PK02","3031100094470495583729061944757","3GH000503","August 26, 2013" 
"M394993","M39499300010000004","0GH","3GH000503","PDP","BO","1011751","MEBB097M","Name","A","Name","address","","","City","SC","29645-0433","","*************AUTO**3-DIGIT 296","P1","PK31","3031100094470495629629645043333","3GH000503","August 26, 2013" 
"M394993","M39499300010000005","0GH","3GH000503","PDP","BO","1012075","MEBB0K4L","Name","E","Name","address","","","City","SC","29682-9634","","*************AUTO**3-DIGIT 296","P1","PK33","3031100094470495637929682963428","3GH000503","August 26, 2013" 
"M394993","M39499300010000006","0GH","3GH000503","PDP","BO","1012437","MEBB0TWQ","Name","R","Name","address","","","City","SC","29505-3030","","*******AUTO**SCH 5-DIGIT 29501","P1","PK24","2031100094470495556429505303050","3GH000503","August 26, 2013" 
"M394993","M39499300010000007","0GH","3GH000503","PDP","BO","1012750","MEBB0YJY","Name","L","Name","address","","","City","SC","29642-3006","","***********AUTO**5-DIGIT 29642","P1","PK38","2031100094470495567529642300601","3GH000503","August 26, 2013" 

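The P1 field could go as high as P100, in which case I would need to pad P1 to P001. Really, it just needs to be padded to whatever the maximum length is. I can sort the file on the two fields, but I don't know how to pad them.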
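To make the ordering problem concrete, here is a minimal Python sketch. The sample values come from the PckgNum column above; the helper name `pad_to_longest` is made up for illustration and assumes each value is a non-digit prefix followed by digits. A plain string sort compares character by character, so PK31 lands before PK5; once every value is zero-padded to the longest length, the same sort gives numeric order.

```python
import re

def pad_to_longest(values):
    """Zero-pad the trailing digits of each value to the longest value's length."""
    width = max(len(v) for v in values)
    padded = []
    for v in values:
        # Split into a non-digit prefix and a numeric tail, e.g. "PK" + "5".
        m = re.match(r'(\D*)(\d+)$', v)
        if m:
            prefix, digits = m.groups()
            v = prefix + digits.zfill(width - len(prefix))
        padded.append(v)  # values that do not match are left unchanged
    return padded

values = ["PK5", "PK1", "PK2", "PK31", "PK33", "PK24", "PK38"]
print(sorted(values))                  # string order: PK5 sorts last
print(sorted(pad_to_longest(values)))  # padded: PK01, PK02, PK05, PK24, ...
```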

Thanks in advance for your help.

+2

What environment are you in? Do you want to modify the CSV file itself? With Python, Perl, or something similar? More information needed! – simon

+0

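To answer your question, we need to know what programming language or tool you are using to access the .csv. Knowing the database type (Oracle, MSSQL, MySQL, etc.) would also help. Q: You are trying to read an existing CSV (not write or modify the .csv), correct? – paulsm4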

+0

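Sorry, I'm on a Linux system (SUSE). I do want to modify the CSV file with a shell script. I'm trying to modify the CSV so that these two fields are padded to the length of the longest value in each field. – GroveTuckey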

Answers

1

OK, since nothing else has been forthcoming, here is a quick Python (2.x or 3.x) script that will do what you need:

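import re
import sys
import csv

reader = csv.reader(sys.stdin)
writer = csv.writer(sys.stdout, quoting=csv.QUOTE_ALL)

rows = list(reader)
header, data = rows[0], rows[1:]
writer.writerow(header)

# Pad the 20th and 21st fields (0-based indexes 19 and 20) so that every
# value in a column is as long as the longest value in that column.
for col in (19, 20):
    max_len = max([len(row[col]) for row in data] or [0])
    for row in data:
        # Split into a non-digit prefix and a numeric tail, e.g. "PK" + "5".
        match = re.match(r'(\D*)(\d+)$', row[col])
        if match:
            prefix, digits = match.groups()
            row[col] = prefix + digits.zfill(max_len - len(prefix))

writer.writerows(data)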

If you save this as, say, pad.py, you can use it like this:

$ cat /path/to/my_csv_file.csv | python /path/to/pad.py > /path/to/my_new_csv_file.csv 

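This creates my_new_csv_file.csv in the format you need. Because the script reads from stdin and writes to stdout, you can use it in a number of different ways to suit your purposes.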
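Since the stated goal is to sort on the 20th and 21st fields, the padded output can then be sorted on those columns. The following is only a sketch under a few assumptions: Python 3, a header in the first row, and 0-based column indexes 19 and 20 by default. `sort_csv` is a hypothetical helper name, not part of the script above.

```python
import csv
import io

def sort_csv(text, cols=(19, 20)):
    """Return CSV text with the data rows sorted on the given 0-based columns."""
    rows = list(csv.reader(io.StringIO(text)))
    header, data = rows[0], rows[1:]
    # Plain string comparison is enough once the values are zero-padded.
    data.sort(key=lambda row: tuple(row[c] for c in cols))
    out = io.StringIO()
    writer = csv.writer(out, quoting=csv.QUOTE_ALL, lineterminator="\n")
    writer.writerow(header)
    writer.writerows(data)
    return out.getvalue()

# Tiny two-column demo (columns 0 and 1 stand in for 19 and 20).
sample = 'PalletNum,PckgNum\n"P1","PK31"\n"P1","PK05"\n"P1","PK01"\n'
print(sort_csv(sample, cols=(0, 1)))
```

Because the key is a tuple, rows are ordered by the first column and ties are broken by the second, which matches sorting on PalletNum and then PckgNum.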

Hope this helps.