2015-11-29 23 views
1

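Perl array from CSV file creating a newline in an unexpected place

Hi, I have a couple of scripts that convert an xlsx file to a tab-delimited file, remove any commas and duplicates, and then split each row on commas. (I do this to make sure users have not put any commas in the data.) Then I do some processing and convert the result back to an xlsx file. This has been working fine. However, instead of opening and closing files all the time, I thought I would push the rows onto an array and convert that to xlsx at the end. Unfortunately, when I try to convert back to an xlsx file, it creates a line break at the spaces between names. If I output to a csv file, then open it and convert it to an xlsx file, it works fine.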

#!/usr/bin/perl
use strict;
use warnings;
use Spreadsheet::BasicRead;
use Excel::Writer::XLSX;
local $" = "'\n'";
open(STDERR, ">&STDOUT");
# convert to csv
my $xlsx_WSD = ("C:\\Temp\\testing_file.xlsx"),, 1;
my @csvtemp;

if (-e $xlsx_WSD) {
    my $ss = new Spreadsheet::BasicRead($xlsx_WSD) or die;
    my $col = '';
    my $row = 0;
    while (my $data = $ss->getNextRow()) {
        $row++;
        $col = join("\t", @$data);
        push @csvtemp, $col . "\n" if ($col ne "");
    }
}
else {
    print " C:\\Temp\\testing_file.xlsx file EXISTS ...!!\n";
    print " please investigate and use the restore option if required !..\n";
    exit;
}

my @arraynew;
my %seen;
our $Header_row = shift (@csvtemp);
foreach (@csvtemp) {
    chomp;
    $_ =~ s/,//g;
    $_ =~ s/\t/,/g;

    # print $_ . "\n" if !$seen{$_}++ ;
    push @arraynew, $_ . "\n" if !$seen{$_}++ ; # remove any dupes
}

# convert back to xlsx
my $workbook = Excel::Writer::XLSX->new("C:\\Temp\\testing_filet.xlsx");
my $worksheet = $workbook->add_worksheet();

my ($x, $y) = (0, 0);
while (<@arraynew>) {
    my @list = split /,/;
    foreach my $c (@list) {
        $worksheet->write($x, $y++, $c);
    }
    $x++;
    $y = 0;
}



__DATA__ 

Animal keeper M/F Years START DATE FRH FSM 
GIRAFFE JAMES LE M 5 10/12/2007  Y 
HIPPO JACKIE LEAN F 6 11/12/2007  Y 
ZEBRA JAMES LEHERN M 7 12/12/2007  Y 
GIRAFFE AMIE CAHORT M 5 13/12/2012  Y 
GIRAFFE MICKY JAMES M 5 14/06/2007  Y 
MEERKAT JOHN JONES M 9 15/12/2007 v v 
LEOPPARD JIM LEE M 8 16/12/2002  


unexpected result 

GIRAFFE JAMES    
LE M 5 10/12/2007  Y 
" 
HIPPO" JACKIE    
LEAN F 6 11/12/2007  Y 
" 
ZEBRA" JAMES    
LEHERN M 7 12/12/2007  Y 
" 
GIRAFFE" AMIE     
CAHORT M 5 13/12/2012  Y 
" 
GIRAFFE" MICKY    
JAMES M 5 14/06/2007  Y 
" 
MEERKAT" JOHN     
JONES M 9 15/12/2007 v v 
" 
LEOPPARD" JIM    
LEE M 8 16/12/2002 
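
One thing that may be worth checking in the script above is the `while (<@arraynew>)` loop. According to perlop, angle brackets around anything other than a filehandle or a simple scalar are treated as a call to `glob`, and `glob` splits its argument on whitespace. That would break rows at the spaces inside names, which looks similar to the output shown here. The sketch below is only an illustration: the two sample rows are hypothetical, and it uses the default list separator rather than the `$"` value set in the script.

```perl
use strict;
use warnings;

# Hypothetical sample rows, shaped like the comma-joined lines in @arraynew.
my @rows = ("GIRAFFE,JAMES LE,M\n", "HIPPO,JACKIE LEAN,F\n");

# <@rows> is parsed as glob("@rows"): the array is interpolated into a
# single string, and glob then splits that string on whitespace.
my @via_glob = glob("@rows");

# Iterating over the array itself keeps every row in one piece.
my @via_foreach;
foreach my $line (@rows) {
    chomp(my $copy = $line);            # work on a copy, drop the trailing newline
    push @via_foreach, [ split /,/, $copy ];
}

printf "glob gave %d items, foreach gave %d rows\n",
    scalar @via_glob, scalar @via_foreach;
```

If this is indeed the cause, replacing the `while` loop with `foreach (@arraynew)` plus a `chomp` might be enough, though that has not been verified against the original spreadsheet.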

Answers

1

Since you are running this on Windows, have you considered using Win32::OLE instead?

use strict;
use warnings;

use Win32::OLE;

# Attach to a running Excel instance, or start one that is told to Quit
# when the object is destroyed.
my $app = Win32::OLE->GetActiveObject('Excel.Application')
    || Win32::OLE->new('Excel.Application', 'Quit');

my $wb = $app->Workbooks->Open("C:/Temp/testing_file.xlsx");

my $ws = $wb->ActiveSheet;

my $max_row = $ws->UsedRange->Rows->Count;
my $max_col = $ws->UsedRange->Columns->Count;

my ($row, %already) = (1);
while ($row <= $max_row) {

    my ($col, @output) = (1);

    while ($col <= $max_col) {
        my $val = $ws->Cells($row, $col)->{Text};

        # Strip commas, turn tabs into commas, and write the cleaned value back.
        if ($val =~ /[,\t]/) {
            $val =~ tr/,//d;
            $val =~ tr/\t/,/;
            $ws->Cells($row, $col)->{Value} = $val;
        }
        $output[$col - 1] = $val;
        $col++;
    }

    # Delete rows that were already seen. The next row shifts up into this
    # position, so $row is only advanced when nothing was deleted.
    if ($already{join "|", @output}++) {
        $ws->Rows($row)->EntireRow->Delete;
        $max_row--;
    } else {
        $row++;
    }
}

$wb->SaveAs("C:\\temp\\testing_filet.xlsx");
0

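This is a problem with line-ending characters.

There are three conventions for marking the end of a line: \n on Unix, \r\n on Windows, and \r on classic Mac OS. It looks as if your script assumes the Mac convention, while the input and output use the Windows convention.

So after reading the input, every line except the first will have a leading \n. As long as that is also true of the output lines before they are written out with \r, you end up with an output file with perfectly good \r\n-delimited lines. Obviously, it would be better to make the script aware of the line-ending convention used by the input, and to make sure it uses the same one for splitting lines and assembling the output.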
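
As a rough illustration of that last suggestion, the sketch below detects which line-ending convention a block of text uses and then reuses it for both splitting and joining. The helper names `detect_eol` and `dedupe_lines` are made up for this example and are not part of the original script, and the approach assumes the whole input uses a single convention.

```perl
use strict;
use warnings;

# Hypothetical helper: guess the line-ending convention of a block of text.
# \r\n is checked first so Windows input is not mistaken for bare \r.
sub detect_eol {
    my ($text) = @_;
    return "\r\n" if $text =~ /\r\n/;
    return "\r"   if $text =~ /\r/;
    return "\n";
}

# Hypothetical helper: drop duplicate lines, splitting and re-joining with
# the same line ending that the input used.
sub dedupe_lines {
    my ($text) = @_;
    my $eol = detect_eol($text);
    my %seen;
    my @lines = grep { !$seen{$_}++ } split /\Q$eol\E/, $text;
    return join($eol, @lines) . $eol;   # always ends with one line ending
}

# Usage with made-up Windows-style input containing one duplicate line.
my $cleaned = dedupe_lines("ZEBRA\r\nHIPPO\r\nZEBRA\r\n");
my $kept    = () = $cleaned =~ /\r\n/g;
print "lines kept: $kept\n";
```

Detecting the convention once and passing it around keeps the splitting and joining steps from drifting apart, which is the mismatch this answer describes.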