2016-05-18

Batch: How to delete all empty columns from a CSV file

I have a CSV file like this:

P,PC,,PL,B,15feb16,P,Bay,RP,15-FEB-16,22-FEB-16,7,,,,,,11,14,138,14,16,993.42,-12,-84,-12,,,,,,,,,17,2,-10,0,0,1,1,16:05:53,15FEB16 
P,PC,,PL,I,1FEB-16,P,In,RP,15-FEB-16,22-FEB-16,7,,,,,,25,5,32,5,5,-29.7,-24,-168,-24,,,,,,,,,520,14,-10,0,0,1,1,10-MAY-201606:05:53,15-FEB-16 
P,PC,,PC,S,15FEB16,P,Su,RP,15-FEB-16,22-FEB-16,7,,,,,,6,5,32,56,5,4.65,0,0,0,,,,,,,,,546,0,0,0,0,1,1,10-MAY-201606:05:53,15-FEB-16 

The code I wrote is:

@echo off 
setlocal EnableDelayedExpansion 
for /F "delims=" %%a in (C:\Pca.csv) do (
    set line=%%a 
    set line=!line:,,=, ,! 
    set line=!line:,,=, ,! 
    for /F "tokens=1,2,3* delims=," %%i in (^"!line!^") do (
     echo %%i,%%l>>C:\P.csv 
    ) 
) 

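But it only removes the second and third columns, regardless of whether they are empty or contain data.

The sample output file should look like this: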

P,PC,PL,B,15feb16,P,Bay,RP,15-FEB-16,22-FEB-16,7,11,14,138,14,16,993.42,-12,-84,-12,17,2,-10,0,0,1,1,16:05:53,15FEB16 
P,PC,PL,I,1FEB-16,P,In,RP,15-FEB-16,22-FEB-16,7,25,5,32,5,5,-29.7,-24,-168,-24,520,14,-10,0,0,1,1,10-MAY-201606:05:53,15-FEB-16 
P,PC,PC,S,15FEB16,P,Su,RP,15-FEB-16,22-FEB-16,7,6,5,32,56,5,4.65,0,0,0,546,0,0,0,0,1,1,10-MAY-201606:05:53,15-FEB-16 
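What is wrong with your code? What do you expect? – CSchulz

This code only removes the second and third columns, regardless of whether they are empty or contain data @CSchulz –

How is your 'csv' delimited? With commas, as your code suggests, or with spaces or tabs, as your example shows? – Stephan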

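Answers

Here is a quite comprehensive and adaptive script for removing empty columns from CSV-formatted data.

Before the code is shown, let us take a look at the help message that appears when the script is called with /?: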

"del-empty-cols-from-csv.bat" 

This script removes any empty columns from CSV-formatted data. A column is con- 
sidered as empty if the related fields in all rows are empty, unless the switch 
/H is given, in which case the first line (so the header) is evaluated only. 
Notice that fields containing white-spaces only are not considered as empty. 


USAGE: 

    del-empty-cols-from-csv.bat [/?] [/H] csv_in [csv_out] 

    /?  displays this help message; 
    /H  specifies to regard the header only, that is the very first row, 
      to determine which columns are considered as empty; if NOT given, 
      the whole data, hence all rows, are taken into account instead; 
    csv_in CSV data file to process, that is, to remove empty columns of; 
      these data must be correctly formatted CSV data, using the comma as 
      separator and the quotation mark as text delimiter; regard that 
      literal quotation marks must be doubled; there are some additional 
      restrictions: the data must not contain any line-breaks; neither 
      must they contain any asterisks nor question marks; 
    csv_out CSV data file to write the return data to; this must not be equal 
      to csv_in; note that an already existing file will be overwritten 
      without prompt; if not given, the data is displayed on the console; 

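As you can see, there are two modes of operation: standard mode (no switch) and header mode (switch /H).

Given that the following CSV data is fed into the script...: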

A, ,C, ,E,F 
1, , ,4,5, 
1, , , ,5, 
1, ,3,4, , 

...the CSV data returned in standard mode looks like this...:

A,C, ,E,F 
1, ,4,5, 
1, , ,5, 
1,3,4, , 

...and the CSV data returned in header mode (/H) looks like this:

A,C,E,F 
1, ,5, 
1, ,5, 
1,3, , 

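Note that the spaces in the sample data above must not actually be present in the files; they have been inserted here only to illustrate the two operation modes more clearly. (As the help message states, fields containing only white-space are not considered empty.)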
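The difference between the two modes can also be sketched outside of batch. The following Python snippet is only an illustration under assumptions and not the script itself: the function name `drop_empty_columns` and its `header_only` parameter are hypothetical, rows are plain lists of strings, and a field counts as empty only if it is the empty string (white-space is not stripped, in line with the help message). The output is joined with plain commas, so it does not re-quote fields.

```python
import csv
import io


def drop_empty_columns(rows, header_only=False):
    """Return the rows without the columns judged empty.

    rows        -- list of rows, each a list of field strings
    header_only -- if True, only the first row decides (like /H);
                   otherwise a column is dropped only if it is empty
                   in every row
    """
    if not rows:
        return []
    width = max(len(row) for row in rows)
    # Pad short rows so that every row has the same number of fields.
    padded = [row + [""] * (width - len(row)) for row in rows]
    deciding = padded[:1] if header_only else padded
    keep = [i for i in range(width) if any(row[i] != "" for row in deciding)]
    return [[row[i] for i in keep] for row in padded]


# Sample data from above, with the illustrative spaces left out.
sample = "A,,C,,E,F\n1,,,4,5,\n1,,,,5,\n1,,3,4,,\n"
rows = list(csv.reader(io.StringIO(sample)))

standard = drop_empty_columns(rows)                  # standard mode
header = drop_empty_columns(rows, header_only=True)  # header mode

for row in standard:
    print(",".join(row))
print()
for row in header:
    print(",".join(row))
```

In standard mode only the second column disappears, because it is the only one that is empty in every row; in header mode the fourth column goes as well, because its header field is empty.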


Now, here is the complete code:

@echo off 
setlocal EnableExtensions DisableDelayedExpansion 

set "OPT_HEAD=%~1" 
if "%OPT_HEAD%"=="/?" (
    goto :MSG_HELP 
) else if /I "%OPT_HEAD%"=="/H" (
    shift 
) else if "%OPT_HEAD:~,1%"=="/" (
    set "OPT_HEAD=" 
    shift 
) else set "OPT_HEAD=" 

set "CSV_IN=%~1" 
if not defined CSV_IN (
    >&2 echo ERROR: no input file specified! 
    exit /B 1 
) 
set "CSV_OUT=%~2" 
if not defined CSV_OUT set "CSV_OUT=con" 

for /F "delims==" %%V in ('2^> nul set CELL[') do set "%%V=" 
setlocal EnableDelayedExpansion 
if not defined OPT_HEAD (
    for /F %%C in ('^< "!CSV_IN!" find /C /V ""') do set "NUM=%%C" 
) else set /A NUM=1 
set /A LIMIT=0 
< "!CSV_IN!" (
    for /L %%L in (1,1,%NUM%) do (
     set /P "LINE=" 
     call :PROCESS LINE LINE || exit /B !ErrorLevel! 
     set /A COUNT=0 
     for %%C in (!LINE!) do (
      set /A COUNT+=1 
      if not defined CELL[!COUNT!] set "CELL[!COUNT!]=%%~C" 
      if !LIMIT! LSS !COUNT! set /A LIMIT=COUNT 
     ) 
    ) 
) 
set "PAD=" & for /L %%I in (2,1,!LIMIT!) do set "PAD=!PAD!," 
> "!CSV_OUT!" (
    for /F usebackq^ delims^=^ eol^= %%L in ("!CSV_IN!") do (
     setlocal DisableDelayedExpansion 
     set "LINE=%%L%PAD%" 
     set "ROW=" 
     set /A COUNT=0 
     setlocal EnableDelayedExpansion 
     call :PROCESS LINE LINE || exit /B !ErrorLevel! 
     for %%C in (!LINE!) do (
      endlocal 
      set "CELL=%%C" 
      set /A COUNT+=1 
      setlocal EnableDelayedExpansion 
      if !COUNT! LEQ !LIMIT! (
       if defined CELL[!COUNT!] (
        for /F delims^=^ eol^= %%R in ("!ROW!,!CELL!") do (
         endlocal 
         set "ROW=%%R" 
        ) 
       ) else (
        endlocal 
       ) 
      ) else (
       endlocal 
      ) 


      setlocal EnableDelayedExpansion 
     ) 
     if defined ROW set "ROW=!ROW:~1!" 
     call :RESTORE ROW ROW || exit /B !ErrorLevel! 
     echo(!ROW! 
     endlocal 
     endlocal 
    ) 
) 
endlocal 

endlocal 
exit /B 


:PROCESS var_return var_string 
set "STRING=!%~2!" 
if defined STRING (
    set "STRING="!STRING:,=","!"" 
    if not "!STRING!"=="!STRING:**=!" goto :ERR_CHAR 
    if not "!STRING!"=="!STRING:*?=!" goto :ERR_CHAR 
) 
set "%~1=!STRING!" 
exit /B 


:RESTORE var_return var_string 
set "STRING=!%~2!" 
if "!STRING:~,1!"==^""" set "STRING=!STRING:~1!" 
if "!STRING:~-1!"==""^" set "STRING=!STRING:~,-1!" 
if defined STRING (
    set "STRING=!STRING:","=,!" 
) 
set "%~1=!STRING!" 
exit /B 


:ERR_CHAR 
endlocal 
>&2 echo ERROR: `*` and `?` are not allowed! 
exit /B 1 


:MSG_HELP 
echo(
echo("%~nx0" 
echo(
echo(This script removes any empty columns from CSV-formatted data. A column is con- 
echo(sidered as empty if the related fields in all rows are empty, unless the switch 
echo(/H is given, in which case the first line ^(so the header^) is evaluated only. 
echo(Notice that fields containing white-spaces only are not considered as empty. 
echo(
echo(
echo(USAGE: 
echo(
echo( %~nx0 [/?] [/H] csv_in [csv_out] 
echo(
echo( /?  displays this help message; 
echo( /H  specifies to regard the header only, that is the very first row, 
echo(   to determine which columns are considered as empty; if NOT given, 
echo(   the whole data, hence all rows, are taken into account instead; 
echo( csv_in CSV data file to process, that is, to remove empty columns of; 
echo(   these data must be correctly formatted CSV data, using the comma as 
echo(   separator and the quotation mark as text delimiter; regard that 
echo(   literal quotation marks must be doubled; there are some additional 
echo(   restrictions: the data must not contain any line-breaks; neither 
echo(   must they contain any asterisks nor question marks; 
echo( csv_out CSV data file to write the return data to; this must not be equal 
echo(   to csv_in; note that an already existing file will be overwritten 
echo(   without prompt; if not given, the data is displayed on the console; 
echo(
exit /B 
It works.. thank you –

Suppose your original csv looks like this:

id_users,,,quantity,,date 
1,,,1,,2013 
1,,,1,,2013 
2,,,1,,2013 

Then this one-liner should do what you ask for:

(for /f "tokens=1-3 delims=," %%a in (c:\pca.csv) do echo %%a,%%b,%%c)>c:\p.csv 

Resulting in:

id_users,quantity,date 
1,1,2013 
1,1,2013 
2,1,2013 

The trick is: consecutive delimiters are treated as one.
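A small simulation shows why this one-liner depends on the empty columns being in the same place in every row. The Python sketch below is not how `cmd.exe` is implemented; the helper `for_f_tokens` is a hypothetical stand-in that mimics only the behaviour described above, namely that a run of consecutive delimiters counts as a single delimiter, so empty fields never become tokens.

```python
import re


def for_f_tokens(line, delims=","):
    """Split a line the way `for /F "delims=,"` is described above:
    a run of consecutive delimiters acts as one, so empty fields vanish."""
    pattern = "[" + re.escape(delims) + "]+"
    return [tok for tok in re.split(pattern, line) if tok]


# Empty columns at the same positions in every row: tokens stay aligned.
print(for_f_tokens("id_users,,,quantity,,date"))
print(for_f_tokens("1,,,1,,2013"))

# A field that is empty in only one row shifts the later tokens to the left.
print(for_f_tokens("1,,,,,2013"))
```

The last call yields only two tokens, so `%%c` would be empty for that row and the year would land in the `quantity` position.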

Edit: another approach, since it turned out that there are more columns than the original question showed.

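@echo off 
rem Create the output file empty (overwriting it if it exists). 
break>c:\p.csv 
for /F "delims=" %%a in (c:\pca.csv) do call :shorten "%%a" 
goto :eof 

:shorten 
    set "line=%~1" 
:remove 
    rem Replace two consecutive commas with a single one ... 
    set "line=%line:,,=,%" 
    rem ... and repeat while the line still contains consecutive commas. 
    echo %line%|find ",,">nul && goto :remove 
    rem Append the resulting line to the output file. 
    echo %line%>>c:\p.csv 
    goto :eof 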

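break>c:\p.csv: create the output file (overwrite it if it exists);
replace two consecutive commas with a single one;
repeat this as long as there are still consecutive commas;
write the resulting line to the output file.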
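For comparison, the same replace-and-repeat loop can be written down compactly in Python. This is only a sketch: the helper name `collapse_commas` is hypothetical, and the first test row is a shortened copy of the first sample row from the question. It also makes one limitation visible: empty fields are dropped row by row, so rows whose empty fields sit in different positions end up with shifted columns.

```python
def collapse_commas(line):
    """Replace ",," with "," until no consecutive commas remain."""
    while ",," in line:
        line = line.replace(",,", ",")
    return line


# Shortened first row of the sample data from the question.
first = "P,PC,,PL,B,15feb16,P,Bay,RP,15-FEB-16,22-FEB-16,7,,,,,,11,14"
print(collapse_commas(first))

# Rows with empty fields in different places no longer line up afterwards.
print(collapse_commas("1,,3,4"))
print(collapse_commas("1,2,,4"))
```

For data like the sample in the question, where the same columns are empty in every row, the result matches the expected output; otherwise the column-aware script from the other answer is the safer choice.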

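That is not my original csv file; I have many empty columns... –

Hmm - if you don't show us your real 'csv', we have to guess. In your comment you claimed that it is comma-delimited, so I used a comma-delimited file with empty columns. Please [edit](http://stackoverflow.com/posts/37300040/edit) your question to include (part of) your actual csv file (don't copy it from an Excel sheet; instead open it in Notepad or run 'type c:\Pca.csv' and copy it from the cmd box). – Stephan

A few of the data are confidential, so I can't share them. There are 43 columns in total, and the empty columns are (3,13,14,15,16,17,27,28,29,30,31,32,33,34) –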
