2017-01-27 105 views
0

Batch: .TXT to .csv conversion

I need to convert a txt file whose contents look like this:

IP Address= 10.191.128.236 
  
1.3.6.1.4.1.119.2.3.69.5.1.1.1.3.1 = PX44025A 
1.3.6.1.4.1.119.2.3.69.5.1.1.1.6.1 = 10.191.128.236 
1.3.6.1.4.1.119.2.3.69.501.7.10.1.3.1 = TRP-80G1000MB-1A 
1.3.6.1.4.1.119.2.3.69.501.7.1.1.1.3.1 = BB CKT   
1.3.6.1.4.1.119.2.3.69.501.7.1.1.1.4.1 = NWA-078320-003 
1.3.6.1.4.1.119.2.3.69.501.7.1.1.1.7.1 = 3.10.09 
1.3.6.1.4.1.119.2.3.69.501.7.1.1.1.8.1 = 3.10.09 
1.3.6.1.4.1.119.2.3.69.501.7.2.1.3.1 = EXBB    
1.3.6.1.4.1.119.2.3.69.501.7.2.1.4.1 = NWA-078332-001 
1.3.6.1.4.1.119.2.3.69.501.7.2.1.5.1 = 3.51 
............................................. 
  
IP Address= 10.191.160.169 
  
Request timed out. 
............................................. 
  
IP Address= 10.191.128.242 
  
1.3.6.1.4.1.119.2.3.69.5.1.1.1.3.1 = PX44025D 
1.3.6.1.4.1.119.2.3.69.5.1.1.1.6.1 = 10.191.128.242 
1.3.6.1.4.1.119.2.3.69.501.7.10.1.3.1 = TRP-80G1000MB-1A 
1.3.6.1.4.1.119.2.3.69.501.7.1.1.1.3.1 = BB CKT   
1.3.6.1.4.1.119.2.3.69.501.7.1.1.1.4.1 = NWA-078320-003 
1.3.6.1.4.1.119.2.3.69.501.7.1.1.1.7.1 = 3.10.09 
1.3.6.1.4.1.119.2.3.69.501.7.1.1.1.8.1 = 3.10.09 
1.3.6.1.4.1.119.2.3.69.501.7.2.1.3.1 = EXBB    
1.3.6.1.4.1.119.2.3.69.501.7.2.1.4.1 = NWA-078332-001 
1.3.6.1.4.1.119.2.3.69.501.7.2.1.5.1 = 3.51 
............................................. 

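You can get sample source files from http://x.x.x.x/Convert/ if you want to test your script. The header should be built from the items before "=", and the information between "=" and "..........." should be placed on a single row (one row per IP address), as in the following example: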

IP Address,1.3.6.1.4.1.119.2.3.69.5.1.1.1.3.1,1.3.6.1.4.1.119.2.3.69.5.1.1.1.6.1,1.3.6.1.4.1.119.2.3.69.501.7.10.1.3.1,1.3.6.1.4.1.119.2.3.69.501.7.1.1.1.3.1, 1.3.6.1.4.1.119.2.3.69.501.7.1.1.1.4.1,1.3.6.1.4.1.119.2.3.69.501.7.1.1.1.7.1,1.3.6.1.4.1.119.2.3.69.501.7.1.1.1.8.1,1.3.6.1.4.1.119.2.3.69.501.7.2.1.3.1,1.3.6.1.4.1.119.2.3.69.501.7.2.1.4.1, 1.3.6.1.4.1.119.2.3.69.501.7.2.1.4.1,1.3.6.1.4.1.119.2.3.69.501.7.2.1.5.1 

10.191.128.236,PX44025A,10.191.128.236,TRP-80G1000MB-1A,BB CKT,NWA-078320-003,3.10.09,3.10.09,EXBB,NWA-078332-001,3.51 
10.191.160.169,Request timed out. 
10.191.128.242,PX44025D,10.191.128.242,TRP-80G1000MB-1A,BB CKT,NWA-078320-003,3.10.09,3.10.09,EXBB,NWA-078332-001,3.51 

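Of course the file may contain more data; the above is just an example. I tried to write my own batch file with for /f, tokens, delims and so on, but finally gave up... Can anyone help me get this working?

The output will be imported into Excel (which allows filtering the file contents).

Please see my "struggle" below: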

@echo off 
setlocal enabledelayedexpansion 
echo IP Address,1.3.6.1.4.1.119.2.3.69.5.1.1.1.3.1,1.3.6.1.4.1.119.2.3.69.5.1.1.1.6.1,1.3.6.1.4.1.119.2.3.69.501.7.10.1.3.1,1.3.6.1.4.1.119.2.3.69.501.7.1.1.1.3.1, 1.3.6.1.4.1.119.2.3.69.501.7.1.1.1.4.1,1.3.6.1.4.1.119.2.3.69.501.7.1.1.1.7.1,1.3.6.1.4.1.119.2.3.69.501.7.1.1.1.8.1,1.3.6.1.4.1.119.2.3.69.501.7.2.1.3.1,1.3.6.1.4.1.119.2.3.69.501.7.2.1.4.1, 1.3.6.1.4.1.119.2.3.69.501.7.2.1.4.1,1.3.6.1.4.1.119.2.3.69.501.7.2.1.5.1 >out.csv 
for %%i in (Input.txt) do (
    set "x=" 
    for /f "tokens=2,3,4,5 delims=:=" %%a in (Input.txt) do set x=!x!%%a %%b %%c %%d, 
    set x=!x: =! 
    set x=!x:  =! 
    set x=!x:~0,-1! 
    echo !x!>>out.csv 
) 

The problem is that I don't know how to move to the next line when necessary... Thanks in advance for your support!

+1

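Without having made any attempt to properly review the file contents, it seems fairly obvious that this kind of data conversion is not something a batch file can do easily. I would expect PowerShell to support this kind of work better; if that is an option, please add the powershell tag to your opening post to attract a more suitable audience. – Compo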

+0

What is this? An SNMP database? Are the order and length of the elements constant? – LotPings

+0

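PowerShell is an option as long as no additional software is required. Unfortunately I don't know it at all, so only ready-to-use examples can help me. Based on those I can start learning :-) Regarding order and length: the data is captured from the output of the snmp4j tool into a txt file. – MrM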

Answers

0

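My solutions below make no assumptions about the order or the number of lines in each section. They work even if the order varies or some IPs are missing values. The scripts also strip leading and trailing spaces from all values.

For performance testing, I replicated the OP's sample data to ~1.6 MB, with 3660 IP addresses.

Here is a fast and robust solution that works as long as the header fits within the 8 KB batch variable size limit. It takes 24 seconds to process the 1.6 MB file.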

@echo off 
setlocal enableDelayedExpansion 

set "input=test.txt" 
set "output=fast.csv" 

:: Clear $ variables 
for /f "delims==" %%V in ('set $ 2^>nul') do set "%%V=" 

for /f "delims== " %%N in ('findstr "^[0-9][0-9]*\." "%input%"') do set "$%%N=1" 
set "header=" 
for /f "delims=$=" %%N in ('set $') do set "header=!header!,%%N" 

>"%output%" (
    echo IP Address!header! 
    for /f "usebackq tokens=1* delims== " %%A in ("%input%") do (
    if "%%A" equ "IP" (
     set "ip=%%~nxB" 
     for %%V in (!header!) do set "$%%V=" 
    ) else if "%%A" equ "Request" (
     echo !ip:* =!,Request timed out. 
     set "ip=" 
    ) else if "%%B" equ "" (
     if "%%A" equ "............................................." if defined ip (
     set "ln=!ip:* =!" 
     for %%V in (!header!) do set "ln=!ln!,!$%%V!" 
     echo !ln! 
     set "ip=" 
    ) 
    ) else set "$%%A=%%~nxB" 
) 
) 
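
The two-pass idea behind the script above (first collect every key that appears anywhere in the file, then emit one row per IP address in header order, leaving gaps for missing values) can be sketched in Python for readers who find the batch syntax hard to follow. This is only an illustration under stated assumptions, not a port of the script: the function names `parse_blocks` and `to_csv` are made up for this sketch, and the header here keeps first-seen order, whereas the batch version takes whatever order `set $` returns.

```python
import csv
import io
import re

SAMPLE = """\
IP Address= 10.191.128.236

1.3.6.1.4.1.119.2.3.69.5.1.1.1.3.1 = PX44025A
1.3.6.1.4.1.119.2.3.69.501.7.1.1.1.3.1 = BB CKT
.............................................

IP Address= 10.191.160.169

Request timed out.
.............................................
"""


def parse_blocks(text):
    """Yield (ip, values, error) for each section closed by a dotted line."""
    ip, values, error = None, {}, None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith("IP Address"):
            # Start of a new section: remember the IP, reset the buffers.
            ip, values, error = line.partition("=")[2].strip(), {}, None
        elif re.fullmatch(r"\.+", line):
            # Separator line: the section is complete.
            if ip is not None:
                yield ip, values, error
            ip = None
        elif "=" in line:
            key, _, value = line.partition("=")
            values[key.strip()] = value.strip()
        else:
            # Anything else is treated as an error message (assumption).
            error = line


def to_csv(text):
    blocks = list(parse_blocks(text))
    # Pass 1: collect every key seen anywhere, in first-seen order.
    header = []
    for _, values, _ in blocks:
        for key in values:
            if key not in header:
                header.append(key)
    # Pass 2: one row per IP; missing values become empty fields.
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["IP Address"] + header)
    for ip, values, error in blocks:
        if error is not None:
            writer.writerow([ip, error])
        else:
            writer.writerow([ip] + [values.get(k, "") for k in header])
    return out.getvalue()


if __name__ == "__main__":
    print(to_csv(SAMPLE), end="")
```

Because the header is the union of all keys, a section that lacks some value still lines up with the right columns, which is the property the answer claims for the batch version.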

And here is a slower solution that should always work, regardless of the header size. This version took 98 seconds to process the 1.6 MB file.

@echo off 
setlocal enableDelayedExpansion 

set "input=test.txt" 
set "output=slow.csv" 

:: Clear $ variables 
for /f "delims==" %%V in ('set $ 2^>nul') do set "%%V=" 

for /f "delims== " %%N in ('findstr "^[0-9][0-9]*\." "%input%"') do set "$%%N=1" 

<nul >"%output%" (
    set /p "=IP Address" 
    for /f "delims=$=" %%N in ('set $') do set /p "=,%%N" 
    echo(
    for /f "usebackq tokens=1* delims== " %%A in ("%input%") do (
    if "%%A" equ "IP" (
     set "ip=%%~nxB" 
     for /f "delims=$=" %%N in ('set $') do set "_%%N=" 
    ) else if "%%A" equ "Request" (
     echo !ip:* =!,Request timed out. 
     set "ip=" 
    ) else if "%%B" equ "" (
     if "%%A" equ "............................................." if defined ip (
     set /p "=!ip:* =!" 
     for /f "delims=$=" %%N in ('set $') do set /p "=,!_%%N!" 
     echo(
     set "ip=" 
    ) 
    ) else set "_%%A=%%~nxB" 
) 
) 

EDIT: Here is the fast code, extensively commented

@echo off 
setlocal enableDelayedExpansion 

set "input=%~1" 
set "output=fast.csv" 

:: Clear $ variables 
for /f "delims==" %%V in ('set $ 2^>nul') do set "%%V=" 

:: Scan entire file for a list of unique header entries 
:: only look at lines that begin number followed by a dot. 
:: For each found value, define a variable named ${address}, with a value of 1 
for /f "delims== " %%N in ('findstr "^[0-9][0-9]*\." "%input%"') do set "$%%N=1" 

:: Build a comma delimited list of addresses for the header 
:: by scanning all the $ variables 
set "header=" 
for /f "delims=$=" %%N in ('set $') do set "header=!header!,%%N" 

:: Enclose remaining code in parens and redirect once for better speed. 
>"%output%" (

    %= Print out the header line =% 
    echo IP Address!header! 

    %= Parse all lines of file into two tokens, delimited by = and/or space =% 
    %= 1* means the 2nd token can include delimiters      =% 
    for /f "usebackq tokens=1* delims== " %%A in ("%input%") do (

    if "%%A" equ "IP" (
     %= IP Address line =% 
     set "ip=%%~nxB"       %= Save the IP Address =% 
     for %%V in (!header!) do set "$%%V=" %= Clear all $ variables =% 

    ) else if "%%A" equ "Request" (
     %= Request timed out. line =% 
     (echo !ip:* =!,Request timed out.) %= Write out the "timed out" line =% 
     set "ip="       %= Clear ip so no other output for this section =% 

    ) else if "%%B" equ "" ( %= Only one token =% 
     if "%%A" equ "............................................." if defined ip (
     %= Only process if end of IP Address and ip is still defined =% 

     set "ln=!ip:* =!" %= Initialize line as IP Address =% 
          %= Remove all leading text up through the first space =% 

     %= Append the value of each $variable to line, with leading comma =% 
     %= Order of values is guaranteed to match header =% 
     for %%V in (!header!) do set "ln=!ln!,!$%%V!" 

     (echo !ln!)  %= Write the data line =% 
     set "ip="  %= Clear ip so no more output until next IP Address =% 
    ) 

    ) else set "$%%A=%%~nxB" %= Main data line - Save value in $ variable   =% 
           %= ~nx treats the value as a file name and extension =% 
           %= so trailing space(s) are removed     =% 
) 
) 

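Very fast JREPL.BAT solution

Just for fun, I decided to implement a solution using JREPL.BAT, a regular-expression command-line text processor written as hybrid batch/JScript. Unlike my pure batch solutions, this JREPL solution assumes that every IP address has the same number of data lines, with the same addresses. This is not robust, but it is what most of the other answers assume.

Using JREPL.BAT, I cut the processing time for input1.txt from 4.5 seconds down to 0.8 seconds. But most of that time is spent initializing JScript. JREPL's performance really begins to shine as the input file grows. For example, my "fast" pure batch solution takes 24 seconds for the 1.6 MB test file, whereas my JREPL solution takes only 2 seconds!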

@echo off 
setlocal 

set "input=test.txt" 
set "output=jrepl.csv" 

:: Compute and write header 
call jrepl "^\d[\d.]+"^ 
      "head+=','+$0;$txt=false"^ 
      /inc "/^\d+\.//:/^\.+/"^ 
      /jbeg "var head='IP Address'"^ 
      /jend "output.WriteLine(head)"^ 
      /jmatchq /f "%input%" /o "%output%" 

:: Compute and append data 
call jrepl "^IP Address\s*=\s*([\d.]+)@^Request timed out\.@^[\d.]+\s*=\s*(.*?)\s*$@^[.]+"^ 
      "x=$2;$txt=false;@x+=','+$0;$txt=false;@x+=','+$5;$txt=false;@$txt=x"^ 
      /t @ /jmatchq /jbeg "var x" /f "%input%" >>"%output%" 
+0

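Both scripts have a problem. The header is created fine, but the body contains only the first item, followed by commas. You could test your batch file on the source files linked in the question (I just added a link to the sample files), if you have some time and the goodwill to help :-) – MrM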

+0

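@MrM - I don't know what you are doing wrong. But I copied the first solution from this answer into a new batch script and ran it against your input1.txt sample, and it worked perfectly (4.6 seconds). Although I have edited my answer several times, I made no changes to the code, only to the accompanying text. Copy my code again and retry. – dbenham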

+0

@MrM - I don't think my code has any Windows version dependency, but which version of Windows are you using? – dbenham

-1

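The output you requested is not valid CSV. All rows need to have the same number of columns. The row with the error does not, and putting the error into a random property is not a good solution. I suggest adding a Status column to solve this. I'm not that familiar with advanced batch scripting, so here is an example using PowerShell: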

#Sample data 
$str = @" 
IP Address= 10.191.160.169 
  
Request timed out. 
............................................. 
  
IP Address= 10.191.128.236 
  
1.3.6.1.4.1.119.2.3.69.5.1.1.1.3.1 = PX44025A 
1.3.6.1.4.1.119.2.3.69.5.1.1.1.6.1 = 10.191.128.236 
1.3.6.1.4.1.119.2.3.69.501.7.10.1.3.1 = TRP-80G1000MB-1A 
1.3.6.1.4.1.119.2.3.69.501.7.1.1.1.3.1 = BB CKT   
1.3.6.1.4.1.119.2.3.69.501.7.1.1.1.4.1 = NWA-078320-003 
1.3.6.1.4.1.119.2.3.69.501.7.1.1.1.7.1 = 3.10.09 
1.3.6.1.4.1.119.2.3.69.501.7.1.1.1.8.1 = 3.10.09 
1.3.6.1.4.1.119.2.3.69.501.7.2.1.3.1 = EXBB    
1.3.6.1.4.1.119.2.3.69.501.7.2.1.4.1 = NWA-078332-001 
1.3.6.1.4.1.119.2.3.69.501.7.2.1.5.1 = 3.51 
............................................. 
  
IP Address= 10.191.128.242 
  
1.3.6.1.4.1.119.2.3.69.5.1.1.1.3.1 = PX44025D 
1.3.6.1.4.1.119.2.3.69.5.1.1.1.6.1 = 10.191.128.242 
1.3.6.1.4.1.119.2.3.69.501.7.10.1.3.1 = TRP-80G1000MB-1A 
1.3.6.1.4.1.119.2.3.69.501.7.1.1.1.3.1 = BB CKT   
1.3.6.1.4.1.119.2.3.69.501.7.1.1.1.4.1 = NWA-078320-003 
1.3.6.1.4.1.119.2.3.69.501.7.1.1.1.7.1 = 3.10.09 
1.3.6.1.4.1.119.2.3.69.501.7.1.1.1.8.1 = 3.10.09 
1.3.6.1.4.1.119.2.3.69.501.7.2.1.3.1 = EXBB    
1.3.6.1.4.1.119.2.3.69.501.7.2.1.4.1 = NWA-078332-001 
1.3.6.1.4.1.119.2.3.69.501.7.2.1.5.1 = 3.51 
............................................. 
"@ 

#Read from file as single string (uncomment to use) 
#$str = Get-Content -Path C:\File.txt -Raw 

#Pattern to match every "IP address <everything until> ............" to split the Devices in the input 
$pattern = "(?ms)ip address=\s+(.+?)\s+?$.+?\s+(.+?)(?=\.{10,})" 

$devices = Select-String -InputObject $str -Pattern $pattern -AllMatches | Select-Object -ExpandProperty Matches | ForEach-Object { 
    #Foreach device 
    $obj = New-Object psobject -Property @{ 
     "IP" = $_.Groups[1].Value 
     "Status" = "OK" 
    } 

    #Get values 
    $valuearray = $_.Groups[2].Value.Split("`r`n",[StringSplitOptions]::RemoveEmptyEntries) 

    #If more than one line = status ok, convert data 
    #If not, skip to else and update status. 
    if($valuearray.Count -gt 1) { 
     $valuearray| ForEach-Object { 
      #Add values 
      $name,$val = $_.Trim() -split ' = ' 
      Add-Member -InputObject $obj -MemberType NoteProperty -Name $name -Value $val 
     } 

    } else { 
     $obj.Status = $valuearray[0] 
    } 

    #Output device 
    $obj 
} 

#Get all unique properties (if first object has only IP and Status, every other would be exported with only those without this fix) 
$PropertyList = $devices | ForEach-Object { $_.psobject.Properties | ForEach-Object { $_.Name } } | Select-Object -Unique 

$devices | Select-Object -Property $PropertyList | Export-Csv -NoTypeInformation -Path "C:\out.csv" 

Output:

"IP","Status","1.3.6.1.4.1.119.2.3.69.5.1.1.1.3.1","1.3.6.1.4.1.119.2.3.69.5.1.1.1.6.1","1.3.6.1.4.1.119.2.3.69.501.7.10.1.3.1","1.3.6.1.4.1.119.2.3.69.501.7.1.1.1.3.1","1.3.6.1.4.1.119.2.3.69.501.7.1.1.1.4.1","1.3.6.1.4.1.119.2.3.69.501.7.1.1.1.7.1","1.3.6.1.4.1.119.2.3.69.501.7.1.1.1.8.1","1.3.6.1.4.1.119.2.3.69.501.7.2.1.3.1","1.3.6.1.4.1.119.2.3.69.501.7.2.1.4.1","1.3.6.1.4.1.119.2.3.69.501.7.2.1.5.1" 
"10.191.160.169","Request timed out.",,,,,,,,,, 
"10.191.128.236","OK","PX44025A","10.191.128.236","TRP-80G1000MB-1A","BB CKT","NWA-078320-003","3.10.09","3.10.09","EXBB","NWA-078332-001","3.51" 
"10.191.128.242","OK","PX44025D","10.191.128.242","TRP-80G1000MB-1A","BB CKT","NWA-078320-003","3.10.09","3.10.09","EXBB","NWA-078332-001","3.51" 
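
The Status-column idea above does not depend on PowerShell. As a rough illustration only, here is the same shape in Python: every device becomes a record with IP and Status, the column list is the union of all property names, and records that lack a property get an empty field so that every row has the same number of columns. The function name `rows_to_csv` and the sample records are assumptions made for this sketch.

```python
import csv
import io


def rows_to_csv(devices):
    """devices: list of dicts, each with 'IP', 'Status' and any other keys."""
    # Union of all property names, in first-seen order, so that a
    # timed-out first record does not limit the exported columns.
    fields = []
    for dev in devices:
        for name in dev:
            if name not in fields:
                fields.append(name)
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=fields, restval="",
                            quoting=csv.QUOTE_ALL, lineterminator="\n")
    writer.writeheader()
    writer.writerows(devices)  # missing keys are filled with restval
    return out.getvalue()


if __name__ == "__main__":
    devices = [
        {"IP": "10.191.160.169", "Status": "Request timed out."},
        {"IP": "10.191.128.236", "Status": "OK",
         "1.3.6.1.4.1.119.2.3.69.5.1.1.1.3.1": "PX44025A"},
    ]
    print(rows_to_csv(devices), end="")
```

Collecting the field names before writing plays the same role as the $PropertyList step in the PowerShell code.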
+0

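What about using ".........." as the row delimiter, putting "Request timed out" in the second column (the first being the IP address), leaving the remaining columns of that row empty, and then moving on to the next row at ".........."? – MrM

+0

That would require you to know every property in advance. I would create something dynamic for this case, where I don't know before execution how many properties there are or what they are called. –

+1

Something not being possible is still an answer... However, I have updated the answer with another solution. –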

1
@ECHO OFF 
SETLOCAL ENABLEDELAYEDEXPANSION 
SET "sourcedir=U:\sourcedir" 
SET "destdir=U:\destdir" 
SET "filename1=%sourcedir%\q41893731.txt" 
SET "outfile=%destdir%\outfile.txt" 
:: Part one - accumulate unique column1 entries from entire file 
SET "colones=" 
FOR /f "usebackqtokens=1*delims==" %%a IN ("%filename1%") DO IF "%%b" neq "" (
ECHO "!colones!"|FIND "%%a," >NUL 
IF ERRORLEVEL 1 SET "colones=!colones!,%%a" 
) 
SET "colones=%colones:~1,-1%" 
SET "colones=%colones: ,=,%" 
>"%outfile%" ECHO(%colones% 
:: Part two - accumulate column2 entries from sections 
(
SET "coltwos=" 
FOR /f "usebackqtokens=*" %%z IN ("%filename1%") DO (
FOR /f "tokens=1*delims==" %%a IN ("%%z") DO (
    REM Is this an "IP Address" line? 
    IF "%%a"=="IP Address" (
    CALL :report 
    SET "coltwos=%%b" 
    SET "nextline=" 
    SET "nodata=Y" 
) ELSE (
    REM save line following "IP Address" line 
    IF NOT DEFINED nextline SET "nextline=%%z" 
    IF "%%b" neq "" SET "nodata="&SET "coltwos=!coltwos!,%%b" 
) 
) 
) 
CALL :report 
)>>"%outfile%" 

GOTO :EOF 

:report 
IF NOT DEFINED coltwos GOTO :EOF 
SET "coltwos=%coltwos: =%" 
IF DEFINED nodata (
ECHO(%coltwos%,%nextline% 
) ELSE (
ECHO(%coltwos% 
) 
GOTO :eof 

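You would need to change the settings of sourcedir and destdir to suit your circumstances.

I used a file named q41893731.txt containing your data for my testing.

Produces the file defined as %outfile%.

The first pass reads the column-1 entry from every line of the file that contains = and accumulates each one, ignoring duplicates. The stray leading and trailing characters are then removed, and every " ," (space followed by comma) is replaced with ",".

Since the output requirement is unclear, this is an assumption. The 501.7.2.1.4.1 entry appears to be duplicated and contains a stray space.

The second pass uses a similar technique to accumulate the column-2 contents, using the appearance of an IP Address line to signal that a section is complete and can therefore be reported.

The line directly after the IP Address line is saved because, if the section contains no entries, its content is simply reproduced.

If an entry is found, nodata is cleared to show that coltwos holds the accumulated report. If no data is found, then coltwos + nextline contain the required report data.

Note that the first pass creates the output file (hence >) and the second pass appends to that file (hence >>).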


Revision

@ECHO OFF 
SETLOCAL ENABLEDELAYEDEXPANSION 
SET "sourcedir=U:\sourcedir" 
SET "destdir=U:\destdir" 
SET "filename1=%sourcedir%\q41893731.txt" 
SET "outfile=%destdir%\outfile.txt" 
:: Part one - accumulate unique column1 entries from entire file 
SET "colones=" 
FOR /f "usebackqtokens=1*delims==" %%a IN ("%filename1%") DO IF "%%b" neq "" (
ECHO "!colones!,"|FIND "%%a," >NUL 
IF ERRORLEVEL 1 SET "colones=!colones!,%%a" 
) 
SET "colones=%colones:~1,-1%" 
SET "colones=%colones: ,=,%" 
>"%outfile%" ECHO(%colones% 
:: Part two - accumulate column2 entries from sections 
(
SET "coltwos=" 
FOR /f "usebackqtokens=*" %%z IN ("%filename1%") DO (
FOR /f "tokens=1*delims==" %%a IN ("%%z") DO (
    REM Is this an "IP Address" line? 
    IF "%%a"=="IP Address" (
    CALL :report 
    SET "coltwos=%%b" 
    SET "nextline=" 
    SET "nodata=Y" 
) ELSE (
    REM save line following "IP Address" line 
    IF NOT DEFINED nextline SET "nextline=%%z" 
    IF "%%b" neq "" SET "nodata="&SET "coltwos=!coltwos!,%%b" 
) 
) 
) 
CALL :report 
)>>"%outfile%" 

GOTO :EOF 

:report 
IF NOT DEFINED coltwos GOTO :EOF 
SET "coltwos=%coltwos: ,=,%" 
if "%coltwos%" neq "%coltwos: ,=%" GOTO report 
SET "coltwos=%coltwos:, =,%" 
if "%coltwos%" neq "%coltwos:, =%" GOTO report 
IF DEFINED nodata (
ECHO(%coltwos:~1%,%nextline% 
) ELSE (
ECHO(%coltwos:~1% 
) 
GOTO :eof 

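In light of dbenham's criticism, the repetition in the header data is eliminated by appending , to the value of colones fed to find, so that colones appears to end with a comma.

The lost-spaces problem is cured by changing how spaces are removed in the :report routine: each space-comma and comma-space pair is replaced by a single comma until no more are present, and the leading character is then removed by substringing in the echo.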
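
The header repetition fixed in the revision is easy to reproduce outside batch. The sketch below is a simplified Python model of the two FIND tests, not a translation of the script: `add_unique_buggy` and `add_unique_fixed` are hypothetical names, and the keys are shortened for readability.

```python
def add_unique_buggy(header, key):
    # Models ECHO "!colones!"|FIND "%%a,": the most recently added key
    # is not followed by a comma yet, so it is never found.
    if key + "," not in header:
        header += "," + key
    return header


def add_unique_fixed(header, key):
    # Models ECHO "!colones!,"|FIND "%%a,": pad the list with a comma
    # before searching, so the last key can be matched too.
    if key + "," not in header + ",":
        header += "," + key
    return header


def build(add, keys):
    header = ""
    for key in keys:
        header = add(header, key)
    return header[1:]  # drop the leading comma


if __name__ == "__main__":
    keys = ["a.1", "a.2", "a.1", "a.2"]  # two sections with the same keys
    print(build(add_unique_buggy, keys))
    print(build(add_unique_fixed, keys))
```

Both variants still rely on substring matching, so a key that happens to be the tail of a longer key could be wrongly treated as already present; a real list or set of keys would avoid that, but batch has no such type.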

+0

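Warning, having read the comments: the batch line length is limited to about 8K – Magoo

+0

Magoo, I just tested it and it looks like it works fine. I will analyse the results and let you know if anything is wrong. – MrM

+0

@MrM - When I run this code I see two bugs: the last value in the header is repeated at the end (one extra value), and spaces are stripped from the body values, so 'BB CKT' becomes 'BBCKT' – dbenham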

0

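You should always describe the specifications of a problem, not just give an example. The batch file solution below creates the same output you requested, but if the real data has a different format than the example data, this program will fail...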

@echo off 
setlocal EnableDelayedExpansion 

set "delim=............................................." 
call :procFile > out.csv 
goto :EOF 


:procFile 

rem Create the header 
< NUL (
set /P "=IP Address" 
for /F "skip=1" %%a in (input.txt) do (
    if "%%a" equ "%delim%" goto endHeader 
    set /P "=,%%a" 
)) 
:endHeader 
echo/ 

rem Create the rest of data 
set "data=" 
< NUL (
for /F "tokens=1,2 delims==" %%a in (input.txt) do (
    if "%%a" equ "%delim%" (
     echo !data:~1! 
     set "data=" 
    ) else (
     if defined data set /P "=!data:~1!," & set "data=" 
     if "%%b" neq "" (
     set "data=%%b" 
    ) else (
     if "%%a" neq " " set "data= %%a" 
    ) 
    ) 
)) 
exit /B 
+0

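Aacini - you are right, the format may sometimes be a little different. I just tested your script; it is very fast, but there are also some problems. Is there any chance I could send you the source files to have a look? – MrM

+0

As I said before, you should _describe specifications_! Explain in plain English where the real data differs from the example and what output is required in those cases; then give just a few examples of such cases. Edit the question to add these _new requirements_ and post a comment here to let me know once you have done so... – Aacini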

0

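Here is another pure batch solution that does not rely on any specific text in the file. It recognises separator lines consisting of periods (.), and it uses the equal-to sign (=) to separate the header from the data on each line; if no = is found, the line is either empty (or contains only a SPACE), a separator line (.), or an error message (such as Request timed out. in the sample data).

The file is read once by a for /F loop. The header is generated while the first data block is read and is not rebuilt again and again once it is complete. If the first block is invalid or incomplete, the header data cannot be collected from it, so this is done later; the header is then written to a temporary file, because data has already been written to the output file, and header and data are combined afterwards.

Your sample data contains many trailing spaces, so this script removes up to 15 of them.

Here is the code (see also the numerous explanatory rem comments):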

@echo off 
setlocal EnableExtensions DisableDelayedExpansion 

rem // Define constants here: 
set "_FILEI=%~1" & rem // (specify input file by first command line argument) 
set "_FILEO=%~2" & rem // (specify output file by second command line argument) 
set "_SEPAR=," & rem // (character to use as separator) 

rem // Get character with code `0xA0`: 
for /F %%N in ('forfiles /P "%~dp0." /M "%~nx0" /C "cmd /C echo 0xA0"') do (
    set "$A0=%%N" 
) 

rem // Process input file in sub-routine, write output file: 
> "%_FILEO%" call :PROCESS "%_FILEI%" TEMPF 
rem /* Variable `TEMPF` is usually empty, so the following code does not run; 
rem however, if the first block in the input file is invalid/incomplete, 
rem the output header is assembled belatedly, so it is stored in a temporary 
rem file and put together with the collected data at the end: */ 
if defined TEMPF (
    if exist "%TEMPF%" (
     > nul (
      copy /Y /B "%TEMPF%"+"%_FILEO%" "%TEMPF%" 
      move /Y "%TEMPF%" "%_FILEO%" 
     ) 
    ) else (
     >&2 echo ERROR: no valid data block encountered, could not build header! 
     exit /B 1 
    ) 
) 

endlocal 
exit /B 


:PROCESS val_file rtn_temp_file 
    setlocal DisableDelayedExpansion 
    rem // Initialise variables (flags and buffers): 
    set "DONE=" & set "NEXT=" & set "TMPF=" 
    set "HEAD=%_SEPAR%" & set "COLL=%_SEPAR%" 
    rem /* Read input file line by line, split at first `=` sign; 
    rem let us call the left part key and the right one value: */ 
    for /F "usebackq eol== tokens=1,* delims==" %%K in ("%~1") do (
     set "KEY=%%K" 
     set "VALUE=%%L" 
     setlocal EnableDelayedExpansion 
     rem // Remove potential trailing space from key: 
     if "!KEY:~-1!"==" " set "KEY=!KEY:~,-1!" 
     rem // Remove potential trailing character with code `0xA0` from key: 
     if "!KEY:~-1!"=="!$A0!" set "KEY=!KEY:~,-1!" 
     if defined VALUE (
      rem // Remove potential trailing spaces from value: 
      for %%N in ("        " "    " "  ") do (
       set "VALUE=!VALUE:%%~N=!" 
      ) 
      if "!VALUE:~-1!"==" " set "VALUE=!VALUE:~,-1!" 
      rem // Remove potential leading space from value: 
      if "!VALUE:~,1!"==" " set "VALUE=!VALUE:~1!" 
      rem /* Build output lines by concatenating values and also 
      rem header lines out of keys; toggle delayed expansion 
      rem in order to not lose exclamation marks; use `for /F` 
      rem loop to transport values over `endlocal` barrier: */ 
      for /F "delims=" %%M in ("!COLL!!VALUE!!_SEPAR!") do (
       if defined DONE (
        rem // Header has already been built: 
        endlocal 
        set "COLL=%%M" 
        setlocal EnableDelayedExpansion 
       ) else (
        rem // Header has not yet been built, so do it: 
        for /F "delims=" %%N in ("!HEAD!!KEY!!_SEPAR!") do (
         endlocal 
         set "HEAD=%%N" & set "COLL=%%M" 
         setlocal EnableDelayedExpansion 
        ) 
       ) 
      ) 
     ) else (
      rem /* The value is empty, so there are two possibilities: 
      rem either the `.` separator line, or an error message: */ 
      if "!KEY:.=!"=="" (
       rem // Line consists of `.` only (separator line): 
       if not defined DONE (
        rem // Header has not yet been written: 
        if not defined NEXT (
         rem // Header is not postponed: 
         if defined TMPF (
          rem // Write belated header to temporary file: 
          > "!TMPF!" ((echo(!HEAD:~1,-1!) & echo/) 
         ) else (
          rem // Normally, return header immediately: 
          ((echo(!HEAD:~1,-1!) & echo/) 
         ) 
        ) 
       ) 
       rem // Write collected output data line: 
       echo(!COLL:~1,-1! 
       endlocal 
       rem // Indicate by a flag that header has been completed: 
       if not defined NEXT set "DONE=#" 
       rem // Reset variables (flags and buffers): 
       set "NEXT=" & set "HEAD=%_SEPAR%" & set "COLL=%_SEPAR%" 
       setlocal EnableDelayedExpansion 
      ) else if defined KEY (
       rem // Line contains error message (no `=` sign found): 
       for /F "delims=" %%M in ("!COLL!!KEY!!_SEPAR!") do (
        for /F "delims=" %%N in ("!HEAD!!KEY!!_SEPAR!") do (
         endlocal 
         rem /* Header cannot be written immediately, so create 
         rem path to temporary file to receive the header: */ 
         if not defined DONE set "TMPF=%TEMP%\%~n0_%RANDOM%-1.tmp" 
         set "NEXT=#" & set "HEAD=%%N" & set "COLL=%%M" 
         setlocal EnableDelayedExpansion 
        ) 
       ) 
      ) 
     ) 
     endlocal 
    ) 
    rem // If applicable, transport temporary file path over `endlocal` barrier: 
    (
     endlocal 
     set "%~2=%TMPF%" 
    ) 
    exit /B 
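
The script above trims both ordinary spaces and the character with code 0xA0 (a non-breaking space) from the end of each key, because the two look identical in an editor but are different characters. A small Python sketch of the same pitfall, for illustration only; `clean` and `clean_ascii_only` are names invented here, and the sample key is taken from the question's data with a non-breaking space appended by hand.

```python
NBSP = "\u00a0"  # the character with code 0xA0


def clean(field):
    # str.strip() with no argument removes all Unicode whitespace,
    # which includes the non-breaking space.
    return field.strip()


def clean_ascii_only(field):
    # Stripping only the ordinary space leaves a trailing NBSP in place.
    return field.strip(" ")


if __name__ == "__main__":
    key = "1.3.6.1.4.1.119.2.3.69.5.1.1.1.3.1"
    raw = key + NBSP
    print(clean(raw) == key)             # NBSP removed
    print(clean_ascii_only(raw) == key)  # NBSP still attached
```

A key that keeps its invisible trailing character no longer compares equal to the same key read from another line, which can make comparisons and header building quietly go wrong.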
+0

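I have a problem running your script (on start it cannot find a temporary file). It creates the output, but without the header. Some extra blank space is also added (probably because it cannot get something from the temporary file). Please test the script with the source files I shared in the question (just edited), if you are willing. – MrM

+0

The problem was that your input file contains characters with code '0xA0', which I had taken for "ordinary" spaces; I have now fixed the script... Note that it does not use a temporary file, except in the (rare) case where the first block of the input file contains "Request timed out." instead of normal valid data. – aschipfl

+0

I didn't even know 0xA0 was there :-) Your script works fine now. Thank you very much for your support! Thanks also for the very useful comments in the code; they make it much easier to understand (especially for a novice like me...). – MrM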

0

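The purpose of this answer is twofold:

  • To demonstrate the power and flexibility of PowerShell, which allows for a robust, reasonably concise and readable solution that performs reasonably well.

  • To show how to invoke a PowerShell-based solution from cmd.exe (the regular command prompt), and how a wrapper batch file can make this easier.


If you create a file Transform.ps1 with the PowerShell code posted at the bottom, you can invoke it from cmd.exe (the regular command prompt) as follows to get the desired output: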

powershell -ExecutionPolicy Unrestricted -NoProfile -File .\Transform.ps1 input.txt out.csv 

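* Use -ExecutionPolicy Unrestricted with caution: only use it with scripts you trust; see Get-Help about_Execution_Policies.
* To use a character encoding other than the default, you can use the -Encoding parameter; see below.
* -NoProfile skips loading the PowerShell profile (initialization commands for interactive use).

If you create a wrapper batch script Transform.cmd with the content posted below and place it in the same directory as the PowerShell script, you can simplify the invocation to: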

Transform input.txt out.csv 

Content of the wrapper batch file Transform.cmd:

@powershell.exe -ExecutionPolicy Unrestricted -NoProfile -File "%~dpn0.ps1" %* 

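This wrapper works generically, as long as the batch file and the PowerShell script share the same file name root, as Transform.cmd and Transform.ps1 do here.


Content of Transform.ps1:

  • Compatibility: The code uses PowerShell v3+ features, but it could be adapted to work with PSv2 as well.

  • Character encoding: By default, PowerShell's Default encoding is used, which is the single-byte, extended-ASCII encoding implied by the system's legacy Windows code page setting.

    • You can change the encoding by passing -Encoding <encoding> to match that of the input file, but note that the output file will invariably use the same encoding (although that would be easy to change).
    • Accepted encoding names are Unicode, BigEndianUnicode, UTF8, UTF7, UTF32, Ascii, Default, Oem, BigEndianUTF32 (see the description of the -Encoding parameter), but note that Excel does not recognise CSV files as such in some of these encodings, notably Unicode (UTF-16LE).
  • Performance: Processing each of the sample files linked in the question takes less than 2 seconds on my late-2011 MacBook Air, so performance is probably acceptable unless the files are very large.

  • Comments: The code is commented at a high level, which should suffice as a starting point for further exploration.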

# Declare the script's parameters. 
[CmdletBinding(PositionalBinding=$false)] 
param(
    [Parameter(Mandatory, Position=1)] [string] $InPath, 
    [Parameter(Mandatory, Position=2)] [string] $OutPath, 
    [Microsoft.PowerShell.Commands.FileSystemCmdletProviderEncoding] $Encoding = 'Default' 
) 

# We anticipate no errors, so let's treat even non-terminating ones as 
# terminating. 
$ErrorActionPreference = 'Stop' 

# The line that separates records. 
$recSeparator = '.............................................' 

# Get the first record as a single string, so we can later extract the 
# CSV column headers from it. 
Write-Verbose "Parsing first record to determine header names..." 
$firstRec = Get-Content $InPath -Delimiter $recSeparator -Encoding $Encoding | 
    Select-Object -First 1 

# Split the first record into an array of lines that contain "=" 
$firstRecLines = ($firstRec -split '\r?\n') -match '=' 

# Synthesize the CSV header line from the array of header names obtained from 
# the LHS of the "=" on each line. 
$colHeaderList = ($firstRecLines -replace '^([^=]+).*', '$1').Trim() -join ',' 

# Write the header line plus a blank line to the output file. 
# The output encoding. 
Write-Verbose "Writing header names to '${OutPath}' using encoding '${Encoding}'.." 
$colHeaderList + [Environment]::NewLine | Set-Content $OutPath -Encoding $Encoding 

# Now process all records and add a line of data fields (only) for each. 
Write-Verbose "Processing all records and appending data rows..." 
Get-Content $InPath -Delimiter $recSeparator -Encoding $Encoding | ForEach-Object { 
    # Split the record at hand into an array of lines... 
    $lines = $_ -split '\r?\n' 
    # ... and extract the values (the RHS of "=") as an array. 
    $values = $lines -match '=' -replace '^.*= (.*)', '$1' 
    # Process the values based on how many were found in the record. 
    switch ($values.count) { 
    0 { return } # Assumed to be the empty record at the end -> ignore 
    1 { # Only 1 value? -> must be a "request timed out" record. 
     $valuesList = $values[0].Trim() + ',' + $lines[-2] 
     break 
    } 
    default { # regular data record 
     # Join the trimmed values with ',' to form a CSV data line. 
     $valuesList = ($values).Trim() -join ',' 
    } 
    } 
    # Append the data line to the output file. 
    # The output encoding. 
    Add-Content -Value $valuesList $OutPath -Encoding $Encoding 
} 

Write-Verbose "Processing to output file '${OutPath}' completed successfully" 
0
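  1. Read the file and do some specific removal of blank lines/newlines.
  2. Then strip out the "Request timed out" lines, because they break the block pattern.
  3. Regex-massage the blocks into PowerShell hashtable/object shape.
  4. eval (iex) turns them into live objects, which are then converted to CSV.
  5. Append the "Request timed out" lines to the end of the CSV.
  6. It is already regex-heavy and hard to understand, and the question has been answered and accepted, so why not code-golf it.

~339 characters: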

$T,$B=(((gc -raw Input2.txt)-replace"(?m)^\s*`r`n"-replace 
"(?i)\s*`r`n(?=R)",", ")-split"`r`n").where({$_-match'timed out'},'split') 
((($B-join"`r`n")-replace"(?m)^(.+?)\s*=\s*(.+?)\s+$","'`$1'='`$2';"-replace 
"(?m)^\.[.`r`n]+","},"-replace"'IP","[PSCustomObject]@{'IP").TrimEnd(",")| 
iex|ConvertTo-Csv -N)+@($T-replace'^.*= ')|sc out.csv 

Sample output:

"IP Address","1.3.6.1.4.1.119.2.3.69.5.1.1.1.3.1","1.3.6.1.4.1.119.2.3.69.5.1.1.1.6.1","1.3.6.1.4.1.119.2.3.69.501.7.10.1.3.1","1.3.6.1.4.1.119.2.3.69.501.7.1.1.1.3.1","1.3.6.1.4.1.119.2.3.69.501.7.1.1.1.4.1","1.3.6.1.4.1.119.2.3.69.501.7.1.1.1.7.1","1.3.6.1.4.1.119.2.3.69.501.7.1.1.1.8.1" 
"10.192.6.199","PI13217A","10.192.6.199","MDP-400MB-1AA","MC-A4","NWA-055298-101","3.00.37","3.00.37" 
"10.192.28.73","PI11747A","10.192.28.73","MDP-400MB-1AA","MC-A4","NWA-055298-101","3.00.37","3.00.37" 
"10.192.28.74","PI12844A","10.192.28.74","MDP-400MB-1AA","MC-A4","NWA-055298-101","3.00.37","3.00.37" 
"10.192.28.75","PI12604A","10.192.28.75","MDP-400MB-1AA","MC-A4","NWA-055298-101","3.02.20","3.02.20" 
"10.192.28.78","PI14189A","10.192.28.78","MDP-400MB-1AA","MC-A4","NWA-055298-101","3.00.37","3.00.37" 
10.192.15.137, Request timed out. 
10.192.16.144, Request timed out. 
10.192.136.201, Request timed out. 
10.192.1.199, Request timed out. 
10.192.153.132, Request timed out. 

(Yes, the last lines are not quoted in the same way, but Excel 2013 handles it)