2014-01-19

Processing a text file with awk, sed and grep

My input file:

20110512075615 Constanta 1.0041 1013.41 9999.0 0 0.0 0 
20110512075630 Constanta 1.0021 1013.45 9999.0 0 0.0 0 
20110512075645 Constanta 1.0031 1013.47 9999.0 0 0.0 0 
20110512075700 Constanta 1.0018 1013.47 9999.0 0 0.0 0 
20110512075730 Constanta 1.0038 1013.48 9999.0 0 0.0 0 
20110512075745 Constanta 1.0023 1013.48 9999.0 0 0.0 0 
20110512075800 Constanta 9999.0000 1013.46 13.2 0 0.0 0 
20110512075815 Constanta 1.0038 1013.45 13.2 0 0.0 0 
20110512075830 Constanta 1.0040 1013.50 13.2 0 0.0 0 
20110512075845 Constanta 1.0034 1013.50 13.2 0 0.0 0 
20110512075900 Constanta 1.0050 1013.45 13.2 0 0.0 0 
20110512075915 Constanta 1.0060 1013.48 13.2 0 0.0 0 
20110512075930 Constanta 1.0056 1013.45 13.2 0 0.0 0 
20110512080000 Constanta 1.0066 1013.50 13.2 0 0.0 0 
20110512080015 Constanta 1.0067 1013.49 13.2 0 0.0 0 
20110512080100 Constanta 1.0065 1013.48 13.2 0 0.0 0 
20110512080115 Constanta 9999.0000 1013.51 13.2 0 0.0 0 
20110512080130 Constanta 1.0065 1013.51 13.2 0 0.0 0 
20110512080145 Constanta 1.0079 1013.49 13.2 0 0.0 0 
20110512080200 Constanta 1.0072 1013.51 13.2 0 0.0 0 
20110512080215 Constanta 1.0084 1013.51 13.2 0 0.0 0 

My output file:

YY/MM/DD HH -Level- Atm.Prs -Tw- 
    201105120757  1.0018 1013.47 9999.0  0 0.0  0 
    201105120759  1.0050 1013.45 13.2  0 0.0  0 
    201105120800 9999.0000  1.0066 1013.50 13.2  0 0.0  0 
    201105120801  1.0065 1013.48 13.2  0 0.0  0 
    201105120802 9999.0000  1.0072 1013.51 13.2  0 0.0  0 

My code:

#! /bin/bash 
    FILE="Constanta20110513.txt" 
    # 1) remove column two(='Constanta') 
    awk '{$2="";print}' $FILE | column -t > tmpfile 
    # 2) remove lines with '9999.0000' 
    cat tmpfile | sed -e '/9999.[0-9]/d' >> final.tmp 
    # 3) remove first three lines 
    awk 'NR>3' final.tmp >> myfile.tmp 
    # 4) count lines between '....00' and '....00': 
    #if >= 3, keep only the line with '...00' and delete the other lines 
    #if < 3, do the same, and put '9999' on column two 

    output=$(grep -n '00\s*$' myfile.tmp | sed 's/\s*$/ /') 
    array=($output $(cat myfile.tmp | wc -l)) 

    for ((i=0; i<${#array[@]}-1; i++)); do 
    index1=$(echo "${array[$i]}" | grep -o '^[0-9]*') 
    index2=$(echo "${array[$i+1]}" | grep -o '^[0-9]*') 

    if [ $((index2 - index1)) -ge 3 ]; then 
     echo $(echo "${array[$i]}" | grep -o '[0-9]*$') >> temp.tmp 
    else 
     echo $(echo "${array[$i]}" | grep -o '[0-9]*$') 9999.0000 >> temp.tmp 
    fi 

    done 

    # 5) delete last two characters from first column(=00) 
    awk '{sub(/..$/,"",$1)} 1' temp.tmp >> output.tmp 
    # 6) insert header 
    echo 'YY/MM/DD HH -Level- Atm.Prs -Tw-' | cat - output.tmp >> output2.tmp 
    #save 
    mv output2.tmp $FILE 

My problem is with step 4: it does not work, and the temporary file temp.tmp is never created. I think the problem is here: grep -n '00\s*$' myfile.tmp | sed 's/\s*$/ /'
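That suspicion can be checked in isolation. The pattern '00\s*$' looks for 00 at the end of the line, but the timestamp sits in the first column and every line ends in " 0", so it matches nothing. With no matches, array holds only the line count, the loop condition i<${#array[@]}-1 is false on the first pass, and temp.tmp is never written. Below is a minimal sketch, assuming a hypothetical three-line sample in the same layout as myfile.tmp. It compares the original pattern with one anchored to the first column:

```shell
#!/bin/bash
# Hypothetical sample standing in for myfile.tmp (timestamp first, then values).
sample=$(mktemp)
cat > "$sample" <<'EOF'
20110512075645 1.0031 1013.47 9999.0 0 0.0 0
20110512075700 1.0018 1013.47 9999.0 0 0.0 0
20110512075730 1.0038 1013.48 9999.0 0 0.0 0
EOF

# Original idea: "00" at the end of the LINE ([[:space:]] is the portable \s).
# grep -c exits non-zero when the count is 0, hence the "|| true".
end_matches=$(grep -c '00[[:space:]]*$' "$sample" || true)

# Anchored to the first column: 12 digits, then "00", then a space.
col_matches=$(grep -c '^[0-9]\{12\}00 ' "$sample" || true)

echo "end-of-line pattern:  $end_matches match(es)"
echo "first-column pattern: $col_matches match(es)"
rm -f "$sample"
```

On this sample the end-of-line pattern reports 0 matches and the first-column pattern reports 1.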

Thank you very much in advance.


Could you show us the output you want? – Beta

Answers


Here are steps #1 to #3 in one go:

awk '{$2=""; sub(/  /," ")} !/9999\.0000/ && t++>2' "$FILE"
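To see what a one-liner of this shape does, here is a minimal sketch on a hypothetical six-row sample in the question's layout, with the third row altered to carry the 9999.0000 sentinel. Two details are assumptions rather than part of the thread: the pattern matches only the literal 9999.0000 (a looser 9999.[0-9] would also drop rows whose fifth column is 9999.0), and the sub() collapses the double space left behind when column 2 is blanked.

```shell
#!/bin/bash
# Hypothetical sample; row 3 has been altered to hold the sentinel value.
sample=$(mktemp)
cat > "$sample" <<'EOF'
20110512075615 Constanta 1.0041 1013.41 9999.0 0 0.0 0
20110512075630 Constanta 1.0021 1013.45 9999.0 0 0.0 0
20110512075645 Constanta 9999.0000 1013.47 9999.0 0 0.0 0
20110512075700 Constanta 1.0018 1013.47 9999.0 0 0.0 0
20110512075730 Constanta 1.0038 1013.48 9999.0 0 0.0 0
20110512075745 Constanta 1.0023 1013.48 9999.0 0 0.0 0
EOF

# Step 1: blank column 2, then collapse the double space that leaves behind.
# Steps 2+3: skip sentinel rows, then skip the first three surviving rows
# (t is only incremented for rows that pass the sentinel test).
result=$(awk '{ $2 = ""; sub(/  /, " ") } !/9999\.0000/ && t++ > 2' "$sample")

echo "$result"
rm -f "$sample"
```

On this sample only the 075730 and 075745 rows are printed, each without the Constanta column.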

I'm not sure what you want step 4 to do; could you make it a bit clearer?


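In step 4, I want to count the lines between the lines whose first column ends in '00' (its last two characters). If the count is >= 3, keep only the line ending in '00' and delete the others; if the count is < 3, do the same, but insert '9999.0000' in the second column. – tuxman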


I built on Jotne's work for #1-3 and added a function to handle #4. Put the following in an executable file (I called it awko) and run it like awko Constanta20110513.txt

#!/usr/bin/awk -f 

BEGIN { print "YY/MM/DD HH -Level- Atm.Prs -Tw-" } 

# absorb jotne's work for #1-3 more or less 
{ $2 = ""; sub(/  /, " ") }
/9999\.0000/ || NR <= 3 { next }

/^[0-9]{12}00/ { output_line() } # deal with the "00" lines 

END { output_line() } # output the final "00" stored in last 

function output_line() {
    if (last_nr != 0) {
        if (NR - last_nr < 3) {
            temp = $0          # save off the current line
            $0 = last          # reset it to the last "00" line
            $2 = "9999.0000"   # make $2 what you want
            print $0
            $0 = temp          # restore $0 from temp
        } else {
            print last
        }
    }
    $1 = substr($1, 1, 12)     # drop the "00" from $1
    last = $0; last_nr = NR    # store some variables
}

From the input you specified, I get the following output:

YY/MM/DD HH -Level- Atm.Prs -Tw- 
201105120757 1.0018 1013.47 9999.0 0 0.0 0 
201105120759 1.0050 1013.45 13.2 0 0.0 0 
201105120800 9999.0000 1013.50 13.2 0 0.0 0 
201105120801 1.0065 1013.48 13.2 0 0.0 0 
201105120802 9999.0000 1013.51 13.2 0 0.0 0 
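
One portability note on the /^[0-9]{12}00/ pattern in the script: the {12} interval expression is not honoured by every awk (gawk before 4.0, for example, needs --re-interval). A minimal sketch of an equivalent test that avoids it, assuming the timestamp is always 14 digits; the function name is_minute_mark and the sample rows are hypothetical:

```shell
#!/bin/bash
# Print first-column timestamps that end in "00", without using {12}.
# Reads the named files, or standard input when called with no arguments.
is_minute_mark() {
    awk 'length($1) == 14 && $1 ~ /^[0-9]+00$/ { print $1 }' "$@"
}

# Hypothetical rows: only the second and third end in "00".
marks=$(printf '%s\n' \
    '20110512075645 1.0031' \
    '20110512075700 1.0018' \
    '20110512080000 1.0066' | is_minute_mark)

echo "$marks"
```

The same condition could replace the regex in front of output_line(), since $1 still holds the full 14-digit timestamp at that point.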

@n0741337 Perfect! Thank you very, very much! – tuxman


@tuxman - I'm glad it worked for you. – n0741337