2016-12-13 45 views
1

How to extract the last block from a file

I have a file containing many blocks like this:

==9673== 
==9673== HEAP SUMMARY: 
==9673==  in use at exit: 0 bytes in 0 blocks 
==9673== total heap usage: 75,308 allocs, 75,308 frees, 7,099,382 bytes allocated 
==9673== 
==9673== All heap blocks were freed -- no leaks are possible 
==9673== 
==9673== For counts of detected and suppressed errors, rerun with: -v 
==9673== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0) 
.... 
.... 
.... 
.... 

==9655== 
==9655== HEAP SUMMARY: 
==9655==  in use at exit: 0 bytes in 0 blocks 
==9655== total heap usage: 75,308 allocs, 75,308 frees, 7,099,382 bytes allocated 
==9655== 
==9655== All heap blocks were freed -- no leaks are possible 
==9655== 
==9655== For counts of detected and suppressed errors, rerun with: -v 
==9655== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0) 

.... 
.... 
.... 

==9699== 
==9699== HEAP SUMMARY: 
==9699==  in use at exit: 0 bytes in 0 blocks 
==9699== total heap usage: 75,308 allocs, 75,308 frees, 7,099,382 bytes allocated 
==9699== 
==9699== All heap blocks were freed -- no leaks are possible 
==9699== 
==9699== For counts of detected and suppressed errors, rerun with: -v 
==9699== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0) 

I want to extract the last block, which starts with the line:

==XXXX== HEAP SUMMARY: 

So in my example I want to extract only the last block:

==9699== HEAP SUMMARY: 
==9699==  in use at exit: 0 bytes in 0 blocks 
==9699== total heap usage: 75,308 allocs, 75,308 frees, 7,099,382 bytes allocated 
==9699== 
==9699== All heap blocks were freed -- no leaks are possible 
==9699== 
==9699== For counts of detected and suppressed errors, rerun with: -v 
==9699== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0) 

How can I do this with bash?

+1

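[edit] your input to get rid of all the `...`s and make it a concrete, testable example. The text between the blocks is just as important as the blocks themselves. For example, if there really is a blank line between each block, then all you need is `awk -v RS= '{s=$0} END{print s}' file`, and if every block is 8 lines, all you need is `tail -8 file`, but I don't know whether either of those is really how your input is formatted.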

Answers

1

Using grep -zoP with a negative lookahead regex:

grep -zoP '==\w{4}== HEAP SUMMARY:(?![\s\S]*==\w{4}== HEAP SUMMARY:)[\s\S]*\z' file 

==9699== HEAP SUMMARY: 
==9699==  in use at exit: 0 bytes in 0 blocks 
==9699== total heap usage: 75,308 allocs, 75,308 frees, 7,099,382 bytes allocated 
==9699== 
==9699== All heap blocks were freed -- no leaks are possible 
==9699== 
==9699== For counts of detected and suppressed errors, rerun with: -v 
==9699== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0) 
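  • -z treats the input as NUL-terminated data rather than newline-terminated data, so the whole file is matched as a single record
  • (?![\s\S]*==\w{4}== HEAP SUMMARY:) is a negative lookahead asserting that there is no further instance of the same header later in the file.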

RegEx Demo
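
One detail worth knowing when scripting this: with -z, grep also terminates each output record with a NUL byte, so the match arrives with a trailing NUL. A minimal sketch that strips it, assuming GNU grep built with PCRE support; `last_block_grep` and the shortened sample data are made up for illustration:

```shell
#!/usr/bin/env bash
# Sketch only: requires GNU grep with -P (PCRE) support.
# last_block_grep is a hypothetical helper name, not part of the answer above.
last_block_grep() {
    # Same regex as in the answer; tr removes the NUL that -z appends to the match.
    grep -zoP '==\w{4}== HEAP SUMMARY:(?![\s\S]*==\w{4}== HEAP SUMMARY:)[\s\S]*\z' "$1" |
        tr -d '\0'
}

# Made-up sample: two shortened blocks with filler text between them.
sample=$(mktemp)
trap 'rm -f "$sample"' EXIT
cat > "$sample" <<'EOF'
==9655==
==9655== HEAP SUMMARY:
==9655== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)
....
==9699==
==9699== HEAP SUMMARY:
==9699==  in use at exit: 0 bytes in 0 blocks
==9699== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)
EOF

last_block_grep "$sample"
```

Without the `tr`, newer bash versions warn about an ignored null byte when the output is captured with `$(...)`.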

1

If you have tac, this is probably the simplest:

$ tac file | awk '1; /==....== HEAP SUMMARY/{exit}' | tac 
1

If you know the block is always 9 lines long, you can simply use tail:

tail -n9 file 
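
If the block length is not fixed, the starting line can be looked up instead of hard-coded. A rough sketch, assuming the last block runs to the end of the file; `last_block_tail` and the shortened sample data are made up for illustration:

```shell
#!/usr/bin/env bash
# last_block_tail is a hypothetical helper name used only for this sketch.
last_block_tail() {
    local file=$1 start
    # grep -n prefixes each match with "LINE:"; keep the line number of the last match.
    start=$(grep -n 'HEAP SUMMARY:' "$file" | tail -n 1 | cut -d: -f1)
    # No header found: print nothing and report failure.
    [ -n "$start" ] || return 1
    # tail -n +N prints from line N through the end of the file.
    tail -n +"$start" "$file"
}

# Made-up sample: two shortened blocks with filler text between them.
sample=$(mktemp)
trap 'rm -f "$sample"' EXIT
cat > "$sample" <<'EOF'
==9655==
==9655== HEAP SUMMARY:
==9655== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)
....
==9699==
==9699== HEAP SUMMARY:
==9699==  in use at exit: 0 bytes in 0 blocks
==9699== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)
EOF

last_block_tail "$sample"
```

This reads the file twice, which is usually fine for log-sized input.
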
1

With sed:

$ sed -n '/HEAP SUMMARY/{:a;/ERROR SUMMARY/bb;N;ba;:b;$p;d}' infile 
==9699== HEAP SUMMARY: 
==9699==  in use at exit: 0 bytes in 0 blocks 
==9699== total heap usage: 75,308 allocs, 75,308 frees, 7,099,382 bytes allocated 
==9699== 
==9699== All heap blocks were freed -- no leaks are possible 
==9699== 
==9699== For counts of detected and suppressed errors, rerun with: -v 
==9699== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0) 

Here is how it works:

sed -n '     # Do not print lines at end of each cycle 
    /HEAP SUMMARY/ {  # If line matches "HEAP SUMMARY" 
     :a     # Label to jump back to 
     /ERROR SUMMARY/bb # If line matches "ERROR SUMMARY", jump to :b 
     N     # Append next line to pattern space 
     ba     # Jump to :a 
     :b     # Label to jump forward to 
     $p     # If we are on the last line, print pattern space 
     d     # Delete pattern space 
    } 
' infile 

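Each time it encounters HEAP SUMMARY, it reads all lines up to the next ERROR SUMMARY into the pattern space. It then checks whether it has reached the last line; if so, it prints the pattern space, otherwise it deletes it.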
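
Note that the `$p` test means nothing is printed if any line follows the final ERROR SUMMARY. The same buffering idea can be written in awk without that restriction. This is a different tool from the sed answer, offered as a sketch; `last_block_awk` and the shortened sample data are made up for illustration:

```shell
#!/usr/bin/env bash
# last_block_awk is a hypothetical helper name used only for this sketch.
last_block_awk() {
    awk '
        /HEAP SUMMARY:/  { buf = ""; inblock = 1 }  # new block starts: drop the previous one
        inblock          { buf = buf $0 ORS }       # collect lines while inside a block
        /ERROR SUMMARY:/ { inblock = 0 }            # the ERROR SUMMARY line closes the block
        END              { printf "%s", buf }       # only the last block is still buffered
    ' "$1"
}

# Made-up sample: two shortened blocks, with filler after the last one.
sample=$(mktemp)
trap 'rm -f "$sample"' EXIT
cat > "$sample" <<'EOF'
==9655== HEAP SUMMARY:
==9655== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)
....
==9699==
==9699== HEAP SUMMARY:
==9699==  in use at exit: 0 bytes in 0 blocks
==9699== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)
....
EOF

last_block_awk "$sample"
```

The trailing `....` line in the sample is skipped because it arrives after the block has been closed.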

0

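If the last line of the file also carries the block number, this fetches that number quickly (without reading the whole file to find out which number it is):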

n="$(tail -n1 infile | awk '{print $1}')" 

Then we can select all the lines at the end of the file that carry that block number:

tac infile | awk -vn="$n" '!($1~n){exit};1'| tac 
0

This might work for you (GNU sed):

sed '/HEAP SUMMARY:/h;//!H;$!d;x' file 

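On encountering HEAP SUMMARY:, replace whatever is in the hold space (HS) with the current line. For any other line, append the line to the HS. Delete every line except the last; on the last line, swap the pattern space (PS) with the HS and print the PS.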
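
Because the hold space keeps everything from the last header to the end of the file, any text after the final ERROR SUMMARY is printed too. A small sketch that trims it with a second sed, assuming every block ends with an ERROR SUMMARY line; `last_block_sed` and the shortened sample data are made up for illustration:

```shell
#!/usr/bin/env bash
# last_block_sed is a hypothetical helper name used only for this sketch.
last_block_sed() {
    # First sed: the one-liner from the answer (last header through end of file).
    # Second sed: print up to and including the ERROR SUMMARY line, then quit.
    sed '/HEAP SUMMARY:/h;//!H;$!d;x' "$1" | sed '/ERROR SUMMARY:/q'
}

# Made-up sample: two shortened blocks, with filler after the last one.
sample=$(mktemp)
trap 'rm -f "$sample"' EXIT
cat > "$sample" <<'EOF'
==9655== HEAP SUMMARY:
==9655== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)
....
==9699==
==9699== HEAP SUMMARY:
==9699==  in use at exit: 0 bytes in 0 blocks
==9699== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)
....
EOF

last_block_sed "$sample"
```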

0

Using the ID/group number at the start of each line:

id=$(tail -n1 file | grep -Po '(?<=\=\=)[0-9]*') && grep "$id" file |tail -n+2