How can I process fields between matching patterns?

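@fedorqui, thank you for providing all these different awk options. I've been using them to parse /var/log/messages when OOMs occur, and it works well. I'd like to extend this further, but I haven't been able to figure out how to proceed. Here is what I'm trying to do: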

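Print the lines between "rss" and "Out of memory". I've already done this using the example.
Sort the section between each pair of matches by the rss field. I haven't been able to work this out.
Add an extra column with its own header and do some math in it. I've managed this, but I ran into formatting problems. I don't know how to skip the first and last lines when adding the column so that those lines aren't lost. I also can't preserve the original spacing if I do anything other than a plain print.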

Here is the command I'm using right now:

less /var/log/messages | awk '/swapents/{x=1; print "=================="};/Out of memory/{x=0} x' | sed 's/[]\[]//g'

Here is the source data:

Sep 8 11:35:15 ip-10-23-15-70 kernel: 11810061.617265 pid uid tgid total_vm  rss nr_ptes nr_pmds swapents oom_score_adj name 
Sep 8 11:35:15 ip-10-23-15-70 kernel: 11810061.622250 1828  0 1828  4331  116  14  3  0   -1000 udevd 
Sep 8 11:35:15 ip-10-23-15-70 kernel: 11810061.627310 2664  0 2664 28002  53  23  3  0   -1000 auditd 
Sep 8 11:35:15 ip-10-23-15-70 kernel: 11810061.633181 2680  0 2680 62032  1181  24  4  0    0 rsyslogd 
Sep 8 11:35:15 ip-10-23-15-70 kernel: 11810061.638888 2694  0 2694  3444  61  11  3  0    0 irqbalance 
Sep 8 11:35:15 ip-10-23-15-70 kernel: 11810061.644912 2710 81 2710  5430  56  14  3  0    0 dbus-daemon 
Sep 8 11:35:15 ip-10-23-15-70 kernel: 11810061.651108 2779  0 2779 19958  203  42  3  0   -1000 sshd 
Sep 8 11:35:15 ip-10-23-15-70 kernel: 11810061.656670 2789  0 2789  5622  56  17  3  0    0 xinetd 
Sep 8 11:35:15 ip-10-23-15-70 kernel: 11810061.653452 Out of memory: Kill process 43390 (mysql) score 1000 or sacrifice child 
blah 
blah 
blah 
Sep 8 11:35:15 ip-10-23-15-70 kernel: 11810061.617265 pid uid tgid total_vm  rss nr_ptes nr_pmds swapents oom_score_adj name 
Sep 8 11:35:15 ip-10-23-15-70 kernel: 11810061.622250 1828  0 1828  4331  116  14  3  0   -1000 udevd 
Sep 8 11:35:15 ip-10-23-15-70 kernel: 11810061.627310 2664  0 2664 28002  53  23  3  0   -1000 auditd 
Sep 8 11:35:15 ip-10-23-15-70 kernel: 11810061.633181 2680  0 2680 62032  1181  24  4  0    0 rsyslogd 
Sep 8 11:35:15 ip-10-23-15-70 kernel: 11810061.638888 2694  0 2694  3444  61  11  3  0    0 irqbalance 
Sep 8 11:35:15 ip-10-23-15-70 kernel: 11810061.644912 2710 81 2710  5430  56  14  3  0    0 dbus-daemon 
Sep 8 11:35:15 ip-10-23-15-70 kernel: 11810061.651108 2779  0 2779 19958  203  42  3  0   -1000 sshd 
Sep 8 11:35:15 ip-10-23-15-70 kernel: 11810061.656670 2789  0 2789  5622  56  17  3  0    0 xinetd 
Sep 8 11:35:15 ip-10-23-15-70 kernel: 11810061.653452 Out of memory: Kill process 43390 (mysql) score 1000 or sacrifice child

Here is what my output looks like:

==================  0MB 
Sep 8 11:35:15 pid 0MB name   <---- should be header (Pid virt rss etc) 
Sep 8 11:35:15 1828 0MB udevd 
Sep 8 11:35:15 2664 0MB auditd 
Sep 8 11:35:15 2680 4MB rsyslogd 
Sep 8 11:35:15 2694 0MB irqbalance 
Sep 8 11:35:15 2710 0MB dbus-daemon 
Sep 8 11:35:15 2779 0MB sshd 
Sep 8 11:35:15 2789 0MB xinetd 
Sep 8 11:35:15 2822 0MB crond 
Sep 8 11:35:15 Out 0MB or   <---- should be footer (out of memory etc) 
==================  0MB 
Sep 8 11:35:15 pid 0MB name   <---- should be header (Pid virt rss etc) 
Sep 8 11:35:15 1828 0MB udevd 
Sep 8 11:35:15 2664 0MB auditd 
Sep 8 11:35:15 2680 4MB rsyslogd 
Sep 8 11:35:15 2694 0MB irqbalance 
Sep 8 11:35:15 2710 0MB dbus-daemon 
Sep 8 11:35:15 2779 0MB sshd 
Sep 8 11:35:15 2789 0MB xinetd 
Sep 8 11:35:15 2822 0MB crond 
Sep 8 11:35:15 Out 0MB or   <---- should be footer (out of memory etc) 
==================  0MB

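You can see from the output that I add a separator for each OOM block and that awk tries to compute a value for it too; I'd like to avoid that if possible. The header and footer lines are also being cut off, which I'd like to avoid as well.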

Here is what I want:

======================== 
Sep 8 11:35:15 pid rss memused_MB  oom_score_adj name 
Sep 8 11:35:15 2664 53 {rss*4/1024}   -1000 auditd 
Sep 8 11:35:15 2789 56 {rss*4/1024}    0 xinetd 
Sep 8 11:35:15 2710 56 {rss*4/1024}    0 dbus-dae 
Sep 8 11:35:15 2694 61 {rss*4/1024}    0 irqbalan 
Sep 8 11:35:15 1828 116 {rss*4/1024}   -1000 udevd 
Sep 8 11:35:15 2680 181 {rss*4/1024}    0 rsyslogd 
Sep 8 11:35:15 2779 203 {rss*4/1024}   -1000 sshd 
Sep 8 11:35:15 Out of memory: Kill process 43390 (mysql) score 1000 or sacrifice child 
======================== 
Sep 8 11:35:15 pid rss memused_MB  oom_score_adj name 
Sep 8 11:35:15 2664 53 {rss*4/1024}   -1000 auditd 
Sep 8 11:35:15 2789 56 {rss*4/1024}    0 xinetd 
Sep 8 11:35:15 2710 56 {rss*4/1024}    0 dbus-dae 
Sep 8 11:35:15 2694 61 {rss*4/1024}    0 irqbalan 
Sep 8 11:35:15 1828 116 {rss*4/1024}   -1000 udevd 
Sep 8 11:35:15 2680 181 {rss*4/1024}    0 rsyslogd 
Sep 8 11:35:15 2779 203 {rss*4/1024}   -1000 sshd 
Sep 8 11:35:15 Out of memory: Kill process 43390 (mysql) score 1000 or sacrifice child 
========================

Source

2017-09-10 user8588010

AWK solution:

$ cat tst.awk 
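/swapents/ {
    # Header line of an OOM process table: start a new block.
    x=1;
    print "=================="
    printf("%s %s %s pid\t%4s\tmemused_MB\toom_score_adj\tname\n", $1, $2, $3, "rss");
    next
}
/Out of memory/ {
    # Footer line: print the date plus everything from "Out" onward, then stop.
    printf("%s %s %s %s\n", $1, $2, $3, substr($0,index($0,$7)));
    x=0
}
x {
    # Process row: $7=pid, $11=rss, $15=oom_score_adj, $16=name.
    # rss*4/1024 converts to MB, assuming rss is counted in 4 KiB pages.
    printf("%s %s %s %s\t%4d\t%10.5f\t%13d\t%s\n", $1, $2, $3, $7, $11, ($11*4)/1024, $15, $16)
}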

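You can play with the formatting, such as the precision of column 6, by changing the format specifiers passed to printf. Invoke it like this: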

$ awk -f tst.awk /var/log/messages
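
To make the remark about printf specifiers concrete, here is a minimal sketch of how width and precision change one output row. The function name fmt_demo is hypothetical; the values (rss of 1181 and the name "irqbalance") are borrowed from the sample data, and 4 KiB pages are assumed as in the question.

```shell
# Hypothetical helper, only for illustrating printf width/precision in awk.
fmt_demo() {
    awk 'BEGIN {
        rss = 1181   # sample rss value from the question, assumed to be 4 KiB pages
        # %10.5f : width 10, 5 decimals (as in tst.awk)
        # %.2f   : 2 decimals, no padding
        # %-8.8s : left-justified, padded and truncated to 8 characters
        printf "%10.5f|%.2f|%-8.8s|\n", rss * 4 / 1024, rss * 4 / 1024, "irqbalance"
    }'
}
fmt_demo
# prints "   4.61328|4.61|irqbalan|"
```

The %-8.8s form is one way to get the truncated names ("irqbalan", "dbus-dae") shown in the desired output.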

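Edit: with sorting

The OP asked for the output to be sorted by the RSS column. Using the standard sort won't work here, because you want to sort only the lines between the start and end matches. You can solve this by saving the intermediate results in an array and sorting it with a custom comparison function (asort with a named comparison function is a GNU awk feature). Like this: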

$ cat tst2.awk
/swapents/ {
    x=1;
    print "=================="
    printf("%s %s %s pid\t%4s\tmemused_MB\toom_score_adj\tname\n", $1, $2, $3, "rss");
    next
}
/Out of memory/ {
    n=asort(a, sorted, "cmp_rss")
    for (i=1; i<=n; i++) {
        print sorted[i]
    }
    delete a;
    printf("%s %s %s %s\n", $1, $2, $3, substr($0,index($0,$7)));
    x=0
}
x {
    a[i++] = sprintf("%s %s %s %s\t%4d\t%10.5f\t%13d\t%s", $1, $2, $3, $7, $11, ($11*4)/1024, $15, $16);
}
function cmp_rss(i1, v1, i2, v2) {
    split(v1, a1, " ")
    split(v2, a2, " ")
    rss1=a1[5]; rss2=a2[5];
    return (rss1 - rss2)
}

Resulting in:

$ awk -f tst2.awk input.txt
==================
Sep 8 11:35:15 pid rss memused_MB oom_score_adj name
Sep 8 11:35:15 2664 53 0.20703 -1000 auditd
Sep 8 11:35:15 2710 56 0.21875 0 dbus-daemon
Sep 8 11:35:15 2789 56 0.21875 0 xinetd
Sep 8 11:35:15 2694 61 0.23828 0 irqbalance
Sep 8 11:35:15 1828 116 0.45312 -1000 udevd
Sep 8 11:35:15 2779 203 0.79297 -1000 sshd
Sep 8 11:35:15 2680 1181 4.61328 0 rsyslogd
Sep 8 11:35:15 Out of memory: Kill process 43390 (mysql) score 1000 or sacrifice child
==================
Sep 8 11:35:15 pid rss memused_MB oom_score_adj name
Sep 8 11:35:15 2664 53 0.20703 -1000 auditd
Sep 8 11:35:15 2710 56 0.21875 0 dbus-daemon
Sep 8 11:35:15 2789 56 0.21875 0 xinetd
Sep 8 11:35:15 2694 61 0.23828 0 irqbalance
Sep 8 11:35:15 1828 116 0.45312 -1000 udevd
Sep 8 11:35:15 2779 203 0.79297 -1000 sshd
Sep 8 11:35:15 2680 1181 4.61328 0 rsyslogd
Sep 8 11:35:15 Out of memory: Kill process 43390 (mysql) score 1000 or sacrifice child

Source

2017-09-11 05:09:02

Hey, awesome solution. This is what I used: /swapents/ { x=1; print "=================="; gsub(/\[|\]/, ""); print $0 "\t\t" $11; next } { gsub(/\[|\]/, "") } /Out of memory/ { print $0; x=0 } x { printf "%s\t\t%10.2f\n", $0, $11*4/1024 } It doesn't sort, though. Is there a way to use this on the command line without creating a file? I can't create files on the machine I'm working on. – user8588010

Added an example that sorts on the rss column. Without creating a file? Sure, just running 'awk -f tst2.awk input.txt' won't create a file for you. –

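Thanks for the sorting approach, that's cool. By not creating a file, I mean that I can't create the awk file on the machine I'll be running this on. It looks like I may have to sort externally :( Thanks for your help; this does the job for now and I can live without sorting. – user8588010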

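Based on Marc Lambrichs' response, I was able to put together this one-liner, which did the job. Thank you very much. The only thing missing now is sorting by the RSS column; I haven't been able to get the rss field to sort, though.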

less /var/log/messages|awk '/swapents/ {x=1; print "==================";gsub(/\[|\]/, "") ;printf "%s %s %s %s %10s %10s %10s %15s memory_used %-s\n", $1,$2,$3,$4,$7,$10,$11,$15,$16 ;next } {gsub(/\[|\]/, "")} /Out of memory/ {print $0 ;x=0 } x {printf "%s %s %s %s %10s %10s %10s %15s %9.2fMB %-s\n", $1,$2,$3,$4,$7,$10,$11,$15,$11*4/1024,$16}'

For readability, here is the same awk code formatted:

/swapents/ { 
    x=1; 
    print "=================="; 
    gsub(/\[|\]/, ""); 
    printf "%s %s %s %s %10s %10s %10s %15s memory_used %-s\n", $1,$2,$3,$4,$7,$10,$11,$15,$16 ; 
    next 
} 
{ 
    gsub(/\[|\]/, "") 
} 
/Out of memory/ { 
    print $0; 
    x=0 
} 
x { 
    printf "%s %s %s %s %10s %10s %10s %15s %9.2fMB %-s\n", $1,$2,$3,$4,$7,$10,$11,$15,$11*4/1024,$16 
}
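
On the missing sort by rss: one possible sketch that needs neither a script file nor GNU awk's asort is to pass the whole program inline in single quotes and keep the buffered rows ordered with a small insertion sort. The names oom_sorted, inblock, key and row are hypothetical, and the field positions ($7 = pid, $11 = rss, $15 = oom_score_adj, $16 = name) are assumed to match the sample data with the brackets already stripped.

```shell
# Hypothetical wrapper: the awk program lives in the command line, so no file is created.
oom_sorted() {
    awk '
    /swapents/ {                 # header line: start buffering a new block
        inblock = 1; n = 0
        print "=================="
        printf "%s %s %s pid rss memused_MB oom_score_adj name\n", $1, $2, $3
        next
    }
    /Out of memory/ {            # footer line: emit the buffered rows, already in rss order
        for (i = 1; i <= n; i++) print row[i]
        printf "%s %s %s %s\n", $1, $2, $3, substr($0, index($0, $7))
        inblock = 0; n = 0
        next
    }
    inblock {                    # process row: insert at its position by rss
        line = sprintf("%s %s %s %s %d %.2f %d %s", $1, $2, $3, $7, $11, $11 * 4 / 1024, $15, $16)
        # shift rows with a larger rss one slot to the right (ties keep input order)
        for (j = n; j >= 1 && key[j] > $11 + 0; j--) {
            key[j + 1] = key[j]; row[j + 1] = row[j]
        }
        key[j + 1] = $11 + 0; row[j + 1] = line
        n++
    }' "$@"
}
```

It could be called as oom_sorted /var/log/messages, or at the end of a pipeline after the sed step that strips the brackets. Insertion sort is quadratic, which should be acceptable for the few hundred rows of a typical OOM dump.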

Source

2017-09-15 01:59:53 user8588010

Added a sorting example to my answer. –
