2016-06-11 12 views
2

Read a log file, extract individual fields, and count occurrences

I have a log file like this:

2016-06-11 07:34:01.542 98459 [Thread-23-Job-parser-bolt] INFO JobLoader c.t.commerce.common.utils.APIUtils - Enrichment data updated successful for partnumber : 13794017 with status : 201 
2016-06-11 07:34:01.542 98459 [Thread-23-Job-parser-bolt] INFO JobLoader c.t.commerce.common.utils.APIUtils - Enrichment data updated successful for partnumber : 13794017 with status : 201 
2016-06-11 07:34:01.542 98459 [Thread-23-Job-parser-bolt] INFO JobLoader c.t.commerce.common.utils.APIUtils - Enrichment data updated successful for partnumber : 13794017 with status : 201 
2016-06-11 07:34:01.542 98459 [Thread-23-Job-parser-bolt] INFO JobLoader c.t.c.w.b.JobParserBolt - execute() 13794017 itemStatus itemsProcessed=1, itemsUpdated=1, timeTaken=1.808 sec 
2016-06-11 07:34:01.542 98459 [Thread-23-Job-parser-bolt] INFO JobLoader c.t.c.w.b.JobParserBolt - execute() 13794017 itemStatus itemsProcessed=1, itemsUpdated=1, timeTaken=1.808 sec 
2016-06-11 07:34:01.542 98459 [Thread-23-Job-parser-bolt] INFO JobLoader c.t.c.w.b.JobParserBolt - execute() 13794017 itemStatus itemsProcessed=1, itemsUpdated=1, timeTaken=1.808 sec 
2016-06-11 07:34:01.542 98459 [Thread-23-Job-parser-bolt] INFO JobLoader c.t.c.w.b.JobParserBolt - execute() Scene7 update for 13794017 itemStatus itemsProcessed=1, itemsUpdated=1, timeTaken= 
2016-06-11 07:34:01.542 98459 [Thread-23-Job-parser-bolt] INFO JobLoader c.t.c.w.b.JobParserBolt - execute() Scene7 update for 13794017 itemStatus itemsProcessed=1, itemsUpdated=1, timeTaken= 
2016-06-11 07:34:01.542 98459 [Thread-23-Job-parser-bolt] INFO JobLoader c.t.c.w.b.JobParserBolt - execute() Scene7 update for 13794017 itemStatus itemsProcessed=1, itemsUpdated=1, timeTaken= 
2016-06-11 07:34:01.543 98460 [Thread-23-Job-parser-bolt] INFO JobLoader c.t.c.w.b.JobParserBolt - execute EXIT 
2016-06-11 07:34:01.543 98460 [Thread-23-Job-parser-bolt] INFO JobLoader c.t.c.w.b.JobParserBolt - execute EXIT 
2016-06-11 07:34:01.543 98460 [Thread-23-Job-parser-bolt] INFO JobLoader c.t.c.w.b.JobParserBolt - execute EXIT 
2016-06-11 07:34:01.542 98459 [Thread-23-Job-parser-bolt] INFO JobLoader c.t.commerce.common.utils.APIUtils - Enrichment data updated successful for partnumber : 17696532 with status : 500 
2016-06-11 07:34:01.542 98459 [Thread-23-Job-parser-bolt] INFO JobLoader c.t.commerce.common.utils.APIUtils - Enrichment data updated successful for partnumber : 17696532 with status : 500 
2016-06-11 07:34:01.542 98459 [Thread-23-Job-parser-bolt] INFO JobLoader c.t.commerce.common.utils.APIUtils - Enrichment data updated successful for partnumber : 17696532 with status : 500 
2016-06-11 07:34:01.542 98459 [Thread-23-Job-parser-bolt] INFO JobLoader c.t.c.w.b.JobParserBolt - execute() 17696532 itemStatus itemsProcessed=1, itemsUpdated=1, timeTaken=1.808 sec 
2016-06-11 07:34:01.542 98459 [Thread-23-Job-parser-bolt] INFO JobLoader c.t.c.w.b.JobParserBolt - execute() 17696532 itemStatus itemsProcessed=1, itemsUpdated=1, timeTaken=1.808 sec 
2016-06-11 07:34:01.542 98459 [Thread-23-Job-parser-bolt] INFO JobLoader c.t.c.w.b.JobParserBolt - execute() 17696532 itemStatus itemsProcessed=1, itemsUpdated=1, timeTaken=1.808 sec 
2016-06-11 07:34:01.542 98459 [Thread-23-Job-parser-bolt] INFO JobLoader c.t.c.w.b.JobParserBolt - execute() Scene7 update for 17696532 itemStatus itemsProcessed=1, itemsUpdated=1, timeTaken= 
2016-06-11 07:34:01.542 98459 [Thread-23-Job-parser-bolt] INFO JobLoader c.t.c.w.b.JobParserBolt - execute() Scene7 update for 17696532 itemStatus itemsProcessed=1, itemsUpdated=1, timeTaken= 
2016-06-11 07:34:01.542 98459 [Thread-23-Job-parser-bolt] INFO JobLoader c.t.c.w.b.JobParserBolt - execute() Scene7 update for 17696532 itemStatus itemsProcessed=1, itemsUpdated=1, timeTaken= 
2016-06-11 07:34:01.543 98460 [Thread-23-Job-parser-bolt] INFO JobLoader c.t.c.w.b.JobParserBolt - execute EXIT 
2016-06-11 07:34:01.543 98460 [Thread-23-Job-parser-bolt] INFO JobLoader c.t.c.w.b.JobParserBolt - execute EXIT 
2016-06-11 07:34:01.543 98460 [Thread-23-Job-parser-bolt] INFO JobLoader c.t.c.w.b.JobParserBolt - execute EXIT 

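There are hundreds of log lines like these, with different partnumbers and status codes, and many lines are repeated. I want to store the distinct partnumbers whose status code is anything other than 201 in a separate file so that we can monitor them easily. I also want a count of the partnumbers that succeeded with status 201. So for this sample, the output should look like: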

No. of partnumbers with Status 201: 1 
Partnumbers with Status 500: 17696532, ... , ... 
Partnumbers with Status 401: ... ,... 

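I started with awk, but the parsing turned out not to be that easy. Also note that the same partnumber appears multiple times. How can I add a check so that I don't count a single partnumber more than once?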

My code so far:

awk -F'Enrichment data updated successful for partnumber :' '{print $2}' file.log |rev | cut -c 4- | rev 

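My idea was to extract the part numbers this way first, but I couldn't work out how to skip duplicate part numbers or how to associate each part number with its status code.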

Answers

2

Here is a solution using awk. See the inline comments for an explanation.

awk '/Enrichment data updated successful for partnumber/ { 
    # store the results as a multidimensional array with the first key 
    # being the status and the key of the second array being the product 
    # number. This removes duplicates because array keys must be unique 
    arr[$NF][$16]++ 
} 
END { 
    # iterate over the 201 status items and count them 
    for (item in arr[201]) { 
     count++ 
    } 
    print "No. of partnumbers with Status 201: " count 

    # iterate over the status array 
    for (status in arr) { 
     # skip 201 status 
     if (status == 201) 
      continue 
     # join the array by ", " for printing
     # taken from http://stackoverflow.com/a/13648609/1032785
     joined = sep = ""
     for (product in arr[status]) {
      joined = joined sep product
      sep = ", "
     } 

     print "Partnumbers with Status " status ": " joined 
    } 
} 
' foo.log 

With the sample log file, to which I added a few extra lines, this produces the following output:

No. of partnumbers with Status 201: 1 
Partnumbers with Status 401: 17623039 
Partnumbers with Status 500: 17696532, 17696539 
+1

You should mention that GNU awk is required for true multidimensional arrays.

+0

@jordanm Just wondering, what is NF in $NF?

+0

@jordanm, if I have 2016-06-11 07:34:01.542 98459 [Thread-23-Job-parser-bolt] INFO JobLoader c.t.commerce.common.utils.APIUtils - Enrichment data updated successful for partnumber : 17696532 with status : 503 Service Unavailable, can I still use $NF, or do I have to use $20?
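To illustrate the two questions above: `NF` is awk's built-in count of fields on the current line, so `$NF` is the last field. A small sketch, using the line quoted in the comment, shows what each expression yields when a reason phrase follows the status code. `show_fields` and `status_after_keyword` are hypothetical helper names introduced only for this illustration.

```shell
# Log line with a reason phrase after the status code (from the comment)
line='2016-06-11 07:34:01.542 98459 [Thread-23-Job-parser-bolt] INFO JobLoader c.t.commerce.common.utils.APIUtils - Enrichment data updated successful for partnumber : 17696532 with status : 503 Service Unavailable'

# Print the field count, the last field, and field 20
show_fields() {
    awk '{ print "NF=" NF; print "last=" $NF; print "status=" $20 }'
}

# Position-independent alternative: take the field after "status :"
status_after_keyword() {
    awk '{
        for (i = 1; i <= NF - 2; i++)
            if ($i == "status" && $(i + 1) == ":") { print $(i + 2); break }
    }'
}

printf '%s\n' "$line" | show_fields
printf '%s\n' "$line" | status_after_keyword
```

For this line `$NF` is the last word of the reason phrase rather than the code, so a fixed position such as `$20` (or a lookup relative to the `status :` tokens) seems the safer choice when trailing text can appear.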

0

Without awk, using datamash and pee (from moreutils):

echo -n "No. of partnumbers with Status 201: " ; \
grep "status : " file.log | pee \
    'grep ": 201" | datamash -W -s countunique 16' \
    'grep -v ": 201" | datamash -W -s -g20 unique 16 | \
     sed "s/^[0-9]*/Partnumbers with Status &:/;s/,/, /g"'

Output (using the sample data from the OP):

No. of partnumbers with Status 201: 1 
Partnumbers with Status 500: 17696532
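The question also asks for the non-201 part numbers to be stored in a separate file. A minimal sketch of one way to do that with plain awk follows. The helper name `split_by_status` and the `status_<code>.txt` file naming are assumptions made for illustration, and the field positions (16 and 20) again assume the layout of the sample lines.

```shell
# Hypothetical helper: writes one file per non-201 status into the
# directory given as $1, one distinct part number per line.
split_by_status() {
    outdir=$1
    awk -v dir="$outdir" '
    /Enrichment data updated successful for partnumber/ {
        status = $20; part = $16        # assumed field positions
        # ignore successes and (status, part) pairs already written
        if (status == 201 || ((status, part) in seen)) next
        seen[status, part] = 1
        print part > (dir "/status_" status ".txt")
    }'
}
```

A call such as `split_by_status ./monitor < file.log` would then leave files like `./monitor/status_500.txt` behind, assuming the `./monitor` directory already exists.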