2016-02-15 114 views

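I have a program that processes log files containing text like the sample below. Please help me understand how to print the list of components (the field that comes after the date and time), ordered by the importance of their messages in the log (the first word of each line). Sorting strings alphabetically.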

For example, component A should come before component B in the list if component A has more messages at the most important level.

ERROR - 2015 Dec 28 14:48:30 - unfulminating_deacon - 55 - airtightly unintelligently appropriable arlen
INFO - 2015 Dec 28 02:02:56 - mangiest_ima - 144 - overrealistically decadently unfierce edris
CRITICAL - 2015 Dec 27 20:04:02 - unanticipated_konnor - 44 - amusively sensationally turbanlike rico
INFO - 2015 Dec 28 08:12:06 - unfulminating_deacon - 123 - eruptively nonmodally sebacic shavonda
CRITICAL - 2015 Dec 28 08:04:27 - unanticipated_konnor - 1213 - unchastely priorly monophyletic cullen
ERROR - 2015 Dec 28 07:39:36 - furnacelike_marlene - 1414 - healthfully flinchingly unbombastic slyvia
DEBUG - 2015 Dec 27 16:44:47 - mangiest_ima - 144 - questingly substitutionally uncompensative jen
ERROR - 2015 Dec 26 17:49:26 - furnacelike_marlene - 1414 - healthfully flinchingly unbombastic slyvia

EXPECTED OUTPUT: 
unanticipated_konnor 
furnacelike_marlene 
unfulminating_deacon 
mangiest_ima 

I have already written some code that counts how often each component logs a message, but I am not sure whether it helps:

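from collections import Counter

warnList = []
# 'with' closes the file automatically; read-only mode is enough here.
with open('C:\\Users\\User\\Downloads\\tasks\\logs\\1.txt', 'r') as log_file:
    for line in log_file:
        if line.strip():  # skip blank lines, which have no third field
            # The component name is the third ' - '-separated field.
            warnList.append(line.split(' - ')[2])
res1 = dict(Counter(warnList))
print("Frequency of messages for components: {} \n".format(res1))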
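One way the frequency count could help, as a minimal sketch: count messages per (component, level) pair instead of per component, so that the level information is kept for the later ranking. The names `count_levels` and `SAMPLE_LINES` are made up for illustration, and the sample holds only a few lines copied from the log above.

```python
from collections import Counter

# A few lines copied from the sample log in the question (hypothetical in-memory input).
SAMPLE_LINES = [
    "ERROR - 2015 Dec 28 14:48:30 - unfulminating_deacon - 55 - airtightly unintelligently appropriable arlen",
    "CRITICAL - 2015 Dec 27 20:04:02 - unanticipated_konnor - 44 - amusively sensationally turbanlike rico",
    "INFO - 2015 Dec 28 08:12:06 - unfulminating_deacon - 123 - eruptively nonmodally sebacic shavonda",
    "CRITICAL - 2015 Dec 28 08:04:27 - unanticipated_konnor - 1213 - unchastely priorly monophyletic cullen",
]


def count_levels(lines):
    """Count messages per (component, level) pair."""
    counts = Counter()
    for line in lines:
        fields = line.strip().split(' - ')
        if len(fields) < 3:
            continue  # skip blank or malformed lines
        level, component = fields[0], fields[2]
        counts[(component, level)] += 1
    return counts


level_counts = count_levels(SAMPLE_LINES)
print(level_counts[('unanticipated_konnor', 'CRITICAL')])  # prints 2
```

A `Counter` returns 0 for pairs it has never seen, so looking up a level that a component never logged does not raise an error.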

Any suggestion would be greatly appreciated.

I hope for your help or advice.

Thanks in advance,

Regards

Answer


I'm not quite sure I understood your question correctly, but if you want to sort the log file by importance, try this:

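from __future__ import print_function

import re
import operator
import collections

# Lower number = more important level.
importance = {
    'CRITICAL': 0,
    'ERROR': 100,
    'INFO': 200,
    'DEBUG': 300
}

with open('log.log', 'r') as f:
    # Strip indentation and skip blank lines so the level lookup below cannot fail on them.
    data = [line.strip() for line in f if line.strip()]

# Map each log line to the importance of its level, keeping file order.
parsed = collections.OrderedDict()

for line in data:
    cols = re.split(r'\s+-\s+', line)
    parsed[line] = importance[cols[0]]

# sorted() is stable, so lines of equal importance stay in file order.
for k, v in sorted(parsed.items(), key=operator.itemgetter(1)):
    print(k)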

Output:

CRITICAL - 2015 Dec 27 20:04:02 - unanticipated_konnor - 44 - amusively sensationally turbanlike rico 
CRITICAL - 2015 Dec 28 08:04:27 - unanticipated_konnor - 1213 - unchastely priorly monophyletic cullen 
ERROR - 2015 Dec 28 14:48:30 - unfulminating_deacon - 55 - airtightly unintelligently appropriable arlen 
ERROR - 2015 Dec 28 07:39:36 - furnacelike_marlene - 1414 - healthfully flinchingly unbombastic slyvia 
ERROR - 2015 Dec 26 17:49:26 - furnacelike_marlene - 1414 - healthfully flinchingly unbombastic slyvia 
INFO - 2015 Dec 28 02:02:56 - mangiest_ima - 144 - overrealistically decadently unfierce edris 
INFO - 2015 Dec 28 08:12:06 - unfulminating_deacon - 123 - eruptively nonmodally sebacic shavonda 
DEBUG - 2015 Dec 27 16:44:47 - mangiest_ima - 144 - questingly substitutionally uncompensative jen 

If this is not what you want, please clarify what you need.

If you only need the third column:

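from __future__ import print_function

import re
import operator
import collections

# Lower number = more important level.
importance = {
    'CRITICAL': 0,
    'ERROR': 100,
    'INFO': 200,
    'DEBUG': 300
}

with open('log.log', 'r') as f:
    # Strip indentation and skip blank lines so the level lookup below cannot fail on them.
    data = [line.strip() for line in f if line.strip()]

# Map each component (third column) to a level, in order of first appearance.
parsed = collections.OrderedDict()

for line in data:
    cols = re.split(r'\s+-\s+', line)
    # A component that appears on several lines keeps the level of its last line.
    parsed[cols[2]] = importance[cols[0]]

for k, v in sorted(parsed.items(), key=operator.itemgetter(1)):
    print(k)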

Output:

unanticipated_konnor 
furnacelike_marlene 
unfulminating_deacon 
mangiest_ima 
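Note that the loop above stores only the level from the last line seen for each component, so its result depends on the order of the lines in the file. If the goal is the rule from the question (more messages at the most important level comes first), one possible sketch is to count messages per level for each component and sort on those counts. The names `rank_components`, `LEVELS`, and `SAMPLE_LINES` are made up for illustration, and breaking ties alphabetically by component name is an assumption, not something the question states outright.

```python
from collections import defaultdict

# Most important level first; the order follows the importance table above.
LEVELS = ['CRITICAL', 'ERROR', 'INFO', 'DEBUG']


def rank_components(lines):
    """Return component names, those with more high-importance messages first."""
    counts = defaultdict(lambda: [0] * len(LEVELS))
    for line in lines:
        fields = line.strip().split(' - ')
        if len(fields) < 3 or fields[0] not in LEVELS:
            continue  # skip blank, malformed, or unknown-level lines
        counts[fields[2]][LEVELS.index(fields[0])] += 1
    # Negated counts make larger counts sort first; the name breaks ties
    # alphabetically (an assumption made for this sketch).
    return sorted(counts, key=lambda name: ([-n for n in counts[name]], name))


# The sample log from the question, held in memory instead of read from a file.
SAMPLE_LINES = [
    "ERROR - 2015 Dec 28 14:48:30 - unfulminating_deacon - 55 - airtightly unintelligently appropriable arlen",
    "INFO - 2015 Dec 28 02:02:56 - mangiest_ima - 144 - overrealistically decadently unfierce edris",
    "CRITICAL - 2015 Dec 27 20:04:02 - unanticipated_konnor - 44 - amusively sensationally turbanlike rico",
    "INFO - 2015 Dec 28 08:12:06 - unfulminating_deacon - 123 - eruptively nonmodally sebacic shavonda",
    "CRITICAL - 2015 Dec 28 08:04:27 - unanticipated_konnor - 1213 - unchastely priorly monophyletic cullen",
    "ERROR - 2015 Dec 28 07:39:36 - furnacelike_marlene - 1414 - healthfully flinchingly unbombastic slyvia",
    "DEBUG - 2015 Dec 27 16:44:47 - mangiest_ima - 144 - questingly substitutionally uncompensative jen",
    "ERROR - 2015 Dec 26 17:49:26 - furnacelike_marlene - 1414 - healthfully flinchingly unbombastic slyvia",
]

for component in rank_components(SAMPLE_LINES):
    print(component)
```

On the sample data this prints the same four names in the same order as the expected output, and the order stays the same if the input lines are shuffled, because the sort key depends only on the per-level counts and the component name.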

This is exactly what I needed. Thank you very much for your help! – Cerato
