Pythonic script to ignore timestamps in log files
There are 2 log files: log A and log B.
log A
2015-07-12 08:50:33,904 [Collection-3] INFO app - Executing Scheduled job: System: choppa1
2015-07-12 09:56:45,060 [Collection-3] INFO app - Executing Scheduled job: System: choppa1
2015-07-12 10:00:00,001 [Analytics_Worker-1] INFO app - Trigger for job AnBuildAuthorizationJob was fired.
2015-07-12 11:00:00,007 [Analytics_Worker-1] INFO app - Starting the AnBuildAuthorizationJob job.
log B
2014-07-12 09:50:33,904 [Collection-3] INFO app - Executing Scheduled job: System: choppa1
2014-07-12 09:56:45,060 [Collection-3] INFO app - Executing Scheduled job: System: choppa1
2014-07-12 10:00:00,001 [Analytics_Worker-1] INFO app - Trigger for job AnBuildAuthorizationJob was fired.
2014-07-12 10:00:00,007 [Analytics_Worker-1] INFO app - Starting the AnBuildAuthorizationJob job.
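The 2 log files have the same content, but the timestamps differ. I need to compare the two files while ignoring the timestamps: every line of one file should be compared with the corresponding line of the other, and no difference should be reported when only the timestamps differ. I wrote the following Python script for this: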
#!/usr/bin/python
import re
import difflib

# Read the first log file.
program = open("log1.txt", "r")
program_contents = program.readlines()
program.close()

# Keep only the lines that do not start with a digit.
new_contents = []
pat = re.compile("^[^0-9]")
for line in program_contents:
    if re.search(pat, line):
        new_contents.append(line)

# Read and filter the second log file the same way.
program = open("log2.txt", "r")
program_contents1 = program.readlines()
program.close()
new_contents1 = []
pat = re.compile("^[^0-9]")
for line in program_contents1:
    if re.search(pat, line):
        new_contents1.append(line)

# Print a line-by-line diff of the filtered contents.
diff = difflib.ndiff(new_contents, new_contents1)
print(''.join(diff))
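Is there a more efficient way to write the script above? Also, the script above only works when the timestamp is at the start of the line. I want to write a Python script that works even if the timestamp is somewhere in the middle of the line. Can anyone please help me with how to do this?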
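"I want to write a Python script that works even if the timestamp is somewhere in the middle of the line." That is quite a leap in requirements. What can we assume in that case? Does the timestamp have a fixed format? Is there anything else in the text that could look like a timestamp? – nhahtdh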
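
One possible approach, sketched under the assumption that every timestamp has the fixed layout seen in the sample lines (`YYYY-MM-DD HH:MM:SS,mmm`) and that nothing else in the text has that shape: remove every timestamp-shaped substring from each line, wherever it occurs, and diff what is left. The names `TIMESTAMP`, `strip_timestamps`, `diff_ignoring_timestamps` and `diff_files` are made up for this illustration and do not come from the question.

```python
import difflib
import re

# Assumption: timestamps always look like "2015-07-12 08:50:33,904".
# Spaces or tabs right after the timestamp are removed too; the newline is kept.
TIMESTAMP = re.compile(r"\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}[ \t]*")


def strip_timestamps(lines):
    """Return the lines with every timestamp-shaped substring removed."""
    return [TIMESTAMP.sub("", line) for line in lines]


def diff_ignoring_timestamps(lines_a, lines_b):
    """Diff two lists of log lines after removing the timestamps.

    An empty list means the logs differ in nothing but timestamps.
    """
    return list(
        difflib.unified_diff(
            strip_timestamps(lines_a), strip_timestamps(lines_b), lineterm=""
        )
    )


def diff_files(path_a, path_b):
    """Convenience wrapper that reads both files and diffs them."""
    with open(path_a) as file_a, open(path_b) as file_b:
        return diff_ignoring_timestamps(file_a.readlines(), file_b.readlines())
```

With the file names from the question, `diff_files("log1.txt", "log2.txt")` would return an empty list when only the timestamps differ. If the real logs use another timestamp format, or contain other text with the same shape, the pattern has to be adjusted accordingly.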