How to capture optional URI parameters with a regular expression
The method below parses a log line from nginx:
import re

def test_parse_line2(self):
    groups = ['ip', 'timestamp', 'offset', 'command', 'path', 'protocol', 'status', 'bytes', 'client']
    line = '1.2.3.4 - - [22/Oct/2015:12:01:49 -0500] "GET /mypath/?param1=value1&param2=value2 HTTP/1.1" 200 51 "-" "SomeRandomClient"'
    # ip, timestamp and UTC offset, then the quoted request line
    pattern = r'(?P<ip>[^ ]+) - - \[(?P<timestamp>[^ ]+) (?P<offset>[-\+][0-9]{4})\] "' +\
        r'(?P<command>[A-Z]+) /(?P<path>[^ ]+) (?P<protocol>[^"]+)" (?P<status>[0-9]+) (?P<bytes>[0-9]+) (?:[^ ]+)'+\
        r' "(?P<client>[^"]+)'
    match = re.search(pattern, line)
    if match:
        # print every named group that was captured
        for group_name in groups:
            print(group_name, match.group(group_name))
Is there a way to modify it so that I capture the required path, mypath, and the optional parameters, param1=value1&param2=value2, separately?
This doesn't work when there are no parameters, after I changed it to \ – AlexC
Did it not work? With \ ?? – AlexC
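One possible way to get the path and the optional parameters as separate groups is sketched below. It is not taken from the thread: the `params` group name, the `LOG_PATTERN` constant, and the `parse_line` helper are illustrative assumptions. The idea is to stop the `path` group at the first `?` and to wrap the query string in an optional non-capturing group, so a line without parameters still matches.

```python
import re

# Assumed variant of the pattern from the question. The request target is
# split into a required `path` group and an optional `params` group.
LOG_PATTERN = re.compile(
    r'(?P<ip>[^ ]+) - - \[(?P<timestamp>[^ ]+) (?P<offset>[-+][0-9]{4})\] "'
    # path stops at a space or '?'; the '?...' part may be absent entirely
    r'(?P<command>[A-Z]+) /(?P<path>[^ ?]+)(?:\?(?P<params>[^ ]*))? (?P<protocol>[^"]+)" '
    r'(?P<status>[0-9]+) (?P<bytes>[0-9]+) (?:[^ ]+)'
    r' "(?P<client>[^"]+)'
)


def parse_line(line):
    """Return a dict of the named groups, or None if the line does not match."""
    match = LOG_PATTERN.search(line)
    return match.groupdict() if match else None


with_params = parse_line(
    '1.2.3.4 - - [22/Oct/2015:12:01:49 -0500] '
    '"GET /mypath/?param1=value1&param2=value2 HTTP/1.1" 200 51 "-" "SomeRandomClient"'
)
without_params = parse_line(
    '1.2.3.4 - - [22/Oct/2015:12:01:49 -0500] '
    '"GET /mypath/ HTTP/1.1" 200 51 "-" "SomeRandomClient"'
)

print(with_params['path'], with_params['params'])
print(without_params['path'], without_params['params'])
```

In this sketch `path` keeps the trailing slash (`mypath/`), so `path.rstrip('/')` would be needed if only `mypath` is wanted. When the line has no query string, `params` is `None` rather than an empty string.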