2017-07-26 84 views
2

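Perl regex: print the matched conditional regex

I am trying to extract some patterns from a log file, but I am unable to print them properly.

Examples of log strings: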

1) sequence_history/buckets/FPJ.INV_DOM_16_PRD.47269.2644?startid=2644000&endid=2644666 

2) sequence_history/buckets/FPJ.INV_DOM_16_PRD.41987.9616 

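I want to extract three things:

A = "FPJ.INV_DOM_16_PRD", B = "47269", C = 9616 or 2644666 (if the line has endid, then C = 2644666; otherwise it is 9616)

A log line can be of either type 1 or type 2. I am able to extract A and B, but I am stuck on C because it needs a conditional, and I cannot extract it properly. Here is my code: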

my $string='/sequence_history/buckets/FPJ.INV_DOM_16_PRD.47269.2644?startid=2644000&endid=2644666'; 

if ($string =~ /sequence_history\/buckets\/(.*)/){
    my $line = $1;
    print "$line\n";
    if($line =~ /(FPJ.*PRD)\.(\d*)\./){
        my $topic_type_string = $1;
        my $topic_id = $2;
        print "$1\n$2\n";
    }
}
if($string =~ /(?(?=endid=)\d*$)/){
    # how to print match pattern here?
    print "match\n";
}

Thanks in advance!

+0

Something like this https://regex101.com/r/T6QDMh/1/? – revo

Answers

2

This will do the job:

use Modern::Perl; 
use Data::Dumper; 

my $re = qr/(FPJ.+?PRD)\.(\d+)\..*?(\d+)$/; 
while(<DATA>) { 
    chomp; 
    my (@l) = $_ =~ /$re/g; 
    say Dumper\@l; 
} 

__DATA__ 
sequence_history/buckets/FPJ.INV_DOM_16_PRD.47269.2644?startid=2644000&endid=2644666 
sequence_history/buckets/FPJ.INV_DOM_16_PRD.41987.9616 

Output:

$VAR1 = [ 
      'FPJ.INV_DOM_16_PRD', 
      '47269', 
      '2644666' 
     ]; 

$VAR1 = [ 
      'FPJ.INV_DOM_16_PRD', 
      '41987', 
      '9616' 
     ]; 

Explanation:

(  : start group 1 
    FPJ : literally FPJ 
    .+? : 1 or more any character but newline, not greedy 
    PRD : literally PRD 
)  : end group 1 
\.  : a dot 
(  : start group 2 
    \d+ : 1 or more digit 
)  : end group 2 
\.  : a dot 
.*?  : 0 or more any character not greedy 
(  : start group 3 
    \d+ : 1 or more digit 
)  : end group 3 
$  : end of string 
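
As an illustration of the breakdown above, here is a minimal sketch of the same pattern written with the /x flag, so that the comments sit inside the regex itself. The capture names (topic, id, last) and the helper parse_line are made up for this example and are not part of the original answer; named captures need Perl 5.10 or later.

```perl
use strict;
use warnings;

# Same pattern as in the answer, spread out with /x.
# The names topic, id and last are hypothetical labels for groups 1 to 3.
my $re = qr/
    (?<topic> FPJ .+? PRD )   # group 1: FPJ, then anything (non-greedy), then PRD
    \.                        # a dot
    (?<id> \d+ )              # group 2: one or more digits
    \.                        # a dot
    .*?                       # zero or more characters, non-greedy
    (?<last> \d+ )            # group 3: one or more digits
    $                         # end of string
/x;

# Hypothetical helper: returns (topic, id, last), or an empty list on no match.
sub parse_line {
    my ($line) = @_;
    chomp $line;
    return unless $line =~ $re;
    return ($+{topic}, $+{id}, $+{last});
}

for my $line (
    'sequence_history/buckets/FPJ.INV_DOM_16_PRD.47269.2644?startid=2644000&endid=2644666',
    'sequence_history/buckets/FPJ.INV_DOM_16_PRD.41987.9616',
) {
    my ($topic, $id, $last) = parse_line($line) or next;
    print "$topic $id $last\n";
}
```

Because the last group is anchored with $, it picks up the endid value when a query string is present and the final path component otherwise, which is the conditional behaviour asked for in the question.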
+0

Thanks. Works perfectly. Also, thanks for the explanation of the regex. –

+0

@PushpinderSingh: You're welcome, glad to help. Feel free to mark the answer as accepted, see: https://stackoverflow.com/help/someone-answers – Toto

0

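If you want to fetch certain entries from a log file, you can use file handles in Perl. In the code below, I fetch the entries from a log file named test.log.

The log entries are as follows.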

sequence_history/buckets/FPJ.INV_DOM_16_PRD.47269.2644?startid=2644000&endid=2644666 
sequence_history/buckets/FPJ.INV_DOM_16_PRD.41987.9616 
sequence_history/buckets/FPJ.INV_DOM_16_PRD.47269.69886?startid=2644000&endid=26765849 
sequence_history/buckets/FPJ.INV_DOM_16_PRD.47269.24465?startid=2644000&endid=836783741 

Below is the Perl script that fetches the required data.

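#!/usr/bin/perl

use strict;
use warnings;

# Open the log file for reading; stop with the system error if that fails.
open (FH, "test.log") || die "Not able to open test.log $!";

my ($a,$b,$c);
while (my $line=<FH>)
{

     # Type 1: the line has endid, so $c comes from the query string.
     if ($line =~ /sequence_history\/buckets\/.*endid=(\d*)/)
     {
       $c= $1;
       if ($line =~ /(FPJ.*PRD)\.(\d*)\.(\d*)\?/)
       {
         $a=$1;
         $b=$2;
       }
     }
     # Type 2: no endid, so $c is the last number in the path.
     else
     {
       if ($line =~ /sequence_history\/buckets\/(FPJ.*PRD)\.(\d*)\.(\d*)/)
       {
         $a=$1;
         $b=$2;
         $c=$3;
       }
     }

print "\n \$a=$a\n \$b=$b\n \$c=$c \n";
}

close (FH);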

Output:

$a=FPJ.INV_DOM_16_PRD 
$b=47269 
$c=2644666 

$a=FPJ.INV_DOM_16_PRD 
$b=41987 
$c=9616 

$a=FPJ.INV_DOM_16_PRD 
$b=47269 
$c=26765849 

$a=FPJ.INV_DOM_16_PRD 
$b=47269 
$c=836783741 

To fetch the data from your own log, use the code above and replace "test.log" with your log file name (along with its path), as shown below.

open (FH, "/path/to/log/file/test.log") || die "Not able to open test.log $!";
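
For comparison, the two branches can probably be folded into a single match with an optional endid part. This is only a sketch under a few assumptions: the helper name extract_fields is hypothetical, the defined-or operator // needs Perl 5.10 or later, and the pattern assumes that endid, when present, appears after a "?" in the same line.

```perl
use strict;
use warnings;

# Hypothetical helper: returns (A, B, C) for one log line, or an empty list.
sub extract_fields {
    my ($line) = @_;
    return unless $line =~ m{
        sequence_history/buckets/
        (FPJ.*?PRD) \. (\d+) \. (\d+)     # A, B, and the fallback value for C
        (?: \? .* \b endid= (\d+) )?      # optional query string carrying endid
    }x;
    my ($topic, $id, $fallback, $endid) = ($1, $2, $3, $4);
    # Use endid when it was captured, otherwise the last number of the path.
    return ($topic, $id, $endid // $fallback);
}

for my $line (
    'sequence_history/buckets/FPJ.INV_DOM_16_PRD.47269.2644?startid=2644000&endid=2644666',
    'sequence_history/buckets/FPJ.INV_DOM_16_PRD.41987.9616',
) {
    my ($topic, $id, $c_val) = extract_fields($line) or next;
    print "A=$topic B=$id C=$c_val\n";
}
```

Returning an empty list for lines that do not match avoids printing values left over from a previous line.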