2012-01-31 52 views
3

Relative record separator in Perl

I have data that looks like this:

id:40108689 -- 
chr22_scrambled_bysegments:10762459:F : chr22:17852459:F (1.0), 
id:40108116 -- 
chr22_scrambled_bysegments:25375481:F : chr22_scrambled_bysegments:25375481:F (1.0), 
chr22_scrambled_bysegments:25375481:F : chr22:19380919:F (1.0), 
id:1 -- 
chr22:21133765:F : chr22:21133765:F (0.0), 

So each record is separated by a line of the form id:[somenumber] --

What is a way to read this data so that we end up with a hash of arrays like this:

$VAR = { 'id:40108689' => ['chr22_scrambled_bysegments:10762459:F : chr22:17852459:F (1.0),'],

         'id:40108116' => ['chr22_scrambled_bysegments:25375481:F : chr22_scrambled_bysegments:25375481:F (1.0),',
                           'chr22_scrambled_bysegments:25375481:F : chr22:19380919:F (1.0),'],
         #...etc
       }

I tried to approach this with the input record separator, but I don't know how to generalize it:

{ 
    local $/ = " --\n"; # How to include variable content id:[number] ? 

    while ($content = <INFILE>) { 
     chomp $content; 
     print "$content\n" if $content; # Skip empty records 
    } 
} 

Answers

6
my $result = {}; 
my $last_id; 
while (my $line = <INFILE>) { 
    if ($line =~ /(id:\d+) --/) { 
     $last_id = $1; 
     next; 
    } 
    next unless $last_id; # Just in case the file doesn't start with an id line 

    push @{ $result->{$last_id} }, $line; 
} 

use Data::Dumper; 
print Dumper $result; 

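Use the normal record separator and read the file line by line.

Use $last_id to keep track of the most recent id line seen, and update it whenever another id line is encountered. Push each non-id line onto the array stored under the hash key of the last matched id line.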
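
For reference, here is a self-contained sketch of the same line-by-line approach that can be run without an input file. It makes a few assumptions that are not in the answer above: the sample data from the question is read through an in-memory filehandle, the helper name parse_records is hypothetical, and each line is trimmed so the stored strings carry no trailing newline or spaces.

```perl
use strict;
use warnings;

# Sample input copied from the question. Reading it through an in-memory
# filehandle is an assumption made so the sketch needs no external file.
my $data = <<'END';
id:40108689 --
chr22_scrambled_bysegments:10762459:F : chr22:17852459:F (1.0),
id:40108116 --
chr22_scrambled_bysegments:25375481:F : chr22_scrambled_bysegments:25375481:F (1.0),
chr22_scrambled_bysegments:25375481:F : chr22:19380919:F (1.0),
id:1 --
chr22:21133765:F : chr22:21133765:F (0.0),
END

# parse_records is a hypothetical helper name, not part of the original answer.
# It returns a reference to a hash mapping each id to an array of its lines.
sub parse_records {
    my ($fh) = @_;
    my (%result, $last_id);
    while (my $line = <$fh>) {
        $line =~ s/^\s+|\s+$//g;          # trim whitespace, including the newline
        next unless length $line;         # skip blank lines
        if ($line =~ /^(id:\d+) --$/) {   # an id line starts a new record
            $last_id = $1;
            $result{$last_id} = [];
            next;
        }
        next unless defined $last_id;     # ignore data before the first id line
        push @{ $result{$last_id} }, $line;
    }
    return \%result;
}

open my $fh, '<', \$data or die "Cannot open in-memory handle: $!";
my $records = parse_records($fh);
close $fh;

# Print a short summary of each record.
for my $id (sort keys %$records) {
    printf "%s has %d line(s)\n", $id, scalar @{ $records->{$id} };
}
```

Anchoring the pattern with ^ and $ after trimming keeps a data line that merely contains "id:123 --" somewhere in the middle from being mistaken for a record header. Whether that stricter match is wanted depends on the real input.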

+0

Thanks. But I think you need a small correction like this: 'if ($line !~ /id:\d+ --/) { push @{ $result->{$last_id} }, $line; }' – neversaint 2012-01-31 05:27:49

+1

Oops, good catch! I've corrected the code example. – 2012-01-31 05:32:58