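Running a script across multiple directories with multiple output files in Perl (problem comparing hash key values)

I have a script that looks like the one below. I want it to search the current directory, open every directory in it, open every file in those directories that matches certain regexes (FASTQ files, in which each record spans four lines), do some processing on those files, and write some results to a file in each directory. (Note: the actual script does much more than this, but I think I have a structural problem with the folder iteration, because a simplified version of the script works when run in a single folder, so I am posting the simplified version here.)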
#!user/local/perl
#Created by C. Pells, M. R. Snyder, and N. T. Marshall 2017
#Script trims and merges high throughput sequencing reads from fastq files for a specific primer set
use Cwd;
use warnings;
my $StartTime= localtime;
my $MasterDir = getcwd; #obtains a full path to the current directory
opendir (DIR, $MasterDir);
my @objects = readdir (DIR);
closedir (DIR);
foreach (@objects){
    print $_,"\n";
}
my @Dirs =();
foreach my $O (0..$#objects){
    my $CurrDir = "";
    if ((length ($objects[$O]) < 7) && ($O>1)){ #Checking if the length of the object name is < 7 characters. All samples are 6 or less. removing the first two elements: "." and ".."
        $CurrDir = $MasterDir."/".$objects[$O]; #appends directory name to full path
        push (@Dirs, $CurrDir);
    }
}
foreach (@Dirs){
    print $_,"\n";#checks that all directories were read in
}
foreach my $S (0..$#Dirs){
    my @files =();
    opendir (DIR, $Dirs[$S]) || die "cannot open $Dirs[$S]: $!";
    @files = readdir DIR; #reads in all files in a directory
    closedir DIR;
    my @AbsFiles =();
    foreach my $F (0..$#files){
        my $AbsFileName = $Dirs[$S]."/".$files[$F]; #appends file name to full path
        push (@AbsFiles, $AbsFileName);
    }
    foreach my $AF (0..$#AbsFiles){
        if ($AbsFiles[$AF] =~ /_R2_001\.fastq$/m){ #finds reverse fastq file
            my @readbuffer=();
            #read in reverse fastq
            my %RSeqHash;
            my $c = 0;
            print "Reading, reversing, complimenting, and trimming reverse fastq file $AbsFiles[$AF]\n";
            open (INPUT1, $AbsFiles[$AF]) || die "Can't open file: $!\n";
            while (<INPUT1>){
                chomp ($_);
                push(@readbuffer, $_);
                if (@readbuffer == 4) {
                    $rsn = substr($readbuffer[0], 0, 45); #trims reverse seq name
                    $cc++ % 10000 == 0 and print "$rsn\n";
                    $RSeqHash{$rsn} = $readbuffer[1];
                    @readbuffer =();
                }
            }
        }
    }
    foreach my $AFx (0..$#AbsFiles){
        if ($AbsFiles[$AFx] =~ /_R1_001\.fastq$/m){ #finds forward fastq file
            print "Reading forward fastq file $AbsFiles[$AFx]\n";
            open (INPUT2, $AbsFiles[$AFx]) || die "Can't open file: $!\n";
            my $OutMergeName = $Dirs[$S]."/"."Merged.fasta";
            open (OUT, ">", "$OutMergeName");
            my $cc=0;
            my @readbuffer =();
            while (<INPUT2>){
                chomp ($_);
                push(@readbuffer, $_);
                if (@readbuffer == 4) {
                    my $fsn = substr($readbuffer[0], 0, 45); #trims forward seq name
                    #$cc++ % 10000 == 0 and print "$fsn\n$readbuffer[1]\n";
                    if (exists($RSeqHash{$fsn})){ #checks to see if forward seq name is present in reverse seq hash
                        print "$fsn was found in Reverse Seq Hash\n";
                        print OUT "$fsn\n$readbuffer[1]\n";
                    }
                    else {
                        $cc++ % 10000 == 0 and print "$fsn not found in Reverse Seq Hash\n";
                    }
                    @readbuffer =();
                }
            }
            close INPUT1;
            close INPUT2;
            close OUT;
        }
    }
}
my $EndTime= localtime;
print "Script began at\t$StartTime\nCompleted at\t$EndTime\n";
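Again, I know the script works when it is not iterating over folders. With this version, though, I just get empty output files. From the print statements I inserted into the script, I have determined that Perl cannot find the variable $fsn from INPUT2 as a key in the hash. I don't understand why, because each file is there and the script works when I don't iterate over folders, so I know the keys match. So either I'm missing something simple, or this is some limitation of Perl's memory that I've run into. Any help is appreciated!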
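One thing worth checking before suspecting memory is variable scope. In the script as posted, `my %RSeqHash` is declared inside the `if` block that reads the reverse file, so that lexical hash goes away when the block closes. The later `exists($RSeqHash{$fsn})` sits outside that block, and without `use strict` it quietly refers to a different, empty package variable with the same name. The following is a minimal sketch of that effect, not the original script; `%inner_hash`, `%outer_hash`, and the read names are hypothetical stand-ins.

```perl
#!/usr/bin/perl
use warnings;
no warnings 'once';   # silence the "used only once" warning for this demo

# Hypothetical stand-ins for read names from the reverse and forward files.
my @reverse_names = ('read1', 'read2');
my @forward_names = ('read1', 'read3');

# Broken pattern: the hash is declared with "my" inside the block that fills it.
foreach my $name (@reverse_names) {
    my %inner_hash;                # a fresh, empty lexical on every iteration
    $inner_hash{$name} = 1;
}                                  # the lexical %inner_hash no longer exists here

# Without "use strict", this %inner_hash is the package variable
# %main::inner_hash, which was never filled.
my $found_broken = grep { exists $inner_hash{$_} } @forward_names;

# Working pattern: declare the hash in the scope shared by both loops.
my %outer_hash;
foreach my $name (@reverse_names) {
    $outer_hash{$name} = 1;
}
my $found_fixed = grep { exists $outer_hash{$_} } @forward_names;

print "broken: $found_broken match(es), fixed: $found_fixed match(es)\n";
```

With `use strict` at the top, the broken lookup would fail at compile time with a "Global symbol ... requires explicit package name" error instead of silently finding nothing.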
'push my @AbsDirs, ...;' makes no sense, because 'my @AbsDirs' creates a new variable. It should simply be 'push @AbsDirs, ...;' – ikegami
'$AbsDirs[$a].$files[$b]' should be '"$AbsDirs[$a]/$files[$b]"' – ikegami
Tip: Don't use global variables. Replace 'open INPUT1, ...' with 'open my $INPUT1, ...' – ikegami
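Two of the suggestions in these comments (lexical filehandles, and an explicit "/" between directory and file name) can be sketched as follows. This is only an illustration under stated assumptions: the helper `read_fastq`, the file name, and the sample reads are made up for the demo and do not come from the original script.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Hypothetical helper: read a fastq file into a hash of name => sequence.
sub read_fastq {
    my ($path) = @_;
    my %seq_for;
    # Lexical filehandle with three-argument open; "or" binds loosely
    # enough to apply to the whole open call.
    open my $in, '<', $path or die "Can't open $path: $!";
    my @record;
    while (my $line = <$in>) {
        chomp $line;
        push @record, $line;
        if (@record == 4) {                 # one fastq record spans four lines
            $seq_for{ $record[0] } = $record[1];
            @record = ();
        }
    }
    close $in;
    return \%seq_for;
}

# Demo in a throwaway directory with made-up data.
my $dir  = tempdir(CLEANUP => 1);
my $file = 'sample_R2_001.fastq';
my $path = "$dir/$file";                    # explicit "/" between directory and file

open my $out, '>', $path or die "Can't write $path: $!";
print {$out} "\@read1\nACGT\n+\nIIII\n\@read2\nTTGA\n+\nIIII\n";
close $out;

my $seqs = read_fastq($path);
print scalar(keys %$seqs), " records read\n";
```

Because `$in` and `$out` are lexical, each handle is limited to the block that declares it, so a handle left over from one directory cannot be picked up by accident while processing the next.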