2017-10-13 95 views
-1

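Extract the last element and keep the folder name

I have 40 folders, each with a unique name, and inside each folder is a file named summary.txt that looks like this: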

HISAT2 summary stats: 
Total reads: 36590175 
    Aligned 0 time: 1238197 (3.38%) 
    Aligned 1 time: 33866701 (92.56%) 
    Aligned >1 times: 1485277 (4.06%) 
Overall alignment rate: 96.62% 

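I want to create a new .txt file with one column of folder names and one column holding the overall alignment rate (the "96.62%" value), so that the final result looks like this: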

Folder name  alignment rate 
Sample1    96.62% 
Sample2    94.53% 
...     ... 
SampleN    96.22% 

Is there a way to do this from the command line? Maybe awk? Any help would be appreciated.

Harry

Answers

0

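Using the find command (the three-argument match() and \w below require GNU awk). Note that the second command must append with >>, otherwise it overwrites the header line:

$ echo -e "Folder name\talignment rate" > output.txt 

$ find . -iname "summary.txt" -exec awk '/^Overall/{ match(FILENAME,/\/(\w+)\//,a); print a[1]"\t\t"$NF}' {} \; >> output.txt 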

Output:

Folder name  alignment rate 
dir1   96.62% 
dir2   96.62% 
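
The command above relies on GNU awk, and \w+ would not match a folder name containing a hyphen. A minimal sketch of a variant that uses only POSIX awk features, assuming hypothetical demo folders created under hisat2_demo_find (the folder names and read counts are made up for illustration):

```shell
# Demo setup (hypothetical data): two sample folders with hyphenated names.
mkdir -p hisat2_demo_find/Sample-1 hisat2_demo_find/Sample-2
printf 'Total reads: 100\nOverall alignment rate: 96.62%%\n' > hisat2_demo_find/Sample-1/summary.txt
printf 'Total reads: 200\nOverall alignment rate: 94.53%%\n' > hisat2_demo_find/Sample-2/summary.txt

out=hisat2_demo_find/output.txt

# Write the header once, then append one row per summary.txt.
printf 'Folder name\talignment rate\n' > "$out"

# split() on "/" is POSIX awk; the element before the last is the parent folder.
# sort makes the row order independent of the traversal order used by find.
find hisat2_demo_find -name summary.txt -exec awk '
    /^Overall/ {
        n = split(FILENAME, parts, "/")
        print parts[n - 1] "\t" $NF
    }' {} + | sort >> "$out"

cat "$out"
```

Matching the line by pattern and printing $NF, rather than reading $4 in an END block, keeps the result correct even if a summary file ends with a blank line.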
0

An awk solution:

Preliminary step (write the header line to result.txt; type the header, then press Ctrl-D to end the input):

$ cat > result.txt 
Folder name  alignment rate 

awk '/^Overall/{ 
     printf "%-20s%s\n",substr(FILENAME,1,index(FILENAME, "/")-1), $NF >> "result.txt" 
    }' Sample*/summary.txt 

The contents of result.txt should look like this:

Folder name  alignment rate 
Sample1    96.62% 
Sample2    94.53% 
... 
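
The awk command above silently skips any folder whose summary.txt is missing or has no "Overall" line. A minimal sketch of a shell loop that writes NA in that case, assuming hypothetical demo folders under hisat2_demo_loop (Sample2 is deliberately left without a summary file):

```shell
# Demo setup (hypothetical data): Sample2 has no summary.txt on purpose.
mkdir -p hisat2_demo_loop/Sample1 hisat2_demo_loop/Sample2
printf 'Total reads: 100\nOverall alignment rate: 96.62%%\n' > hisat2_demo_loop/Sample1/summary.txt

out=hisat2_demo_loop/result.txt
printf '%-20s%s\n' 'Folder name' 'alignment rate' > "$out"

# The trailing slash in the glob restricts the loop to directories.
for dir in hisat2_demo_loop/*/; do
    name=$(basename "$dir")
    rate=''
    if [ -f "$dir/summary.txt" ]; then
        # Last field of the first line that starts with "Overall".
        rate=$(awk '/^Overall/ { print $NF; exit }' "$dir/summary.txt")
    fi
    # Fall back to NA when the file or the line is missing.
    printf '%-20s%s\n' "$name" "${rate:-NA}" >> "$out"
done

cat "$out"
```

With 40 folders, an explicit NA row makes it easier to spot a sample whose alignment did not finish than a row that is simply absent.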
0

A simple awk script:

$ awk -F': ' 'BEGIN { print "folder", "rate" } 
       /Overall/ { dir = FILENAME; sub("/.*", "", dir); print dir, $2 }' */summary.txt 
folder rate 
a 96.62% 
b 91.63% 
c 93.22% 
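
Since the question asks for a new .txt file, the same approach can be redirected to a file with tab-separated columns. A minimal sketch, assuming hypothetical demo folders under hisat2_demo_tsv whose summary lines end with a trailing space, as in the question:

```shell
# Demo setup (hypothetical data); note the trailing space after each percentage.
mkdir -p hisat2_demo_tsv/Sample1 hisat2_demo_tsv/Sample2
printf 'Total reads: 100\nOverall alignment rate: 96.62%% \n' > hisat2_demo_tsv/Sample1/summary.txt
printf 'Total reads: 200\nOverall alignment rate: 94.53%% \n' > hisat2_demo_tsv/Sample2/summary.txt

# Run from inside the parent folder so FILENAME starts with the sample folder name.
(
    cd hisat2_demo_tsv || exit 1
    awk -F': *' -v OFS='\t' '
        BEGIN { print "Folder name", "alignment rate" }
        /^Overall/ {
            dir = FILENAME
            sub("/.*", "", dir)        # keep only the folder part
            sub(/[ \t]+$/, "", $2)     # drop trailing whitespace from the rate
            print dir, $2
        }' */summary.txt > result.txt
)

cat hisat2_demo_tsv/result.txt
```

Copying FILENAME into a separate variable avoids assigning to a built-in variable, and setting OFS to a tab keeps the file easy to load into a spreadsheet or R.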