Combine output by field using awk

I have a file in the format below, created using awk:

file
chr2:46603668-46603902 EPAS1-902|gc=54.3 234 bases with an average of 253.1
chr2:211471445-211471675 CPS1-1205|gc=48.3 230 bases with an average of 264.7
chr19:15291762-15291983 NOTCH3-1003|gc=68.8 221 bases with an average of 195.8
chr2:211460199-211460318 CPS1-1200|gc=41.2 119 bases with an average of 105.6
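What I am trying to do is combine all lines with a matching $2, line by line, after stripping off the "-" suffix (for example, CPS1-1205 becomes CPS1). Every line in the file will have a match, although these are not shown in the example. Thank you :).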
Desired output
chr2:211471445-211471675 CPS1|gc=48.3 230 bases with an average of 264.7
chr2:211460199-211460318 CPS1|gc=41.2 119 bases with an average of 105.6
chr2:46603668-46603902 EPAS1-902|gc=54.3 234 bases with an average of 253.1
chr19:15291762-15291983 NOTCH3-1003|gc=68.8 221 bases with an average of 195.8
I tried:
awk '{k=$1 FS $2; a[k]+=split[$2] "-"; c[k]++}
END{for(k in a)
{split(k,ks,FS);
print ks[1],c[k],ks[2],a[k]/c[k]}}' file > output.txt
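
One way the attempt above might be reworked, as a minimal sketch rather than a definitive answer. It assumes the goal is to print lines that share a gene name (the part of $2 before the "-digits" suffix) next to each other, and to drop the suffix only where a name occurs more than once, as in the desired output. The function name group_by_gene is hypothetical, and groups are printed in order of first appearance, so the group order may differ from the desired output shown.

```shell
# Hypothetical helper: group lines by the gene name in field 2.
group_by_gene() {
  awk '
    {
      key = $2
      sub(/-[0-9]+[|].*$/, "", key)         # "CPS1-1205|gc=48.3" -> "CPS1"
      if (!(key in count)) order[++n] = key # remember first-seen order
      lines[key, ++count[key]] = $0         # store every line under its name
    }
    END {
      for (i = 1; i <= n; i++) {
        key = order[i]
        for (j = 1; j <= count[key]; j++) {
          line = lines[key, j]
          # strip the "-digits" suffix only when the name has a match
          if (count[key] > 1) sub(/-[0-9]+[|]/, "|", line)
          print line
        }
      }
    }
  ' "$@"
}
```

It could then be called as: group_by_gene file > output.txt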
If "every line will have a match", why not just strip the "-[digits]" from the second field?
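
The suggestion in the comment above could look roughly like this. It is a sketch under the assumption that the suffix always has the form "-digits" immediately before the "|" in the second field, and the function name strip_suffix is hypothetical. Because assigning to $2 makes awk rebuild the line with single spaces, it also assumes the input is separated by single spaces, as in the sample.

```shell
# Hypothetical helper: drop the "-digits" suffix before the "|" in field 2.
strip_suffix() {
  awk '{ sub(/-[0-9]+[|]/, "|", $2); print }' "$@"
}
```

It could then be called as: strip_suffix file > output.txt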