Unix: joining more than two files

I have three files, each with an ID and a value.

~/test$ ls
a.txt b.txt c.txt
~/test$ cat a.txt
id1 1
id2 2
id3 3
~/test$ cat b.txt
id1 4
id2 5
id3 6
~/test$ cat c.txt
id1 7
id2 8
id3 9

I want to create a file that looks like this...

id1 1 4 7 
id2 2 5 8 
id3 3 6 9

...preferably using a single command.

I know about the join and paste commands. paste repeats the ID column for every file:

~/test$ paste a.txt b.txt c.txt
id1 1 id1 4 id1 7 
id2 2 id2 5 id2 8 
id3 3 id3 6 id3 9

join works well, but only on two files at a time:

~/test$ join a.txt b.txt
id1 1 4
id2 2 5
id3 3 6
~/test$ join a.txt b.txt c.txt
join: extra operand `c.txt'
Try `join --help' for more information.

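I also know that paste can take STDIN as one of its inputs by using the "-" argument. For example, I can replicate the join command with the following: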

~/test$ cut -f2 b.txt | paste a.txt -
id1 1 4 
id2 2 5 
id3 3 6

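But I'm still not sure how to modify this to accommodate three files.

Since I'm doing this inside a perl script, I know I could put it in a foreach loop, something like join file1 file2 > tmp1, join tmp1 file3 > tmp2, and so on. But that gets messy, and I would like to do this with a one-liner.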

Source

2012-02-09 Stephen Turner

I also know this would be a piece of cake with an SQL inner join, but I don't want to load all of this into a database first. – 2012-02-09 14:46:43

join a.txt b.txt|join - c.txt

Since you are doing this inside a Perl script, that should be enough.

Source

2012-02-09 14:49:22

Or: 'join <(join a.txt b.txt) c.txt' – jts 2012-02-09 15:31:14

This is nice. join a b | join - c | join - etc. This is easier to script than the <(join) version, though that works too. Thanks! – 2012-02-09 18:52:01
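
When the number of files varies, the chained pipeline can be built by a small recursive shell function instead of being typed out by hand. The following is only a sketch: join_all and join_stdin_with are hypothetical names, and it assumes every file is already sorted on its first field, which join requires.

```shell
# join_all (hypothetical helper): join any number of files on their first field.
# Assumption: every file is already sorted on that field, as join(1) requires.
join_all() {
    if [ "$#" -eq 0 ]; then
        echo "usage: join_all FILE [FILE...]" >&2
        return 1
    fi
    first=$1
    shift
    join_stdin_with "$@" < "$first"
}

# join_stdin_with (hypothetical helper): join standard input with each
# remaining file in turn, i.e. build "join - f1 | join - f2 | ..." recursively.
join_stdin_with() {
    if [ "$#" -eq 0 ]; then
        cat                     # no files left: pass the result through
    else
        next=$1
        shift
        join - "$next" | join_stdin_with "$@"
    fi
}

# Demo with the three sample files from the question, written to a temp dir.
demo_dir=$(mktemp -d)
printf 'id1 1\nid2 2\nid3 3\n' > "$demo_dir/a.txt"
printf 'id1 4\nid2 5\nid3 6\n' > "$demo_dir/b.txt"
printf 'id1 7\nid2 8\nid3 9\n' > "$demo_dir/c.txt"
join_all "$demo_dir/a.txt" "$demo_dir/b.txt" "$demo_dir/c.txt"
# id1 1 4 7
# id2 2 5 8
# id3 3 6 9
rm -rf "$demo_dir"
```

From Perl this could be called through backticks or system once the function is defined in the shell being spawned. Like join itself, it only keeps IDs that appear in every file.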

Is there any specific reason you aren't doing this work in Perl itself instead of spawning a shell?

Something like this (untested, caveat emptor!):

use strict;
use warnings;
use File::Slurp; # Slurp the files in if they aren't too big
my @files = qw(a.txt b.txt c.txt);
# Map each file name to an array ref holding its lines
my %file_data = map { ($_ => [ read_file($_) ]) } @files;
my @id_orders; # IDs in the order they appear in the first file
my %data = ();
my $first_file = 1;
foreach my $file (@files) {
    foreach my $line (@{ $file_data{$file} }) {
        my ($id, $value) = split(/\s+/, $line);
        push @id_orders, $id if $first_file;
        $data{$id} ||= [];
        push @{ $data{$id} }, $value;
    }
    $first_file = 0;
}
# Print each ID followed by its values from every file
foreach my $id (@id_orders) {
    print "$id " . join(" ", @{ $data{$id} }) . "\n";
}

Source

2012-02-09 14:50:38 DVK

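This is something I'd like to be able to do on the command line. I'm basically using perl to glue together some other programs and scripts written by other people (python, C++, etc.). a.txt, b.txt, etc. are output from a python script, and I now need to mash them together and then import them into a statistics program. – 2012-02-09 15:03:48

@StephenTurner - as long as you don't mind paying the (not too large) penalty/cost of spawning a shell process, sure. – DVK 2012-02-09 15:53:08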

perl -lanE'$h{$F[0]} .= " $F[1]"; END{say $_.$h{$_} foreach keys %h}' *.txt

This should work, but I can't test it because I'm answering from my phone. If you put sort between foreach and keys, the output is sorted as well.

Source

2012-02-09 16:33:13

pr -m -t -s' ' file1.txt file2.txt | gawk '{print $1"\t"$2"\t"$3"\t"$4}' > finalfile.txt

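Assume file1 and file2 each have 2 columns. Then $1 and $2 are the columns from file1, and $3 and $4 are the columns from file2.

You can print any column from each file this way, and it takes any number of files as input. For example, if your file1 has 5 columns, then $6 would be the first column of file2.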
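
Applied to the three files from the question, the same idea could be sketched as below. merge_columns is a hypothetical name, plain awk stands in for gawk, and the sketch assumes every file has exactly two columns and lists the same IDs in the same order, because pr -m pairs lines by position rather than by key.

```shell
# merge_columns (hypothetical helper): place the files side by side with
# pr -m, then keep the first ID column and every value column (even fields).
merge_columns() {
    pr -m -t -s' ' "$@" |
        awk '{ out = $1; for (i = 2; i <= NF; i += 2) out = out " " $i; print out }'
}

# Demo with the three sample files from the question, written to a temp dir.
demo_dir=$(mktemp -d)
printf 'id1 1\nid2 2\nid3 3\n' > "$demo_dir/a.txt"
printf 'id1 4\nid2 5\nid3 6\n' > "$demo_dir/b.txt"
printf 'id1 7\nid2 8\nid3 9\n' > "$demo_dir/c.txt"
merge_columns "$demo_dir/a.txt" "$demo_dir/b.txt" "$demo_dir/c.txt"
rm -rf "$demo_dir"
```

If the files might list IDs in different orders, a key-based tool such as join is the safer choice.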

Source

2013-04-16 09:03:04 Chaiwala
