2015-11-25 47 views
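awk: How do I append the non-matching fields of other records to the current record?

I want to append the non-matching fields from other records to the current record.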

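The first field of each record is a group ID. Each person is matched with every person who does not share their group ID. All possible matches are needed.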

For example, suppose names.db contains:

1 Nikola Tesla 
1 Pierre-Simon Laplace 
1 Oliver Heaviside 
2 James Watson 
2 Francis Crick 
3 Kanye West 
4 Michael Faraday 
4 Lord Rayleigh 

becomes:

Nikola Tesla -> James Watson 
Nikola Tesla -> Francis Crick 
Nikola Tesla -> Kanye West 
Nikola Tesla -> Michael Faraday 
Nikola Tesla -> Lord Rayleigh 

Pierre-Simon Laplace -> James Watson 
Pierre-Simon Laplace -> Francis Crick 
Pierre-Simon Laplace -> Kanye West 
Pierre-Simon Laplace -> Michael Faraday 
Pierre-Simon Laplace -> Lord Rayleigh 

Oliver Heaviside -> James Watson 
Oliver Heaviside -> Francis Crick 
Oliver Heaviside -> Kanye West 
Oliver Heaviside -> Michael Faraday 
Oliver Heaviside -> Lord Rayleigh 

James Watson -> Nikola Tesla 
James Watson -> Pierre-Simon Laplace 
James Watson -> Oliver Heaviside 
James Watson -> Kanye West 
James Watson -> Michael Faraday 
James Watson -> Lord Rayleigh 

Francis Crick -> Nikola Tesla 
Francis Crick -> Pierre-Simon Laplace 
Francis Crick -> Oliver Heaviside 
Francis Crick -> Kanye West 
Francis Crick -> Michael Faraday 
Francis Crick -> Lord Rayleigh 

Kanye West -> Pierre-Simon Laplace 
Kanye West -> James Watson 
Kanye West -> Oliver Heaviside 
Kanye West -> Francis Crick 
Kanye West -> Michael Faraday 
Kanye West -> Nikola Tesla 
Kanye West -> Lord Rayleigh 

Michael Faraday -> Nikola Tesla 
Michael Faraday -> Pierre-Simon Laplace 
Michael Faraday -> Oliver Heaviside 
Michael Faraday -> James Watson 
Michael Faraday -> Francis Crick 
Michael Faraday -> Kanye West 

Lord Rayleigh -> Nikola Tesla 
Lord Rayleigh -> Pierre-Simon Laplace 
Lord Rayleigh -> Oliver Heaviside 
Lord Rayleigh -> James Watson 
Lord Rayleigh -> Francis Crick 
Lord Rayleigh -> Kanye West 

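You could first take the cross product of the file's lines with itself (but that wouldn't be awk). Then, with awk, you can check $1 == $3 and print $2 -> $4. Can you be more specific about whether you want to use only awk? – user3334059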

Kanye West ...? – TessellatingHeckler

There's no reason it has to be awk only. Which tool would make it easier to take the cross product of the lines, keyed on the first field? – EarthIsHome

Answers

I see what you mean.

Try:

awk '{b=$1;sub($1" ","");a[$0]=b}END{for(i in a){for(j in a){if(i!=j&&a[i]!=a[j])print i" -> "j}print ""}}' file 
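The one-liner above keys its array by name and walks it with `for (i in a)`. awk leaves the traversal order of `for ... in` unspecified, and two people with the same name in different groups would overwrite each other. Below is a minimal sketch of the same idea that keeps the input order instead. The function name `cross_groups` and the variable names are made up for illustration, and it assumes the group ID is separated from the name by spaces or tabs.

```shell
# Sample input copied from the question, written to names.db.
cat > names.db <<'EOF'
1 Nikola Tesla
1 Pierre-Simon Laplace
1 Oliver Heaviside
2 James Watson
2 Francis Crick
3 Kanye West
4 Michael Faraday
4 Lord Rayleigh
EOF

# cross_groups is a hypothetical helper name, not part of the original answer.
cross_groups() {
    awk '
    NF {
        n++
        id[n] = $1                               # group ID is the first field
        name = $0
        sub(/^[ \t]*[^ \t]+[ \t]+/, "", name)    # strip the ID and the blanks after it
        sub(/[ \t]+$/, "", name)                 # strip trailing blanks
        names[n] = name
    }
    END {
        for (i = 1; i <= n; i++) {
            for (j = 1; j <= n; j++)
                if (id[i] != id[j])              # skip members of the same group
                    print names[i] " -> " names[j]
            if (i < n) print ""                  # blank line between blocks
        }
    }' "$1"
}

cross_groups names.db
```

Indexing by record number rather than by name is what preserves the order. The whole file is held in memory, which should be fine for small inputs like this one.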

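This is very close; what is `a[$0]` doing? I ran your line and it outputs 7 entries for Pierre-Simon Laplace when it should return only 5 (anyone not in category '1'). I think I can modify this to make it work. – EarthIsHome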

If the first character in `i` is not equal to the first character in `j`, then `print i" -> "j`. – EarthIsHome

Updated to add `b=$1`. Tested OK. – bian

Non-awk solution

$ join -t' ' -j 9 names{,} \
    | sed -r '/([1-9]).*\1/d;s/[1-9]//;s/[1-9]/-->/' 

    Nikola Tesla --> James Watson 
    Nikola Tesla --> Francis Crick 
    Nikola Tesla --> Kanye West 
    Nikola Tesla --> Michael Faraday 
    Nikola Tesla --> Lord Rayleigh 
    Pierre-Simon Laplace --> James Watson 
    Pierre-Simon Laplace --> Francis Crick 
    Pierre-Simon Laplace --> Kanye West 
    Pierre-Simon Laplace --> Michael Faraday 
    Pierre-Simon Laplace --> Lord Rayleigh 
    Oliver Heaviside --> James Watson 
    Oliver Heaviside --> Francis Crick 
    ... 
    Michael Faraday --> Francis Crick 
    Michael Faraday --> Kanye West 
    Lord Rayleigh --> Nikola Tesla 
    Lord Rayleigh --> Pierre-Simon Laplace 
    Lord Rayleigh --> Oliver Heaviside 
    Lord Rayleigh --> James Watson 
    Lord Rayleigh --> Francis Crick 
    Lord Rayleigh --> Kanye West 

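Explanation: create the cross product, delete the lines whose two digits match, remove the first digit, and replace the second digit with an arrow. Of course, this can all be done in awk, but I tried a different approach for a change.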
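The sed expressions above rely on the group IDs being single digits from 1 to 9 and on the names containing no digits. The same two-stage idea (build the cross product, then filter it) can be sketched so that it compares the whole first field instead. This is only an illustration: the helper names `cross` and `keep_other_groups` and the file `sample.db` are made up, and the sketch assumes the data contains no tab characters, because a tab is used to separate the two halves of each pair.

```shell
# Tiny illustrative input (not from the original post); assumes no tabs in the data.
printf '1 A B\n1 C\n2 D E\n' > sample.db

# Stage 1: pair every record with every record, tab between the two halves.
cross() {
    awk 'NF { line[++n] = $0 }
         END {
             for (i = 1; i <= n; i++)
                 for (j = 1; j <= n; j++)
                     print line[i] "\t" line[j]
         }' "$1"
}

# Stage 2: drop same-group pairs, strip both IDs, join the names with an arrow.
keep_other_groups() {
    awk -F'\t' '
    function strip_id(s) {
        sub(/^ *[^ ]+ +/, "", s)   # remove the leading ID
        sub(/ +$/, "", s)          # remove trailing spaces
        return s
    }
    {
        split($1, l, " "); split($2, r, " ")   # l[1] and r[1] hold the group IDs
        if (l[1] != r[1])
            print strip_id($1) " --> " strip_id($2)
    }'
}

cross sample.db | keep_other_groups
```

Keeping the stages separate mirrors the join-then-sed pipeline, so either stage can be swapped for another tool.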
