Regex: repeated word

I need sed (sed only, please) to help me figure this out: if some word appears 3 times in a line, print that line.

Let's say this is the file:

abc abc gh abc 
abcabc abc 
ab ab cd ab xx ab 
ababab cc ababab 
abab abab cd abab

So the output should be:

abc abc gh abc 
ab ab cd ab xx ab 
abab abab cd abab

This is my attempt:

sed -n '/\([^ ]\+\)[ ]+\1\1\1/p' $1

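It doesn't work... :/ What am I doing wrong?

It doesn't matter whether the word is at the start of the line or not, and the occurrences don't need to appear in sequence.

Source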

2015-02-05 nick shmick

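It looks like you have a lot of homework... You already asked [How to compare the first word and the last word in a line using sed?](http://stackoverflow.com/q/28318579/1983854). Did you use Avinash's answer to come up with a better attempt? – fedorqui 2015-02-05 14:15:48

I don't understand what you're asking, @fedorqui. – 2015-02-05 14:17:13

Also, doesn't the repeated word need to be the first word in the line? – anubhava 2015-02-05 14:17:15

You need to add .* between the \1 back-references.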

$ sed -n '/\b\([^ ]\+\)\b.*\b\1\b.*\b\1\b/p' file 
abc abc gh abc 
ab ab cd ab xx ab 
abab abab cd abab

I assume your input contains only spaces and word characters.

Source
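To see why the `.*` matters, here is a minimal sketch that recreates the sample file and runs both patterns side by side. It assumes GNU sed (`\+` and `\b` are GNU extensions to basic regular expressions), and the file name `sample.txt` is made up for illustration. Note that the original attempt also used an unescaped `+`, which in a basic regular expression is a literal plus sign.

```shell
#!/bin/sh
# Assumes GNU sed: \+ and \b are GNU extensions, not POSIX BRE.
# sample.txt is a hypothetical file name used only for this demo.
printf '%s\n' \
  'abc abc gh abc ' \
  'abcabc abc ' \
  'ab ab cd ab xx ab ' \
  'ababab cc ababab ' \
  'abab abab cd abab' > sample.txt

# Original attempt: "[ ]+" means a space followed by a literal "+",
# and \1\1\1 requires three adjacent copies, so nothing matches.
attempt=$(sed -n '/\([^ ]\+\)[ ]+\1\1\1/p' sample.txt)

# Fixed pattern: .* lets other text sit between the repeats,
# and \b keeps each repeat a whole word.
fixed=$(sed -n '/\b\([^ ]\+\)\b.*\b\1\b.*\b\1\b/p' sample.txt)

printf 'attempt: [%s]\n' "$attempt"
printf 'fixed:\n%s\n' "$fixed"
```

With the sample input, the attempt prints nothing, while the fixed pattern prints the three lines in which a whole word occurs at least three times.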

2015-02-05 14:15:22

awesome avinash thanks bro :) @Avinash Raj – 2015-02-05 15:11:36

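I don't really understand the \b syntax... my teacher didn't explain it. It looks like it makes things shorter; can you explain? – 2015-02-05 15:14:58

'\b' matches between a word character and a non-word character. Word characters are 'A-Z', 'a-z', '0-9', and '_'. Any character other than these is called a non-word character. – 2015-02-05 15:23:24

I know the question asks for sed, but every system I have seen with sed also has awk, so here is an awk solution: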
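A small sketch of what `\b` changes in practice, again assuming GNU sed (where `\b` is available); the variable names are only for illustration:

```shell
#!/bin/sh
# Assumes GNU sed, where \b is supported as a word-boundary anchor.
line='abcabc abc'

# Without \b, "abc" is replaced wherever it occurs, even inside "abcabc".
loose=$(printf '%s\n' "$line" | sed 's/abc/X/g')

# With \b on both sides, only the standalone word "abc" is replaced.
strict=$(printf '%s\n' "$line" | sed 's/\babc\b/X/g')

printf '%s\n%s\n' "$loose" "$strict"
```

The first substitution gives `XX X` and the second gives `abcabc X`, which is why the answer's pattern does not treat `abcabc` as two occurrences of `abc`.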

awk -F"[^[:alnum:]]" '{delete a;for (i=1;i<=NF;i++) a[$i]++;for (i in a) if (a[i]>2) {print $0;next}}' file 
abc abc gh abc 
ab ab cd ab xx ab 
abab abab cd abab

This may be easier to understand than the regex solution.

awk -F"[^[:alnum:]]" '
# -F sets the field separator to anything other than letters and digits.
{
    delete a                # Clear array "a" for this line
    for (i=1;i<=NF;i++)     # Loop through the words one by one
        a[$i]++             # Count the occurrences of each word in array "a"
    for (i in a)            # Loop through array "a"
        if (a[i]>2) {       # If a word is found more than two times:
            print $0        # Print the line
            next            # Skip to the next line, so it is not printed twice if another word also occurs three times
        }
}' file                     # Read the file

Source
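One thing to be aware of: with a regex field separator like this, two adjacent separators (a double space, for example) produce an empty field, and empty fields are counted like any other word. The following is a sketch of a possible variant that skips them, not part of the original answer. `count_repeats` is a hypothetical helper name, and the separator is spelled as an explicit ASCII range, which assumes ASCII-only words.

```shell
#!/bin/sh
# Sketch of a variant that ignores empty fields.
# count_repeats is a hypothetical helper name, not from the original answer.
count_repeats() {
  awk -F'[^A-Za-z0-9]' '{
    delete a                         # reset the per-line counts
    for (i = 1; i <= NF; i++)
      if ($i != "") a[$i]++          # count only non-empty fields
    for (w in a)
      if (a[w] > 2) { print; next }  # print once, then move to the next line
  }' "$@"
}

# A line with several double spaces but no word repeated three times:
spaced=$(printf 'a  b  c  d\n' | count_repeats)
# A line where one word really appears three times:
repeated=$(printf 'ab ab cd ab\n' | count_repeats)
printf '[%s] [%s]\n' "$spaced" "$repeated"
```

Here the double-spaced line is not printed, while the line containing `ab` three times is.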

2015-02-05 16:38:37 Jotne
