egrep: find lines that contain the same word at least twice

How can I use a regular expression to find lines in which the same word appears at least twice?

I tried:

egrep '\w{2,}\1' file

But the terminal gives me this error:

egrep: invalid backreference number

Source

2016-02-12 Amber

Check my edit; that should do it. – Will

Try this:

egrep '(\w{2,}).*\1' file

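Without a capturing group, written (...), there is nothing for the backreference \1 to refer to, which is why egrep reports an invalid backreference number.

Here is an example: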

$ cat file 
this line has the same word twice word 
this line does not 
this is this and that is that 

$ egrep '(\w{2,}).*\1' file 
this line has the same word twice word 
this is this and that is that
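
One caveat with this pattern: without word boundaries, \1 can also match inside a longer word, so a repeated fragment is enough to select a line. A minimal sketch of the difference, assuming GNU grep (where grep -E is equivalent to egrep, and \w and \b are GNU extensions); the sample sentence is made up for illustration:

```shell
# Hypothetical sample line: no whole word repeats, but the fragment "at"
# does (cat, sat, mat).
line='the cat sat on a mat'

# Without word boundaries, the repeated fragment is enough to match.
# grep -c prints the number of matching lines; "|| true" keeps the
# assignment from failing when grep finds nothing.
loose=$(printf '%s\n' "$line" | grep -cE '(\w{2,}).*\1' || true)

# With \b on both sides, only whole repeated words count.
strict=$(printf '%s\n' "$line" | grep -cE '\b(\w{2,})\b.*\b\1\b' || true)

echo "loose=$loose strict=$strict"
```

If partial-word repeats should not count, the stricter form with \b is probably the one to use.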

Source

2016-02-12 22:44:57 Will

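Thanks, but I think the other answer solves the problem better, because it adds \b word boundaries on both sides. – Amber

I agree :) No problem. – Will

There are a few problems with your current regex.

Use a capturing group to capture the word, and a backreference to refer to it.
Add \b word boundaries to delimit the word on the left and on the right.
Add .* to match any amount of any characters in between.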

echo "ABC foo ABC bar" | egrep '\b(\w{2,})\b.*\b\1\b'

ABC foo ABC bar

echo "ABC foo ABCD bar" | egrep '\b(\w{2,})\b.*\b\1\b'

(no output, since no whole word is repeated)
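
For a non-matching line grep prints nothing; it reports the result through its exit status instead (0 if a line was selected, 1 if none was). A minimal sketch of using that in a script, assuming GNU grep; the function name has_repeated_word is a hypothetical helper, not part of the answer:

```shell
# Hypothetical helper: succeeds if the given line repeats a whole word
# of at least two characters. -q suppresses output; only the exit
# status is used.
has_repeated_word() {
  printf '%s\n' "$1" | grep -qE '\b(\w{2,})\b.*\b\1\b'
}

for line in 'ABC foo ABC bar' 'ABC foo ABCD bar'; do
  if has_repeated_word "$line"; then
    echo "match:    $line"
  else
    echo "no match: $line"
  fi
done
```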

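See the demo at regex101. If needed, use egrep -o (--only-matching) to extract just the matching part.
You can go further and use the lazy dot .*? with grep -P (--perl-regexp) so that it matches as few characters as possible.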
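
To illustrate the difference between the greedy and the lazy form, here is a small sketch on a made-up input line. It assumes a GNU grep built with PCRE support for -P; other grep implementations may not offer that option.

```shell
# Hypothetical input: the word ABC appears three times.
line='ABC foo ABC bar ABC'

# ERE: .* is greedy, so -o prints the span up to the last repetition.
greedy=$(printf '%s\n' "$line" | grep -oE '\b(\w{2,})\b.*\b\1\b')

# PCRE: the lazy .*? stops at the first repetition instead.
lazy=$(printf '%s\n' "$line" | grep -oP '\b(\w{2,})\b.*?\b\1\b')

printf 'greedy: %s\nlazy:   %s\n' "$greedy" "$lazy"
```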

Source

2016-02-13 10:23:05

Thanks for your help! – Amber

@Amber You're welcome! –
