grep regex: searching for a set of words in any order

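I want to search a large collection of files for a set of words in any order, with or without spaces or punctuation between them. So, for example, if I search for hello, there, friend, it should match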

hello there my friend 
friend, hello there 
theretherefriendhello

but not

hello friend 
there there friend

I can't figure out any way to do this. Is it even possible with grep, or some variant of grep?

2015-04-01 ewok

Does it have to be exactly 'hello', or is 'helloworld' also fine? – fedorqui 2015-04-01 17:44:24

'helloworld' is fine, as long as the other words are there too. I'll update the question to clarify when I'm back at my computer. – ewok 2015-04-01 17:46:52

Is it even possible with grep, or some variant of grep?

You can use the regex below with grep -P, i.e. Perl mode.

^(?=.*hello)(?=.*there)(?=.*friend).*$
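To try the lookahead pattern on the sample lines from the question, here is a minimal sketch. It assumes a GNU grep built with PCRE support, since -P may be missing from BSD or macOS grep, and the temporary file is only there for illustration.

```shell
# Temporary file holding the five sample lines from the question.
sample=$(mktemp)
printf '%s\n' \
  'hello there my friend' \
  'friend, hello there' \
  'theretherefriendhello' \
  'hello friend' \
  'there there friend' > "$sample"

# Each (?=.*word) lookahead asserts, starting from the beginning of the line,
# that the word occurs somewhere on it, so the order of the words is irrelevant.
matches=$(grep -P '^(?=.*hello)(?=.*there)(?=.*friend).*$' "$sample")
printf '%s\n' "$matches"

rm -f "$sample"
```

Only the first three sample lines should be printed, because the last two are each missing one of the words.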


2015-04-01 17:41:39 vks

You can use sed:

sed -n '/word1/{/word2/{/word3/p;};}' *.txt
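As a rough illustration, the same nested-address idea can be run on the sample lines from the question, with word1, word2 and word3 replaced by hello, there and friend, and the input fed on standard input instead of *.txt:

```shell
# Each /pattern/{ ... } block only runs for lines matching that pattern, so the
# innermost p is reached only when all three words are on the line.
# -n suppresses the default output, so only the explicit p prints.
matches=$(printf '%s\n' \
  'hello there my friend' \
  'friend, hello there' \
  'theretherefriendhello' \
  'hello friend' \
  'there there friend' |
  sed -n '/hello/{/there/{/friend/p;};}')
printf '%s\n' "$matches"
```

Every further word means one more level of nesting, which gets unwieldy for long word lists.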

2015-04-01 17:39:45 hek2mgl

This works with GNU sed, but not on OSX, FreeBSD, etc. For portability, put a semicolon (';') before each closing brace ('}'). – ghoti 2015-04-01 18:00:13

@ghoti Thanks a lot! Good to know! – hek2mgl 2015-04-01 18:11:44

For this I would use awk, like this:

awk '/hello/ && /there/ && /friend/' file

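This checks whether the current line contains all of the strings hello, there and friend. If it does, the line is printed.

Why? Because the condition is then true, and awk's default action when a condition is true is to print the current line.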
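If the word list changes often, the awk condition can be generalised. The sketch below is one possible way to do it, not part of the answer above: all_words is a hypothetical helper name, and it uses index() for fixed-string matching instead of the regex matching that /hello/ performs.

```shell
# all_words WORD... : print the lines of standard input that contain every WORD.
# Assumption: the words contain no spaces or backslashes, because they are
# passed to awk as a single space-separated string via -v.
all_words() {
  awk -v words="$*" '
    BEGIN { n = split(words, w, " ") }
    {
      for (i = 1; i <= n; i++) {
        if (index($0, w[i]) == 0) { next }   # one word is missing: skip this line
      }
      print                                   # every word was found
    }'
}

matches=$(printf '%s\n' \
  'hello there my friend' \
  'hello friend' \
  'theretherefriendhello' | all_words hello there friend)
printf '%s\n' "$matches"
```

With index() a word such as a.b is matched literally. To keep regex semantics, the test could use $0 !~ w[i] instead.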

2015-04-01 17:48:36 fedorqui

In basic and extended RE, without using vendor- or version-specific extensions such as Perl RE, you would need to handle this with something like:

egrep -lr 'hello.*there.*friend|hello.*friend.*there|there.*hello.*friend|there.*friend.*hello|friend.*hello.*there|friend.*there.*hello' /path/

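Note the -l option, which tells grep to show file names only, and -r, which tells it to search recursively. This solution should work with almost every grep variant you are likely to encounter.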
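A small sketch of the all-orderings approach on a throwaway directory follows. The directory and file names are made up for the demonstration. It uses grep -E, the standard spelling of egrep, since recent GNU grep releases print a deprecation warning for egrep.

```shell
# Throwaway directory with two hypothetical files.
dir=$(mktemp -d)
printf '%s\n' 'friend, hello there' > "$dir/match.txt"
printf '%s\n' 'hello friend' 'there there friend' > "$dir/nomatch.txt"

# All six orderings of the three words, assembled in pieces for readability.
re='hello.*there.*friend|hello.*friend.*there'
re="$re|there.*hello.*friend|there.*friend.*hello"
re="$re|friend.*hello.*there|friend.*there.*hello"

# -l lists the names of matching files only; -r recurses into the directory.
found=$(grep -E -lr "$re" "$dir")
printf '%s\n' "$found"

rm -rf "$dir"
```

The number of alternatives grows factorially with the number of words (24 for four words, 120 for five), so this is only practical for short lists.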

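This is obviously inelegant as far as the RE goes, but it is convenient because it uses grep's built-in recursive search. If the RE bothers you, I would use awk or sed instead, if you can, wrapped in find: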

find /path/ -exec awk '/hello/&&/there/&&/friend/ {r=1} END {exit 1-r}' {} \; -print

Again, the output of this is a list of files, not a list of lines. You can adjust it to suit your specific requirements.
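To see how the exit status drives the output, here is a sketch of the find and awk combination on a temporary directory. The file names are invented, and -type f is an addition of this sketch so that awk is only run on regular files.

```shell
# Throwaway directory with one matching and one non-matching file.
dir=$(mktemp -d)
printf '%s\n' 'theretherefriendhello' > "$dir/match.txt"
printf '%s\n' 'hello friend' > "$dir/nomatch.txt"

# awk sets r=1 as soon as one line contains all three words, then exits with
# status 0 (found) or 1 (not found). find substitutes each file name for {}
# and runs -print only when the -exec command succeeded.
found=$(find "$dir" -type f \
  -exec awk '/hello/&&/there/&&/friend/ {r=1} END {exit 1-r}' {} \; -print)
printf '%s\n' "$found"

rm -rf "$dir"
```

This starts one awk process per file, which is simple but can be slow on very large trees.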

2015-04-01 17:54:16 ghoti
