Removing duplicate words in a line with sed

I want to use sed to correct text like this:

there there are are multiple lexical errors in this line line

This is as far as I have got:

sed 's/\([a-z][a-z]*[ ,\n][ ,\n]*\)\1/\1/g' < file.text

It corrects everything except the last doubled word!

there are multiple lexical errors in this line line

Could a sed guru please explain why the command above does not handle the last word?

2012-05-15 benjwy

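Note on the RE '[ ,\n]': sed uses '\n' as the line delimiter. So unless you insert a '\n' into the pattern space yourself, you will never encounter one after a line has been read into the pattern space. – potong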
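To illustrate the comment above, here is a minimal sketch. The two-line sample input is made up for the demonstration, and the N command is just one way to get a '\n' into the pattern space. A doubled word that straddles a line break is only seen once the two lines are joined.

```shell
# Hypothetical two-line input: the doubled word "line" straddles the line break.
sample() { printf 'this line\nline ends\n'; }

# Line by line, the pattern space never contains '\n', so nothing matches.
per_line=$(sample | sed -e 's/\([a-z][a-z]*\)\n\1/\1/')

# N appends the next input line, so the pattern space now holds an embedded '\n'.
joined=$(sample | sed -e 'N' -e 's/\([a-z][a-z]*\)\n\1/\1/')

printf '%s\n' "$per_line"
printf '%s\n' "$joined"
```

Note that in this sketch the substitution also removes the line break between the two copies, which may or may not be what you want.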

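This is because in the last case ("line line"), capture group 1 of your regex holds "line " (the word followed by a space), and you are then searching for a repetition of exactly that. Since there is no space after the final "line", the match fails.

To work around this, add a space after the final word "line".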
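A minimal sketch of that workaround, assuming the space can be appended and stripped again inside the same sed invocation (the extra -e expressions are my addition, not part of the question). The '\n' is left out of the bracket expression here, since it never occurs in the pattern space anyway:

```shell
# Input line from the question; there is no space after the final "line".
input='there there are are multiple lexical errors in this line line'

# The question's pattern: group 1 includes the trailing separator,
# so the last "line line" is left alone.
original=$(printf '%s\n' "$input" |
  sed -e 's/\([a-z][a-z]*[ ,][ ,]*\)\1/\1/g')

# Workaround: append a space, deduplicate, then strip the trailing space.
padded=$(printf '%s\n' "$input" |
  sed -e 's/$/ /' \
      -e 's/\([a-z][a-z]*[ ,][ ,]*\)\1/\1/g' \
      -e 's/ $//')

printf '%s\n' "$original"
printf '%s\n' "$padded"
```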

Or you can change the regex so that the separator is outside the capture group:

sed -e 's/\b\([a-z]\+\)[ ,\n]\1/\1/g'
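A quick check of this expression, assuming GNU sed (both \b and \+ are GNU extensions). As posted, there is no word boundary after \1, so a word followed by a longer word with the same prefix also matches. The second variant, with a closing \b, is my own suggestion rather than part of the answer:

```shell
input='there there are are multiple lexical errors in this line line'

# The expression as posted: the separator sits outside the capture group.
fixed=$(printf '%s\n' "$input" | sed -e 's/\b\([a-z]\+\)[ ,\n]\1/\1/g')

# Caveat: without a boundary after \1, "the then" is treated as a doubled word.
loose=$(printf '%s\n' 'the then line line' | sed -e 's/\b\([a-z]\+\)[ ,\n]\1/\1/g')

# Suggested variant (an assumption, not from the answer): require \b after \1.
strict=$(printf '%s\n' 'the then line line' | sed -e 's/\b\([a-z]\+\)[ ,]\1\b/\1/g')

printf '%s\n' "$fixed" "$loose" "$strict"
```

Both versions accept only a single separator character between the two copies, so a pair separated by a comma and a space would still slip through.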

2012-05-15 11:58:12 codaddict
