Find words that occur only once in a string

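How can I find the words in a string that are not repeated, in bash? I'd like to know whether there is a "native" bash way of doing this, or whether I need to use another command-line utility (such as awk, sed, grep, ...). For example, var1="thrice once twice twice thrice";. I need something that will split out the word "once", because it occurs only once (i.e. it is not repeated).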

Source

2014-02-08 BenjiWiebe

I would say no, there is no simple and elegant way. (Though I'm half prepared to be proven wrong; this site is great.) – tripleee

Define "split out". –

@KarolyHorvath var1 would be left untouched. I just need to end up with the unique words somehow, so I can use them in the rest of the script. – BenjiWiebe

You can split the string on spaces and then use sort and uniq:

tr ' ' '\n' <<< "$var1" | sort | uniq -u

This produces once for your input.
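
Since the result is meant to be used later in the script, here is a minimal sketch of capturing the pipeline's output in an array. The array name uniqueWords is an illustrative assumption, and mapfile requires bash 4 or later.

```shell
#!/usr/bin/env bash
var1="thrice once twice twice thrice"

# mapfile -t stores one array element per line of output (bash 4+).
# uniqueWords is a hypothetical name chosen for this sketch.
mapfile -t uniqueWords < <(tr ' ' '\n' <<< "$var1" | sort | uniq -u)

# Use the captured words in the rest of the script.
for w in "${uniqueWords[@]}"; do
    printf 'unique: %s\n' "$w"
done
```

With the sample input, the array should hold the single element once.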

(If the input contains punctuation, you will probably want to remove it before doing anything else, to avoid unexpected results.)
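
One possible way to strip punctuation first is a tr -d step in front of the same pipeline. This is a sketch under assumptions: the sample string is made up for illustration, and the [:punct:] class also deletes characters such as underscores and hyphens, which may not be what you want for every kind of input.

```shell
#!/usr/bin/env bash
# Hypothetical sample input with punctuation attached to the words.
var1="thrice, once. twice twice; thrice!"

# Delete punctuation, put each word on its own line (-s squeezes repeated
# separators so no empty lines are produced), then keep the lines that
# occur exactly once.
result=$(tr -d '[:punct:]' <<< "$var1" | tr -s ' ' '\n' | sort | uniq -u)

printf '%s\n' "$result"
```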

Source

2014-02-08 18:43:06 devnull

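This is exactly what I came up with :) –

This will be used with a list of usernames, so there won't be any punctuation. – BenjiWiebe

Works perfectly! Thanks! – BenjiWiebe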

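@devnull's answer is the better choice (both for simplicity and probably for performance), but if you are looking for a bash-only solution:

Caveats:

Uses associative arrays, which are only available in bash 4 or later.
A literal * in the input word list will not work (but other glob-like strings are fine).
Correctly handles multi-line input and multiple spaces between words.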

# Define the input word list. 
# Bonus: multi-line input with multiple inter-word spaces. 
var1=$'thrice once  twice twice thrice\ntwice again'

# Declare associative array. 
declare -A wordCounts 

# Read all words and count the occurrence of each. 
while read -r w; do 
    [[ -n $w ]] && ((wordCounts[$w]+=1)) 
done <<<"${var1// /$'\n'}" # split input list into lines for easy parsing 

# Output result. 
# Note that the output list will NOT automatically be sorted, because the keys of an 
# associative array are not 'naturally sorted'; hence piping to `sort`. 
echo "Words that only occur once in '$var1':" 
echo "---" 
for w in "${!wordCounts[@]}"; do 
    ((wordCounts[$w] == 1)) && echo "$w" 
done | sort 

# Expected output: 
# again 
# once
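
As a possible way around the literal * caveat, the sketch below prefixes every key so that a bare * or @ is never used as an array subscript, and it avoids putting the subscript inside an arithmetic expression. This is not the answer's method: the function name unique_words and the w: prefix are assumptions made for illustration, and bash 4 or later is still required.

```shell
#!/usr/bin/env bash
# Print the words of "$1" that occur exactly once, one per line, sorted.
# unique_words and the "w:" key prefix are hypothetical names for this sketch.
unique_words() {
    local -A counts=()
    local -a words=()
    local w key

    # -d '' reads the whole string, newlines included; read then splits on
    # whitespace without glob expansion. It returns non-zero at end of
    # input, hence "|| true".
    read -r -d '' -a words <<< "$1" || true

    for w in "${words[@]}"; do
        key="w:$w"
        counts[$key]=$(( ${counts[$key]:-0} + 1 ))
    done

    for key in "${!counts[@]}"; do
        if (( ${counts[$key]} == 1 )); then
            printf '%s\n' "${key#w:}"   # strip the prefix again
        fi
    done | sort
}

unique_words $'thrice once  twice twice thrice\ntwice again'
```

For the sample word list this should print again and once, the same result as the associative-array version above.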

Source

2014-02-08 19:03:42 mklement0

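Interesting. Still, it's not exactly what I'd call *elegant*... – BenjiWiebe

Agreed - stick with @devnull's solution, and treat this as a demonstration of bash's associative arrays. – mklement0

Just for fun, awk: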

awk '{ 
    for (i=1; i<=NF; i++) c[$i]++ 
    for (word in c) if (c[word]==1) print word 
}' <<< "$var1"

once

Source

2014-02-08 21:50:44

+1; easily generalized to handle _multi-line_ input: awk '{ for (i=1; i<=NF; i++) c[$i]++ } END { for (word in c) if (c[word]==1) print word }' <<< "$var1". One caveat: the output word list will not be sorted. – mklement0
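
A sketch of the multi-line variant with the counting moved into an END block and the output piped through sort to address the ordering caveat. The result variable and the sample input are assumptions for illustration.

```shell
#!/usr/bin/env bash
# Sample multi-line input (an assumption for this sketch).
var1=$'thrice once twice twice thrice\ntwice again'

# Count every field on every line; report only after all input is read.
# awk's "for (word in c)" order is unspecified, so sort the output.
result=$(awk '
    { for (i = 1; i <= NF; i++) c[$i]++ }
    END { for (word in c) if (c[word] == 1) print word }
' <<< "$var1" | sort)

printf '%s\n' "$result"
```

Moving the reporting loop into END matters here: in the single-line version the loop runs once per input line, which would print partial results for multi-line input.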
