Ruby: splitting a string on multiple characters

I have a string, say "Hello_World I am Learning,Ruby". I want to split this string into separate words. What is the best way to do that?

Thanks! C.

2011-10-11 curious

You can use String#split with a regular expression pattern as the argument. Like this:

"Hello_World I am Learning,Ruby".split /[ _,.!?]/ 
=> ["Hello", "World", "I", "am", "Learning", "Ruby"]

2011-10-11 09:50:14 zacsek

ruby-1.9.2-p290 :022 > str = "Hello_World I am Learning,Ruby" 
ruby-1.9.2-p290 :023 > str.split(/\s|,|_/) 
=> ["Hello", "World", "I", "am", "Learning", "Ruby"]

2011-10-11 09:53:45 Jin

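While the examples above work, I think that when splitting a string into words it may be better to split on the characters that are not considered part of any kind of word. To do that, I did this: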

str = "Hello_World I am Learning,Ruby" 
str.split(/[^a-zA-Z]/).reject(&:empty?).compact

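This statement does the following:

Splits the string on any character that is not a letter (a-z, A-Z)
Then rejects anything that is an empty string
And removes nil values from the array

This handles most combinations of words. The examples above require you to list every character you want to match against. It is far easier to specify the characters that you do not consider part of a word.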
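
The three steps can be looked at one at a time. This is only a sketch: the input string (with a leading underscore and a comma followed by a space) is made up for illustration, to show where the empty strings come from. For this particular pattern, which has no capture groups, split should only ever return strings, so the final compact appears to have nothing to remove.

```ruby
# Hypothetical input: a leading delimiter and two adjacent delimiters (", ").
str = "_Hello_World I am Learning, Ruby"

# Step 1: split on every character that is not an ASCII letter.
# The leading "_" and the ", " pair each leave an empty string behind.
pieces = str.split(/[^a-zA-Z]/)

# Step 2: drop the empty strings.
words = pieces.reject(&:empty?)

# Step 3: drop nil values (none are present here, so nothing changes).
final = words.compact

p pieces
p words
p final == words
```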

2011-10-11 10:14:52 BlueFish

String#scan seems like a suitable method for this task:

irb(main):018:0> "Hello_World I am Learning,Ruby".scan(/[a-z]+/i) 
=> ["Hello", "World", "I", "am", "Learning", "Ruby"]

Or you can use the built-in \w matcher, which also matches the underscore, so Hello_World stays in one piece:

irb(main):020:0> "Hello_World I am Learning,Ruby".scan(/\w+/) 
=> ["Hello_World", "I", "am", "Learning", "Ruby"]
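
To make the difference between the two patterns a bit more concrete, here is a small sketch. Both input strings are invented for illustration. It shows that scan collects matches instead of cutting at separators, so stray delimiters do not leave empty strings, and that \w covers letters, digits and the underscore.

```ruby
# Hypothetical input with a leading delimiter and repeated delimiters.
messy = ", Hello_World,  I am"

# scan returns only the matched runs of letters; no empty strings appear.
letters_only = messy.scan(/[a-z]+/i)

# Hypothetical input containing an underscore and a digit.
mixed = "snake_case v2"

# \w+ keeps underscores and digits, so identifier-like tokens stay whole.
word_chars = mixed.scan(/\w+/)

# [a-z]+ (case-insensitive) keeps letters only, so the token is cut apart.
ascii_letters = mixed.scan(/[a-z]+/i)

p letters_only
p word_chars
p ascii_letters
```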

2011-10-11 10:34:45 Bohdan

You can use \W, which matches any non-word character:

"Hello_World I am Learning,Ruby".split /[\W_]/ 
=> ["Hello", "World", "I", "am", "Learning", "Ruby"] 

"Hello_World I am Learning, Ruby".split /[\W_]+/ 
=> ["Hello", "World", "I", "am", "Learning", "Ruby"]
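
The second call differs from the first in two ways: the input has a space after the comma, and the pattern ends in +. A short sketch of why that matters, using the same input for both patterns (the last string is a made-up extra case):

```ruby
text = "Hello_World I am Learning, Ruby"

# Without +, the comma and the space are two separate separators,
# which leaves an empty string between them.
without_plus = text.split(/[\W_]/)

# With +, a run of separators is treated as a single separator.
with_plus = text.split(/[\W_]+/)

# A separator at the very start still yields a leading empty string,
# even with + (hypothetical input).
leading = "_Hello World".split(/[\W_]+/)

p without_plus
p with_plus
p leading
```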

2011-10-11 10:45:57 Samnang

Just for fun, a Unicode-aware version for 1.9 (or 1.8 with Oniguruma):

>> "This_µstring has words.and thing's".split(/[^\p{Word}']|\p{Connector_Punctuation}/) 
=> ["This", "µstring", "has", "words", "and", "thing's"]

Or perhaps:

>> "This_µstring has words.and thing's".split(/[^\p{Word}']|_/) 
=> ["This", "µstring", "has", "words", "and", "thing's"]

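The real problem is deciding which character sequences constitute a "word" in this context. You may want to check the Oniguruma docs for the supported character properties; Wikipedia also has some notes on the properties.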
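
As one possible way to act on that, here is a sketch that defines a "word" as a run of Unicode letters plus the apostrophe and collects such runs with scan. That definition is an assumption made for this example, not the only reasonable one. The \p{L} property (any letter) is used in place of \p{Word}, so underscores and digits are treated as separators.

```ruby
# The micro sign is written as an escape so the snippet does not depend
# on the encoding of the source file.
text = "This_\u00B5string has words.and thing's"

# Assumed definition of a word: one or more letters or apostrophes.
unicode_words = text.scan(/[\p{L}']+/)

# For comparison: an ASCII-only pattern does not match the micro sign.
ascii_words = text.scan(/[a-z]+/i)

p unicode_words
p ascii_words
```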

2011-10-11 16:40:06
