You can split on a simple alternation in your regular expression:
my @parts = split(/\s*,\s*|\s+and\s+/, $string1);
For example:
$ perl -we 'my $string1 = "Joe Smith, Jason Jones, Jane Doe and Jack Jones";print join("\n",split(/\s*,\s*|\s+and\s+/, $string1)),"\n"'
Joe Smith
Jason Jones
Jane Doe
Jack Jones
$ perl -we 'my $string2 = "Jane Doe and Joe Smith";print join("\n",split(/\s*,\s*|\s+and\s+/, $string2)),"\n"'
Jane Doe
Joe Smith
If you also have to handle the Oxford comma (i.e. "this, that, and the other thing"), then you can use
my @parts = split(/\s*,\s*and\s+|\s*,\s*|\s+and\s+/, $string1);
For example:
$ perl -we 'my $s = "Joe Smith, Jason Jones, Jane Doe, and Jack Jones";print join("\n",split(/\s*,\s*and\s+|\s*,\s*|\s+and\s+/, $s)),"\n"'
Joe Smith
Jason Jones
Jane Doe
Jack Jones
$ perl -we 'my $s = "Joe Smith, Jason Jones, Jane Doe and Jack Jones";print join("\n",split(/\s*,\s*and\s+|\s*,\s*|\s+and\s+/, $s)),"\n"'
Joe Smith
Jason Jones
Jane Doe
Jack Jones
$ perl -we 'my $s = "Joe Smith and Jack Jones";print join("\n",split(/\s*,\s*and\s+|\s*,\s*|\s+and\s+/, $s)),"\n"'
Joe Smith
Jack Jones
Thanks to stackoverflowuser2010 for pointing out this case.
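You'll want the \s*,\s*and\s+ branch at the beginning of the alternation, so that the other branches don't split on the comma or the "and" first. This order appears to be guaranteed as well:
"Alternatives are tried from left to right, so the first alternative found for which the entire expression matches, is the one that is chosen."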
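To see why the branch order matters, here is a small sketch that runs both orderings against the same string. The variable names (@wrong_order, @right_order) are only illustrative and not part of the answer above. With the Oxford-comma branch placed last, the plain-comma branch should match first at ", and", leaving "and" attached to the last name.

```perl
use strict;
use warnings;

my $s = "Joe Smith, Jason Jones, Jane Doe, and Jack Jones";

# Oxford-comma branch LAST: at ", and" the plain-comma branch matches
# first and consumes ", ", so no whitespace is left in front of "and"
# for the \s+and\s+ branch to match.
my @wrong_order = split(/\s*,\s*|\s+and\s+|\s*,\s*and\s+/, $s);

# Oxford-comma branch FIRST: ", and " is consumed as a single separator.
my @right_order = split(/\s*,\s*and\s+|\s*,\s*|\s+and\s+/, $s);

print join("\n", @wrong_order), "\n";
print "--\n";
print join("\n", @right_order), "\n";
```

In the first list the last element comes out as "and Jack Jones", while the second list contains the four clean names.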
How would you handle a "name" like "Joe Smith, MD and Mary and Joe Smith"? – tadmc
Note that you should not use '@data[1]' but '$data[1]'. Since you are only using one element, it is a scalar. –