2011-03-21 91 views
1

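Regex: matching a word that has special characters

I'm trying to find a regular expression that matches a specific word (the exact word) in a string. The problem is that the word contains special characters such as '#'. The special characters can be any UTF-8 character, such as ("áéíóúñ#@"), and the match must ignore punctuation.

Here are some examples of what I'm looking for: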

Searching:#myword 

Sentence: "I like the elephants when they say #myword" <- MATCH 
Sentence: "I like the elephants when they say #mywords" <- NO MATCH 
Sentence: "I like the elephants when they say myword" <-NO MATCH 
Sentence: "I don't like #mywords. its silly" <- NO MATCH 
Sentence: "I like #myword!! It's awesome" <- MATCH 
Sentence: "I like #myword It's awesome" <- MATCH 

PHP sample code:

$regexp= "#myword"; 
    if (preg_match("/(\w$regexp)/", "I like #myword!! It's awesome")) { 
     echo "YES YES YES"; 
    } else { 
     echo "NO NO NO "; 
    } 

Thanks!

Update: if I search for "myword", the match has to start with a word character ("\w"), not with some other character:

Sentence: "I like myword!! It's awesome" <- MATCH 
Sentence: "I like #myword It's awesome" <-NO MATCH 
+5

How do the second and the fourth produce a match and a no match? – alex 2011-03-21 13:47:08

+0

It is followed by an alphabetic character and the 4th is not – LDK 2011-03-21 13:51:57

+0

Try escaping it with \ – Yaronius 2011-03-21 13:57:47

Answers

2

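The following solution comes from treating the characters and the boundaries separately. There may also be a workable approach that uses word boundaries directly.

Code: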

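function search($strings, $search) {
    // Escape the search term so that characters such as "." or "/" are taken literally.
    $quoted = preg_quote($search, "/");
    // Require whitespace or the start of the string before the term, and a
    // non-word character or the end of the string after it; "i" ignores case.
    $regexp = "/(?:[[:space:]]|^)" . $quoted . "(?:[^\w]|$)/i";
    foreach ($strings as $string) {
        echo "Sentence: \"$string\" <- " .
            (preg_match($regexp, $string) ? "MATCH" : "NO MATCH") . "\n";
    }
}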

$strings = array(
"I like the elephants when they say #myword", 
"I like the elephants when they say #mywords", 
"I like the elephants when they say myword", 
"I don't like #mywords. its silly", 
"I like #myword!! It's awesome", 
"I like #mywOrd It's awesome", 
); 
echo "Example 1:\n"; 
search($strings,"#myword"); 

$strings = array(
"I like myword!! It's awesome", 
"I like #myword It's awesome", 
); 
echo "Example 2:\n"; 
search($strings,"myword"); 

Output:

Example 1: 
Sentence: "I like the elephants when they say #myword" <- MATCH 
Sentence: "I like the elephants when they say #mywords" <- NO MATCH 
Sentence: "I like the elephants when they say myword" <- NO MATCH 
Sentence: "I don't like #mywords. its silly" <- NO MATCH 
Sentence: "I like #myword!! It's awesome" <- MATCH 
Sentence: "I like #mywOrd It's awesome" <- MATCH 
Example 2: 
Sentence: "I like myword!! It's awesome" <- MATCH 
Sentence: "I like #myword It's awesome" <- NO MATCH 
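
The same idea can also be written with lookarounds, so that the surrounding whitespace or punctuation is not consumed by the match. This is a minimal sketch and not part of the original answer: the function name containsExactWord is hypothetical, and the "u" modifier is an assumption based on the question saying the input is UTF-8.

```php
<?php
// Hypothetical helper (not from the original answer): returns true when $search
// occurs as a whole token, i.e. preceded by whitespace or the start of the
// string and followed by a non-word character or the end of the string.
function containsExactWord(string $subject, string $search): bool {
    // Escape the term so that regex metacharacters in it are matched literally.
    $quoted = preg_quote($search, '/');
    // (?<!\S) : no non-space character directly before the term
    // (?!\w)  : no word character directly after the term
    // i = case-insensitive, u = treat pattern and subject as UTF-8
    $pattern = '/(?<!\S)' . $quoted . '(?!\w)/iu';
    return preg_match($pattern, $subject) === 1;
}

var_dump(containsExactWord("I like #myword!! It's awesome", "#myword"));    // bool(true)
var_dump(containsExactWord("I don't like #mywords. its silly", "#myword")); // bool(false)
var_dump(containsExactWord("I like #myword It's awesome", "myword"));       // bool(false)
```

Because the lookarounds are zero-width, the match itself contains only the search term, which may be convenient if the pattern is later reused with preg_replace().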
+0

Wow! Thanks Peter :) – LDK 2011-03-21 14:31:00

+0

NP. At first I was completely confused, but once you cleaned up the question and the examples it became solvable. :) – 2011-03-21 14:31:58

+0

This works well :) How can I add case insensitivity? – LDK 2011-03-21 15:00:54

0

This should do the trick (replace "myword" with whatever you want to find):

^.*#myword[^\w].*$ 

If the match succeeds, your word was found; otherwise it was not.

+0

This expression fails :( 'preg_match(): Unknown modifier '\'' – LDK 2011-03-21 14:14:00

+0

OK, it works fine for me (Expresso - .NET), so you could replace "\w" with the characters: [a-z] [A-Z] [0-9] – 2011-03-21 14:17:44

+0

Maybe you just need to escape the backslash ("\\" instead of "\"; I don't know PHP). – 2011-03-21 14:18:22
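
A likely explanation for the "Unknown modifier" warning, assuming the expression was passed to preg_match() exactly as written: PHP expects the pattern to be wrapped in delimiters. Without them it takes the first character, "^", as the delimiter, ends the pattern at the "^" inside "[^\w]", and then reads the following "\" as a modifier. The sketch below wraps the answer's expression in "/" delimiters; the helper name matchesAnswerPattern is hypothetical.

```php
<?php
// The answer's expression, wrapped in "/" delimiters as preg_match() requires.
$pattern = '/^.*#myword[^\w].*$/';

// Hypothetical helper used only for this illustration.
function matchesAnswerPattern(string $pattern, string $sentence): bool {
    return preg_match($pattern, $sentence) === 1;
}

var_dump(matchesAnswerPattern($pattern, "I like #myword!! It's awesome"));              // bool(true)
var_dump(matchesAnswerPattern($pattern, "I don't like #mywords. its silly"));           // bool(false)
var_dump(matchesAnswerPattern($pattern, "I like the elephants when they say #myword")); // bool(false)
```

The last call shows a limitation of this expression: "[^\w]" needs an actual character after the word, so a sentence that ends in "#myword" is not matched. The "(?:[^\w]|$)" alternative in the first answer covers that case.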

1

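You should look for myword with word boundaries, like this: /\bmyword\b/.
# is itself a non-word character, so /\b#myword\b/ doesn't work.
One idea was to escape the Unicode character with \X, but that creates other problems.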

/ #myword\b/ 
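
The behaviour of these patterns can be checked with a short script. This is only an illustrative sketch; the sentences come from the question, except the one starting with "#myword", which is an added assumption to show the start-of-string case.

```php
<?php
$sentence = "I like #myword!! It's awesome";

// \b needs a word character on exactly one side; " " and "#" are both non-word,
// so there is no boundary in front of "#".
var_dump(preg_match('/\b#myword\b/', $sentence)); // int(0)

// A literal leading space works for this sentence...
var_dump(preg_match('/ #myword\b/', $sentence)); // int(1)

// ...but not when the sentence starts with the term.
var_dump(preg_match('/ #myword\b/', "#myword is awesome")); // int(0)

// \bmyword\b also matches inside "#myword", which the question's update rules out.
var_dump(preg_match('/\bmyword\b/', "I like #myword It's awesome")); // int(1)
```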
+0

This expression matches the third example, but it shouldn't – LDK 2011-03-21 14:28:36

+0

@LDK True, \X is not a good way to escape Unicode characters – 2011-03-21 14:59:24

+0

+1 It works! I missed the leading space when I tried it. – 2011-03-21 15:56:43
