2011-10-26
2

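Regex to capture the sentence containing the right phrase

I am using C# to look for a phrase that may or may not be present in a blog post. I need to capture the entire sentence that contains the target phrase.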

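I thought about using the string.Contains method, but it only reports whether the phrase occurs somewhere in the post, which leaves me with the whole blog post when what I want is the target phrase and the sentence containing it.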

Example:

I dont want this sentence. I also don't want this setence. But I do want this sentence. 

So here the target phrase is "I do", and the regex should return the whole sentence that contains it: "But I do want this sentence."

Thanks. Aaron

Answers

2

This regex:

resultString = Regex.Match(subjectString, @"(?<=^|\.)[^.]*?(?=\bI do\b).*(\.|$)").Value; 

when applied to your input:

I dont want this sentence. I also don't want this setence. But I do want this sentence. 

returns:

But I do want this sentence. 

Turn on RegexOptions.Singleline if you are worried about input that spans multiple lines.
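One caveat with the pattern above: the trailing `.*` is greedy, so when the target phrase sits in an earlier sentence the match runs on to the end of the line instead of stopping at that sentence's period. A minimal sketch of a variant that stays inside one sentence, assuming sentences end only with `.` and the phrase starts and ends with a word character (the `SentenceFinder` class and `FindSentence` method are hypothetical names used for illustration, not part of the answer):

```csharp
using System;
using System.Text.RegularExpressions;

static class SentenceFinder
{
    // Returns the first sentence containing `phrase`, or "" if none does.
    // Assumption: a sentence is a run of text ended by '.' or the end of input.
    public static string FindSentence(string text, string phrase)
    {
        // [^.]* instead of .* keeps the match from crossing a period.
        // Regex.Escape treats characters such as '?' or '+' in the phrase literally.
        string pattern = @"(?<=^|\.)[^.]*?\b" + Regex.Escape(phrase) + @"\b[^.]*(\.|$)";

        // Trim removes the space that follows the previous sentence's period.
        return Regex.Match(text, pattern).Value.Trim();
    }

    static void Main()
    {
        string text = "I dont want this sentence. But I do want this sentence. I also don't want this setence.";
        Console.WriteLine(FindSentence(text, "I do"));
    }
}
```

Because `[^.]` also matches line breaks, this variant should not need RegexOptions.Singleline for sentences that wrap across lines.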

+0

thx. That works great – Aaron

1

I don't know about regex, but you can combine the Split and Contains functions and write something like this:

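string DoesBlogContainSentence(string blog, string target)
{
    // Treat every '.' as a sentence boundary.
    string[] blogSentences = blog.Split(new char[] {'.'});

    foreach (string sentence in blogSentences)
    {
        // Return the first sentence that contains the target phrase.
        if (sentence.Contains(target))
        {
            return sentence;
        }
    }

    // No sentence contains the target phrase.
    return string.Empty;
}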
+0

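Splitting on '.' alone will not necessarily return only sentences. For example, if the text contains a decimal number, that will be split as well. – wdavo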
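The decimal-number problem can be reduced by splitting only where sentence-ending punctuation is followed by whitespace. A rough sketch under that assumption (the `SentenceSplitter` class and `SplitSentences` method are hypothetical names, and abbreviations such as "e.g. this" would still be split):

```csharp
using System;
using System.Text.RegularExpressions;

static class SentenceSplitter
{
    // Splits only where '.', '!' or '?' is followed by whitespace,
    // so a number like "3.5" stays in one piece.
    // The lookbehind keeps the punctuation attached to its sentence.
    public static string[] SplitSentences(string text)
    {
        return Regex.Split(text.Trim(), @"(?<=[.!?])\s+");
    }

    static void Main()
    {
        foreach (string s in SplitSentences("It costs 3.5 dollars. But I do want this sentence."))
        {
            Console.WriteLine(s);
        }
    }
}
```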

1

You could split the blog post into sentences and then search each sentence for the target phrase.

E.g.

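string data = "I dont want this sentence. I also don't want this setence. But I do want this sentence.";
string targetPhrase = "I do";

// Split wherever a period is followed by whitespace.
string[] sentences = Regex.Split(data, "\\.\\s");

foreach (string sentence in sentences)
{
    // \b matches the phrase as whole words, including at the start or end of a sentence.
    if (Regex.IsMatch(sentence, "\\b" + Regex.Escape(targetPhrase) + "\\b"))
    {
        //.....
    }
}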