Finding a specific word with a special character using regex

string emailBody = " holla holla testing is for NewFinancial History:\"xyz\" dsd NewFinancial History:\"abc\" NewEBTDI$:\"abc\" dsds "; 

emailBody = string.Join(" ", Regex.Split(emailBody.Trim(), @"(?:\r\n|\n|\r)"));
var keys = Regex.Matches(emailBody, @"\bNew\B(.+?):", RegexOptions.Singleline)
    .OfType<Match>()
    .Select(m => m.Groups[0].Value.Replace(":", ""))
    .Distinct()
    .ToArray();
foreach (string key in keys)
{
    List<string> valueList = new List<string>();
    string regex = "" + key + ":" + "\"(?<" + GetCleanKey(key) + ">[^\"]*)\"";

    var matches = Regex.Matches(emailBody, regex, RegexOptions.Singleline);
    foreach (Match match in matches)
    {
        if (match.Success)
        {
            string value = match.Groups[GetCleanKey(key)].Value;
            if (!valueList.Contains(value.Trim()))
            {
                valueList.Add(value.Trim());
            }
        }
    }
}

public string GetCleanKey(string key)
{
    return key.Replace(" ", "").Replace("-", "").Replace("#", "").Replace("$", "").Replace("*", "").Replace("!", "").Replace("@", "")
        .Replace("%", "").Replace("^", "").Replace("&", "").Replace("(", "").Replace(")", "").Replace("[", "").Replace("]", "").Replace("?", "")
        .Replace("<", "").Replace(">", "").Replace("'", "").Replace(";", "").Replace("/", "").Replace("\"", "").Replace("+", "").Replace("~", "").Replace("`", "")
        .Replace("{", "").Replace("}", "").Replace("|", "");
}

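In the code above, I am trying to get the value that follows NewEBTDI$:.

When I include the $ sign in the pattern, it does not find the value next to the field name.

If the $ is removed and only NewEBTDI is specified, it does find the value.

I want to search for the value with the $ sign included.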

Source

2016-01-21 Savan Patel

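Please format your code properly. It is unreadable. – 2016-01-21 20:17:59

"$" has a special meaning in regular expressions. Escape it with \. In your case, though, you will have to do it with something like String.Replace(), because your regex is generated. You may also have other special characters...

Characters that have a special meaning in regex but must be matched literally have to be escaped. You can do that with Regex.Escape. In your case it is the $ sign, which, if left unescaped, is an anchor for the end of the line (or of the input).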

string regex = "" + Regex.Escape(key) + ":" + "\"(?<" + Regex.Escape(GetCleanKey(key))
    + ">[^\"]*)\"";

or

string regex = String.Format("{0}:\"(?<{1}>[^\"]*)\"", 
          Regex.Escape(key), 
          Regex.Escape(GetCleanKey(key)));

or, with VS 2015, using string interpolation:

string regex = $"{Regex.Escape(key)}:\"(?<{Regex.Escape(GetCleanKey(key))}>[^\"]*)\"";

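(It looks better in the editor than it does here, because the C# editor colors the string parts and the embedded C# expressions differently.)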
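To see what the escaping changes, here is a minimal sketch that runs the key against the sample text both ways. It is an illustration rather than the code from the answer: the class and method names are hypothetical, and it assumes a fixed group name, `value`, instead of one built from the key with GetCleanKey.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class EscapeDemo
{
    // Hypothetical helper: returns the distinct quoted values that follow "key:".
    // When escapeKey is false, the key is used as a raw pattern, as in the question.
    public static string[] ValuesFor(string text, string key, bool escapeKey)
    {
        string keyPattern = escapeKey ? Regex.Escape(key) : key;
        // Assumption: a fixed group name, so no valid name has to be derived from the key.
        string pattern = keyPattern + ":\"(?<value>[^\"]*)\"";
        return Regex.Matches(text, pattern)
                    .Cast<Match>()
                    .Select(m => m.Groups["value"].Value.Trim())
                    .Distinct()
                    .ToArray();
    }

    public static void Main()
    {
        string text = "NewFinancial History:\"xyz\" dsd NewFinancial History:\"abc\" NewEBTDI$:\"abc\" dsds";

        // Regex.Escape puts a backslash in front of the "$".
        Console.WriteLine(Regex.Escape("NewEBTDI$"));                                        // NewEBTDI\$
        // Unescaped, "$" is an anchor, so nothing can match before the ":".
        Console.WriteLine(ValuesFor(text, "NewEBTDI$", false).Length);                       // 0
        // Escaped, the "$" is matched literally.
        Console.WriteLine(string.Join(",", ValuesFor(text, "NewEBTDI$", true)));             // abc
        Console.WriteLine(string.Join(",", ValuesFor(text, "NewFinancial History", true)));  // xyz,abc
    }
}
```

With a fixed group name, only the key itself needs escaping; the group name never contains characters taken from the input.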

Source

2016-01-21 20:26:44

I didn't know about Regex.Escape!

Thanks, it worked for me!

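It is not clear what the end goal is, but the $ in the pattern is a regex anchor that means the end of the line or the end of the buffer, depending on whether RegexOptions.Multiline is set.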
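A small sketch of that anchor behavior, using made-up input strings and a hypothetical helper name (neither is from the thread):

```csharp
using System;
using System.Text.RegularExpressions;

public static class DollarAnchorDemo
{
    // Hypothetical helper: counts how often the pattern matches the input.
    public static int CountMatches(string input, string pattern, RegexOptions options)
    {
        return Regex.Matches(input, pattern, options).Count;
    }

    public static void Main()
    {
        string twoLines = "total\ntotal";

        // Without Multiline, "$" only matches at the end of the whole input.
        Console.WriteLine(CountMatches(twoLines, "total$", RegexOptions.None));       // 1
        // With Multiline, "$" also matches at the end of every line.
        Console.WriteLine(CountMatches(twoLines, "total$", RegexOptions.Multiline));  // 2

        string withDollar = "total$ and total$";

        // Unescaped, the anchor never lines up with the literal "$" character.
        Console.WriteLine(CountMatches(withDollar, "total$", RegexOptions.None));     // 0
        // Escaped, "\$" matches the literal character.
        Console.WriteLine(CountMatches(withDollar, @"total\$", RegexOptions.None));   // 2
    }
}
```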

Why not just capture the text before the : into a named capture? Then extract the quoted value, like this:

var data = "...is for NewFinancial History:\"xyz\" dsd NewFinancial History:\"abc\" NewEBTDI$:\"abc\" dsds"; 

var pattern = @" 
(?<New>New[^:]+)  # Capture all items after `New` that is *not* (`^`) a `:`, one or more. 
:      # actual `:` 
\x22     # actual quote character begin anchor 
(?<InQuotes>[^\x22]+) # text that is not a quote, one or more 
\x22     # actual quote ending anchor 
"; 

// IgnorePatternWhitespace allows us to comment the pattern. Does not affect processing. 
Regex.Matches(data, pattern, RegexOptions.IgnorePatternWhitespace | RegexOptions.ExplicitCapture) 
    .OfType<Match>() 
    .Select(mt => new 
    { 
     NewText = mt.Groups["New"].Value, 
     Text = mt.Groups["InQuotes"].Value 
    });
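If the goal is a list of distinct values per key, as in the question, the same pattern could be grouped by key. The following is only a sketch under that assumption; the class name, the method name, and the choice of a dictionary are mine, not part of the answer.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

public static class KeyValueDemo
{
    // Hypothetical helper: maps each "New..." key to its distinct quoted values.
    public static Dictionary<string, List<string>> Extract(string data)
    {
        var pattern = @"
            (?<New>New[^:]+)        # key: New followed by everything up to the colon
            :                       # literal colon
            \x22                    # opening quote
            (?<InQuotes>[^\x22]+)   # quoted value
            \x22                    # closing quote
        ";
        var options = RegexOptions.IgnorePatternWhitespace | RegexOptions.ExplicitCapture;

        return Regex.Matches(data, pattern, options)
                    .Cast<Match>()
                    .GroupBy(m => m.Groups["New"].Value, m => m.Groups["InQuotes"].Value)
                    .ToDictionary(g => g.Key, g => g.Distinct().ToList());
    }

    public static void Main()
    {
        var data = "...is for NewFinancial History:\"xyz\" dsd NewFinancial History:\"abc\" NewEBTDI$:\"abc\" dsds";

        foreach (var pair in Extract(data))
        {
            Console.WriteLine(pair.Key + " => " + string.Join(", ", pair.Value));
        }
    }
}
```

Because the key is captured from the text rather than turned into a pattern, the $ in NewEBTDI$ never needs escaping here.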


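Note that I used the hex escape \x22 rather than \" for the quote in the pattern, because it is easier to work with: it keeps the C# compiler from consuming an escape that has to reach the regex engine intact. (In a verbatim @"..." string, a literal quote would otherwise have to be written as "".)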
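As a rough illustration of that point, the three spellings below describe the same quote-delimited pattern; only who resolves the escape differs. The class and method names are hypothetical.

```csharp
using System;
using System.Text.RegularExpressions;

public static class QuoteEscapeDemo
{
    // Hypothetical helper: returns the first capture group, or "" when nothing matches.
    public static string FirstQuoted(string input, string pattern)
    {
        Match m = Regex.Match(input, pattern);
        return m.Success ? m.Groups[1].Value : "";
    }

    public static void Main()
    {
        string input = "NewEBTDI$:\"abc\"";

        string regular = "\"([^\"]+)\"";        // regular string: the C# compiler turns \" into "
        string verbatim = @"""([^""]+)""";      // verbatim string: "" stands for one quote
        string hex = @"\x22([^\x22]+)\x22";     // the regex engine itself resolves \x22 to "

        Console.WriteLine(FirstQuoted(input, regular));   // abc
        Console.WriteLine(FirstQuoted(input, verbatim));  // abc
        Console.WriteLine(FirstQuoted(input, hex));       // abc
    }
}
```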

Source

2016-01-22 02:57:21 OmegaMan

Thanks, it worked for me!
