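How to make a regex allow optional prefix and suffix extraction

As the title says, given a string, a prefix (optional), and a suffix (optional), the regex should extract the text between them.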

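Like this:

prefix_group_1_suffix returns group_1 when the prefix is 'prefix_' and the suffix is '_suffix'

prefix_group_1 returns group_1 when the prefix is 'prefix_' and the suffix is null <- my code cannot handle this case

group_1_suffix returns group_1 when the prefix is null and the suffix is '_suffix'

group_1 returns group_1 when the prefix is null and the suffix is null <- my code cannot handle this case

Here is my code, but I found that it does not work when the suffix is empty: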

String itemName = "";
String prefix = "TEST_";
String suffix = "";
String itemString = prefix + "item_1" + suffix;
// Quote the affixes so they are treated as literal text, not as regex syntax.
String prefix_quote = "".equals(prefix) ? "" : Pattern.quote(prefix);
String suffix_quote = "".equals(suffix) ? "" : Pattern.quote(suffix);
String regex = prefix_quote + "(.*?)" + suffix_quote;
Pattern pattern = Pattern.compile(regex);
Matcher matcher = pattern.matcher(itemString);
while (matcher.find()) {
    itemName = matcher.group(1);
    break;
}
System.out.println("itemString '"+itemString+"'");
System.out.println("Prefix quote '"+prefix_quote+"'");
System.out.println("Suffix quote '"+suffix_quote+"'");
System.out.println("regex '"+regex+"'");
System.out.println("itemName is '"+itemName+"'");

And here is the output:

itemString 'TEST_item_1' 
Prefix quote '\QTEST_\E' 
Suffix quote '' 
regex '\QTEST_\E(.*?)' 
itemName is ''

However, the code above works fine for the other two cases.

Source

2017-09-14 Dreamer

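Have you looked at any regex tutorials? – JoelFan

Consider the two cases prefix_group_1 and group_1_suffix. I assume the prefix and suffix can be any text value. Then both strings have the same shape, A_B_C. How is the system supposed to tell whether A is the prefix and B_C is the data with the suffix missing, or C is the suffix and A_B is the data with the prefix missing? The system needs more information. Also, if your text is delimited by underscores, why do you need a regex at all? Why not parse it into tokens? – Gautam

Why are you looping over the matches? The way I understand your question, each string can have (at most) one match. –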
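As a rough illustration of the no-regex route raised in the comment above, here is a minimal sketch. It assumes the prefix and suffix are known in advance (as in the question's code) and simply removes them when present. The class name `AffixStripper` and the helper `strip` are hypothetical names chosen for this example.

```java
final class AffixStripper {
    // Hypothetical helper: removes a known prefix and suffix when they are present.
    // A null affix is treated the same as an empty one.
    static String strip(String s, String prefix, String suffix) {
        String p = prefix == null ? "" : prefix;
        String x = suffix == null ? "" : suffix;
        int start = s.startsWith(p) ? p.length() : 0;
        int end = s.length();
        // Only drop the suffix if it does not overlap the part removed as prefix.
        if (s.endsWith(x) && end - x.length() >= start) {
            end -= x.length();
        }
        return s.substring(start, end);
    }

    public static void main(String[] args) {
        System.out.println(strip("prefix_group_1_suffix", "prefix_", "_suffix")); // group_1
        System.out.println(strip("prefix_group_1", "prefix_", null));             // group_1
        System.out.println(strip("group_1_suffix", null, "_suffix"));             // group_1
        System.out.println(strip("group_1", null, null));                         // group_1
    }
}
```

This sidesteps the ambiguity described in the comment only because the affixes are supplied by the caller; it does not try to guess them from the input.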

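The reason your code fails is the lazy quantifier .*?. Its top priority is to match as little as possible, ideally the empty string, so that is exactly what it does. You therefore need to anchor the regex to the start/end of the string as well as to the possible prefix/suffix.

To anchor it that way, you can use lookaround assertions: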
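The effect of the lazy quantifier can be reproduced in isolation. The sketch below uses the question's input; the class name `LazyQuantifierDemo` and the helper `firstGroup` are hypothetical names for this illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class LazyQuantifierDemo {
    // Returns group 1 of the first match found, or null if nothing matches.
    static String firstGroup(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        String input = "TEST_item_1";
        String prefix = Pattern.quote("TEST_");
        // Nothing follows the lazy group, so it is satisfied with zero characters.
        System.out.println("'" + firstGroup(prefix + "(.*?)", input) + "'");  // ''
        // With an end anchor, the lazy group has to grow until $ can match.
        System.out.println("'" + firstGroup(prefix + "(.*?)$", input) + "'"); // 'item_1'
    }
}
```

The first call reproduces the empty itemName from the question's output; the second differs only by the trailing `$`, so the anchoring is what forces the lazy group to cover the item name.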

String prefix = "TEST_"; 
String suffix = ""; 
String itemString = prefix + "item_1" + suffix; 
String prefix_quote = "".equals(prefix) ? "^" : Pattern.quote(prefix); 
String suffix_quote = "".equals(suffix) ? "$" : Pattern.quote(suffix); 
String regex = "(?<=^|" + prefix_quote + ")(.*?)(?=$|" + suffix_quote + ")"; 
Pattern pattern = Pattern.compile(regex); 
Matcher matcher = pattern.matcher(itemString);

This results in the regex

(?<=^|TEST_)item_1(?=$|$)

Explanation:

(?<= # Assert that it's possible to match before the current position 
^  # either the start of the string 
|  # or 
TEST_ # the prefix 
)  # End of lookbehind 
item_1 # Match "item_1" 
(?=$|$) # Assert that it's possible to match after the current position 
     # either the end of the string or the suffix (which is replaced 
     # by the end of the string if empty. Of course that could be optimized 
     # when constructing the regex, this is just a quick-and-dirty solution).
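One caveat about the lookbehind pattern above, as far as I can tell: because `^` is one of the lookbehind alternatives, a match attempt at index 0 already succeeds, so `find()` on `TEST_item_1` appears to capture the whole string, prefix included. The sketch below is an alternative under stated assumptions, not the answer's own method: it anchors the whole pattern and wraps each non-empty affix in an optional non-capturing group. The class name `OptionalAffixRegex` and the helper `extract` are hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class OptionalAffixRegex {
    // Hypothetical helper: builds ^(?:prefix)?(.*?)(?:suffix)?$ from literal affixes.
    // Empty or null affixes contribute nothing to the pattern.
    static String extract(String input, String prefix, String suffix) {
        String pre = (prefix == null || prefix.isEmpty())
                ? "" : "(?:" + Pattern.quote(prefix) + ")?";
        String suf = (suffix == null || suffix.isEmpty())
                ? "" : "(?:" + Pattern.quote(suffix) + ")?";
        // The anchors force the lazy group to span everything between the affixes.
        Matcher m = Pattern.compile("^" + pre + "(.*?)" + suf + "$").matcher(input);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        System.out.println(extract("prefix_group_1_suffix", "prefix_", "_suffix")); // group_1
        System.out.println(extract("prefix_group_1", "prefix_", ""));               // group_1
        System.out.println(extract("group_1_suffix", "", "_suffix"));               // group_1
        System.out.println(extract("group_1", "", ""));                             // group_1
    }
}
```

Because each affix group is optional, the same pattern also returns `group_1` when an affix is configured but missing from the input, which may or may not be the behavior the question wants.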

Source

2017-09-14 05:26:13

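If the '^' and '$' anchors are appropriate, that means the regex is being used for *matching*, not *finding*, so drop the anchors and simply call 'matches()' instead of 'find()'. – Andreas

That won't work, because lookaround assertions don't actually match any text; '.matches()' requires the entire string to be matched by the regex. Note that the anchors are inside the lookaround assertions, not at the start/end of the regex. I do think the OP needs to confirm whether his goal really is matching rather than finding. –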
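The difference between the two calls can be seen with a lookbehind-only pattern. This is a small sketch using the question's prefix; the class name `FindVersusMatches` and its helpers are hypothetical names for this illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class FindVersusMatches {
    // The prefix is only asserted by the lookbehind; it is never consumed.
    static final Pattern LOOKBEHIND =
            Pattern.compile("(?<=" + Pattern.quote("TEST_") + ")(.*)");

    // matches() must consume the entire input, starting at index 0.
    static boolean matchesWhole(String input) {
        return LOOKBEHIND.matcher(input).matches();
    }

    // find() may start the match at any index, here right after the prefix.
    static String findGroup(String input) {
        Matcher m = LOOKBEHIND.matcher(input);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        System.out.println(matchesWhole("TEST_item_1")); // false
        System.out.println(findGroup("TEST_item_1"));    // item_1
    }
}
```

At index 0 there is nothing before the current position, so the lookbehind fails and `matches()` reports false, while `find()` can skip ahead to the position after the prefix.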

-1

If you have a specific string that you want to find, you can use any string-matching algorithm:

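1. The "Boyer-Moore-Horspool" algorithm, a simplification of Boyer-Moore that is often faster in practice than the KMP string-matching algorithm. You can use it to find the position of the string you are searching for.

2. You can also look at "Levenshtein distance" for fuzzy string matching.

3. I guess simply finding a substring within the string would be a better option.

Code for these is available all over the place....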

Source

2017-09-14 05:13:27

This does not seem to answer the question. –
