2016-08-22 119 views
1

Check whether a replacement string is valid

This is my utility method for checking whether a replacement string is valid:

public static boolean isValidReplacementString(String regex, String replacement) {
    try {
        "".replaceFirst(regex, replacement);
        return true;
    } catch (IllegalArgumentException | NullPointerException e) {
        return false;
    }
}

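I want to run this check before performing the actual replacement, because obtaining the source string(s) is expensive (I/O).

I find this solution rather hacky. Is there an existing method in the standard library that I'm missing?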


Edit: As pointed out by sln, this does not even work when no match is found.
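The problem sln describes can be reproduced with a short sketch. The class name `ReplaceFirstDemo` and the helper `survives` are made up for illustration; only `String.replaceFirst` comes from the question. The replacement string is only parsed once a match has been found, so an empty input never reaches the validation logic.

```java
class ReplaceFirstDemo {
    // Hypothetical helper: true if replaceFirst completes without throwing.
    static boolean survives(String input, String regex, String replacement) {
        try {
            input.replaceFirst(regex, replacement);
            return true;
        } catch (IllegalArgumentException | IndexOutOfBoundsException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // No match in the empty input, so the dangling "$" is never inspected.
        System.out.println(survives("", "test", "$"));     // prints "true"
        // With a match, the same replacement is rejected.
        System.out.println(survives("test", "test", "$")); // prints "false"
    }
}
```

In other words, the original check only reports an invalid replacement when the probe input happens to match the regex.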


Edit: Following shmosel's answer, I came up with this "solution":

private static boolean isLower(char c) { 
    return c >= 'a' && c <= 'z'; 
} 

private static boolean isUpper(char c) { 
    return c >= 'A' && c <= 'Z'; 
} 

private static boolean isDigit(char c) { 
    return isDigit(c - '0'); 
} 

private static boolean isDigit(int c) { 
    return c >= 0 && c <= 9; 
} 

// Requires imports: java.lang.reflect.Field, java.util.Map, java.util.regex.Pattern.
@SuppressWarnings("unchecked")
public static void checkRegexAndReplacement(String regex, String replacement) {
    Pattern parentPattern = Pattern.compile(regex);
    Map<String, Integer> namedGroups;
    int capturingGroupCount;

    // namedGroups and capturingGroupCount are private fields of Pattern,
    // so this depends on JDK implementation details.
    try {
        Field namedGroupsField = Pattern.class.getDeclaredField("namedGroups");
        namedGroupsField.setAccessible(true);
        namedGroups = (Map<String, Integer>) namedGroupsField.get(parentPattern);
        Field capturingGroupCountField = Pattern.class.getDeclaredField("capturingGroupCount");
        capturingGroupCountField.setAccessible(true);
        capturingGroupCount = capturingGroupCountField.getInt(parentPattern);
    } catch (NoSuchFieldException | IllegalAccessException e) {
        throw new RuntimeException("That's what you get for using reflection!", e);
    }

    int groupCount = capturingGroupCount - 1;

    // Process substitution string to replace group references with groups
    int cursor = 0;

    while (cursor < replacement.length()) {
        char nextChar = replacement.charAt(cursor);
        if (nextChar == '\\') {
            cursor++;
            if (cursor == replacement.length())
                throw new IllegalArgumentException(
                        "character to be escaped is missing");
            nextChar = replacement.charAt(cursor);
            cursor++;
        } else if (nextChar == '$') {
            // Skip past $
            cursor++;
            // Throw IAE if this "$" is the last character in replacement
            if (cursor == replacement.length())
                throw new IllegalArgumentException(
                        "Illegal group reference: group index is missing");
            nextChar = replacement.charAt(cursor);
            int refNum = -1;
            if (nextChar == '{') {
                cursor++;
                StringBuilder gsb = new StringBuilder();
                while (cursor < replacement.length()) {
                    nextChar = replacement.charAt(cursor);
                    if (isLower(nextChar) ||
                            isUpper(nextChar) ||
                            isDigit(nextChar)) {
                        gsb.append(nextChar);
                        cursor++;
                    } else {
                        break;
                    }
                }
                if (gsb.length() == 0)
                    throw new IllegalArgumentException(
                            "named capturing group has 0 length name");
                if (nextChar != '}')
                    throw new IllegalArgumentException(
                            "named capturing group is missing trailing '}'");
                String gname = gsb.toString();
                if (isDigit(gname.charAt(0)))
                    throw new IllegalArgumentException(
                            "capturing group name {" + gname +
                                    "} starts with digit character");
                if (namedGroups == null || !namedGroups.containsKey(gname))
                    throw new IllegalArgumentException(
                            "No group with name {" + gname + "}");
                refNum = namedGroups.get(gname);
                cursor++;
            } else {
                // The first number is always a group
                refNum = (int) nextChar - '0';
                if (!isDigit(refNum))
                    throw new IllegalArgumentException(
                            "Illegal group reference");
                cursor++;
                // Capture the largest legal group string
                boolean done = false;
                while (!done) {
                    if (cursor >= replacement.length()) {
                        break;
                    }
                    int nextDigit = replacement.charAt(cursor) - '0';
                    if (!isDigit(nextDigit)) { // not a number
                        break;
                    }
                    int newRefNum = (refNum * 10) + nextDigit;
                    if (groupCount < newRefNum) {
                        done = true;
                    } else {
                        refNum = newRefNum;
                        cursor++;
                    }
                }
            }
            if (refNum < 0 || refNum > groupCount) {
                throw new IndexOutOfBoundsException("No group " + refNum);
            }
        } else {
            cursor++;
        }
    }
}

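If this method throws, either the regex or the replacement string is invalid.

This is stricter than replaceAll/replaceFirst, because those methods never call appendReplacement when no match is found and therefore "miss" invalid group references.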
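The number of capturing groups can also be read without reflection, since `Matcher.groupCount()` is public and needs no match attempt. Below is a minimal sketch of the same idea on that basis. It is not the JDK's implementation: the class name `ReplacementCheck` is hypothetical, and as a simplifying assumption `${name}` references are checked only for syntax, not against the groups the pattern actually defines.

```java
import java.util.regex.Pattern;

class ReplacementCheck {
    // Throws if the replacement is malformed or references a numbered group the regex lacks.
    static void check(String regex, String replacement) {
        // Pattern.compile rejects an invalid regex; groupCount() is public API.
        int groupCount = Pattern.compile(regex).matcher("").groupCount();
        int cursor = 0;
        while (cursor < replacement.length()) {
            char c = replacement.charAt(cursor++);
            if (c == '\\') {
                if (cursor == replacement.length())
                    throw new IllegalArgumentException("character to be escaped is missing");
                cursor++; // skip the escaped character
            } else if (c == '$') {
                if (cursor == replacement.length())
                    throw new IllegalArgumentException("group index is missing");
                char next = replacement.charAt(cursor);
                if (next == '{') {
                    // Assumption: named references are only checked for a non-empty name and a closing brace.
                    int close = replacement.indexOf('}', cursor);
                    if (close < 0)
                        throw new IllegalArgumentException("named group is missing trailing '}'");
                    if (close == cursor + 1)
                        throw new IllegalArgumentException("named group has 0 length name");
                    cursor = close + 1;
                } else {
                    int refNum = next - '0';
                    if (refNum < 0 || refNum > 9)
                        throw new IllegalArgumentException("Illegal group reference");
                    cursor++;
                    // Take further digits only while the number is still a legal group.
                    while (cursor < replacement.length()) {
                        int digit = replacement.charAt(cursor) - '0';
                        if (digit < 0 || digit > 9 || refNum * 10 + digit > groupCount) break;
                        refNum = refNum * 10 + digit;
                        cursor++;
                    }
                    if (refNum > groupCount)
                        throw new IndexOutOfBoundsException("No group " + refNum);
                }
            }
        }
    }
}
```

For example, `ReplacementCheck.check("(a)(b)", "$2-$1")` returns normally, while `ReplacementCheck.check("test", "$1")` throws `IndexOutOfBoundsException` because the pattern has no capturing groups. Validating group names as well would still need access to the pattern's name-to-index map.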

+1

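I'm not sure the engine checks the replacement string at all when there is no match, though I may be wrong. That said, one kind of replacement-time error is a backreference to a capture group that the regex does not define. – sln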

+0

Use the Apache StringUtils.isNotNull method to check for null before replacing. – amitmah

+0

@sln You're right. `isValidReplacementString("test", "$")` returns `true` because no match is found. So my method doesn't even work properly. – xehpuk

Answers

1

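I'd say your best bet is to replicate the process implemented in Matcher.appendReplacement(), removing any logic that involves the source string or the result string. This inevitably means you won't be able to perform certain validations, such as validating group names and indices, but you should be able to apply most of them.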

+0

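I've taken the body of `appendReplacement` and made it runnable (edited into the question). Contrary to your expectation, it is now stricter and also throws when no match is found. That's probably for the best. – xehpuk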

+0

@xehpuk You're right, I confused the source string with the regex. – shmosel