2012-03-29 93 views
2

Behavior of the | symbol in a regular expression

This is my string:

String s = "asadsdas357902||190||RUE RACHELLE||ST|||LES CÈDRES|J7T1J9|QC"; 

I split it like this:

String a[] = s.split(s, i); 

Output: the array for i = 0

 | | 1 9 0 | | R U E  R A C H E L L E | | S T | | | L E S  C È D R E S | J 7 T 1 J 9 | Q C 

The first two indexes are empty, and after that each index holds a single character.

For i = 1, the output is the entire original string:

asadsdas357902||190||RUE RACHELLE||ST|||LES CÈDRES|J7T1J9|QC 

For i = 2, the output is

||190||RUE RACHELLE||ST|||LES CÈDRES|J7T1J9|QC 

The first index of the array is empty, and the second contains the substring starting at the first | symbol.

For i = 3, the output is

 ||190||RUE RACHELLE||ST|||LES CÈDRES|J7T1J9|QC 

The first two indexes are empty, and the last index holds the same substring as for i = 2.

For i = 4, the output is

 | |190||RUE RACHELLE||ST|||LES CÈDRES|J7T1J9|QC 

The first two indexes are empty, the next one contains a pipe, and the last one holds the rest.

For i = 5, the output is

 | | 190||RUE RACHELLE||ST|||LES CÈDRES|J7T1J9|QC  

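The first two are empty, the next two are pipe characters, and the last one holds the rest.

As the value of i increases, the output is

the first two indexes are empty 
every following index except the last contains one character 
the last index contains the remaining string 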

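My questions are:

  1. Why is the first word, before the first pipe symbol, not taken into account?
  2. Why are the first two indexes empty for every value of i except 1?
  3. The pattern here is the same string as the input, so what is actually being matched, and how does this output come about?

One more thing: if I replace the | symbol with any other symbol such as @ or ! or %, the output is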

the array length is 2 and both indexes hold empty strings. This is for i >= 2 

For i = 0

the array length is also 0 

For i = 1

the array length is 1 containing the whole string. 

Is it treating the | symbol as a special regex symbol?

Any help is appreciated.

+2

's.split(s, i)' - why are you splitting the string by itself? That is really strange. – porges 2012-03-29 06:20:58

+0

Yes, it is unusual, but I am learning regular expressions and was just experimenting, and now the behavior confuses me. – 2012-03-29 06:26:33

Answers

4

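The split method takes a regular expression as its first argument. In your case that regular expression is asadsdas357902||190||RUE RACHELLE||ST|||LES CÈDRES|J7T1J9|QC. The second argument, i, is the limit: for a positive i the pattern is applied at most i - 1 times, so the array has at most i elements; for i = 0 it is applied as many times as possible and trailing empty strings are discarded. This is your regular expression, annotated: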

      // Match either the regular expression below (attempting the next alternative only if this one fails) 
    "asadsdas357902" +  // Match the characters “asadsdas357902” literally 
"|" +     // Or match regular expression number 2 below (attempting the next alternative only if this one fails) 
    "|" +     // Empty alternative effectively truncates the regex at this point because it will always find a zero-width match 
         // Or match regular expression number 3 below (attempting the next alternative only if this one fails) 
    "190" +     // Match the characters “190” literally 
"|" +     // Or match regular expression number 4 below (attempting the next alternative only if this one fails) 
    "|" +     // Empty alternative effectively truncates the regex at this point because it will always find a zero-width match 
         // Or match regular expression number 5 below (attempting the next alternative only if this one fails) 
    "RUE\\ RACHELLE" +  // Match the characters “RUE RACHELLE” literally 
"|" +     // Or match regular expression number 6 below (attempting the next alternative only if this one fails) 
    "|" +     // Empty alternative effectively truncates the regex at this point because it will always find a zero-width match 
         // Or match regular expression number 7 below (attempting the next alternative only if this one fails) 
    "ST" +     // Match the characters “ST” literally 
"|" +     // Or match regular expression number 8 below (attempting the next alternative only if this one fails) 
    "|" +     // Empty alternative effectively truncates the regex at this point because it will always find a zero-width match 
         // Or match regular expression number 9 below (attempting the next alternative only if this one fails) 
    "|" +     // Empty alternative effectively truncates the regex at this point because it will always find a zero-width match 
         // Or match regular expression number 10 below (attempting the next alternative only if this one fails) 
    "LES\\ CÈDRES" +   // Match the characters “LES CÈDRES” literally 
"|" +     // Or match regular expression number 11 below (attempting the next alternative only if this one fails) 
    "J7T1J9" +    // Match the characters “J7T1J9” literally 
"|" +     // Or match regular expression number 12 below (the entire match attempt fails if this one fails to match) 
    "QC"      // Match the characters “QC” literally 

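So your regular expression is effectively equivalent to asadsdas357902|, because nothing after that point is ever tested. See the documentation of String#split for how the split method works.

This code will give you the same output: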

import java.util.Arrays;

private static void splitWithPipe() { 
    String s = "asadsdas357902||190||RUE RACHELLE||ST|||LES CÈDRES|J7T1J9|QC"; 
    // Try limits 0 through 9 with only the part of the regex that is ever used.
    for (int i = 0; i < 10; i++) { 
        String[] a = s.split("asadsdas357902|", i); 
        System.out.println(Arrays.toString(a)); 
    } 
} 
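To see why the first two entries are empty and the rest are single characters, it can help to list where the pattern actually matches. The following is a minimal sketch, not part of the original answer; the class name MatchPositions and the helper firstMatches are made up for illustration. It uses the string from the question as both pattern and input.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class MatchPositions {
    // Hypothetical helper: returns "start-end" for the first `max` matches of `regex` in `input`.
    static List<String> firstMatches(String regex, String input, int max) {
        List<String> spans = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(input);
        while (spans.size() < max && m.find()) {
            spans.add(m.start() + "-" + m.end());
        }
        return spans;
    }

    public static void main(String[] args) {
        // The question's string; È is written as a Unicode escape so the source stays ASCII.
        String s = "asadsdas357902||190||RUE RACHELLE||ST|||LES C\u00C8DRES|J7T1J9|QC";
        // The string is used as its own pattern, as in the question.
        System.out.println(firstMatches(s, s, 4)); // prints [0-14, 14-14, 15-15, 16-16]
    }
}
```

The first match consumes the leading word, which is why that word never shows up in the result and the first array entry is empty. Every later match is zero-width (the empty alternative), so the second entry is empty as well and split then cuts between each pair of characters.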
0

There is no need for a double | symbol in a regular expression; it only makes things confusing. You might want to visit a site with regular expression tutorials and use a regular expression tester.

+0

How would a "double |" be used in a regular expression? – stema 2012-03-29 07:11:20

+0

@stema - My mistake, please see the edit. I took a phone call halfway through typing and lost my train of thought. – 2012-03-29 10:26:19

2

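| is indeed a special character in a regular expression. It means "either the thing on my left or the thing on my right", so ab|cd matches ab or cd. The scope of the alternation can be restricted further with parentheses.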
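As a small illustration of alternation and of how parentheses narrow it, here is a sketch using Pattern.matches, which requires the whole input to match. The class name AlternationDemo and the helper fullMatch are hypothetical names chosen for this example.

```java
import java.util.regex.Pattern;

class AlternationDemo {
    // Hypothetical helper: true when the entire input matches the regex.
    static boolean fullMatch(String regex, String input) {
        return Pattern.matches(regex, input);
    }

    public static void main(String[] args) {
        System.out.println(fullMatch("ab|cd", "ab"));     // true: left alternative
        System.out.println(fullMatch("ab|cd", "cd"));     // true: right alternative
        System.out.println(fullMatch("ab|cd", "abcd"));   // false: neither alternative covers the whole input
        System.out.println(fullMatch("a(b|c)d", "abd"));  // true: the choice is limited to b or c
        System.out.println(fullMatch("a(b|c)d", "ab"));   // false: the trailing d is required
    }
}
```

Without parentheses the | splits the entire pattern into alternatives; with them, only the grouped part is a choice.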

If you want to split on a literal |, the regular expression you need is \|, which in Java has to be written as the string "\\|".
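A minimal sketch of splitting on a literal pipe follows. The class name LiteralPipeSplit and both helper methods are made up for illustration; Pattern.quote is shown as an alternative to manual escaping when the delimiter is an arbitrary literal string.

```java
import java.util.Arrays;
import java.util.regex.Pattern;

class LiteralPipeSplit {
    // Splits on a single literal pipe by escaping it in the regex.
    static String[] splitOnPipe(String input) {
        return input.split("\\|");
    }

    // Splits on any literal delimiter; Pattern.quote removes the special meaning of its characters.
    static String[] splitQuoted(String input, String delimiter) {
        return input.split(Pattern.quote(delimiter));
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(splitOnPipe("a||b|c")));       // prints [a, , b, c]
        System.out.println(Arrays.toString(splitQuoted("a||b|c", "||"))); // prints [a, b|c]

        // The question's string; È is written as a Unicode escape so the source stays ASCII.
        String s = "asadsdas357902||190||RUE RACHELLE||ST|||LES C\u00C8DRES|J7T1J9|QC";
        System.out.println(Arrays.toString(splitOnPipe(s)));
    }
}
```

Adjacent pipes produce empty strings between them, so fields that are empty in the input stay visible as empty array entries.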