Analyzing a more complicated regex

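Someone gave me a fantastic answer to my question (described in the link above), but I never managed to fully understand it. Can someone help me? The regex I was given is this: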

"(?s)(?=(([^\"]+\"){2})*[^\"]*$)\\s+"

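I can understand some of the basics, but there are parts of this expression that I could not find even after thorough googling, such as the question mark before the s at the start, or how exactly the second parenthesis works with the question mark and the equals sign at its beginning. Is it also possible to expand it and make it work with other types of quotes, such as “ ”, for example?

Any help is really appreciated.

Source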

2013-05-27 Fr0stBit

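Since it checks for an even number of quotes ahead, if the quotes in the input are not properly paired, the result is that the first quote will not take effect, and the pairing will start from the second quote. – nhahtdh

Possible duplicate of [String split in java using advanced regex](http://stackoverflow.com/questions/16655796/string-split-in-java-using-advanced-regex) – Anirudha

"(?s)(?=(([^\"]+\"){2})*[^\"]*$)\\s+" explained: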

(?s) # This equals a DOTALL flag in regex, which allows the `.` to match newline characters. As far as I can tell from your regex, it's superfluous.
(?=  # Start of a lookahead. It checks ahead in the regex, but matches "an empty string"(1); read more about that [here][1]
(([^\"]+\"){2})* # This group is repeated any number of times, including none. I will explain the content in more detail.
    ([^\"]+\") # This is looking for one or more occurrences of a character that is not `"`, followed by a `"`.
    {2}   # Repeat 2 times. When combined with the previous group, it is looking for 2 occurrences of text followed by a quote. In effect, this means it is looking for an even number of `"`.
[^\"]* # Matches zero or more characters that are not a double quote. This means literally _any_ other character, including newline characters, even without the DOTALL flag.
$  # The lookahead actually inspects until end of string.
)  # End of lookahead
\\s+ # Matches one or more whitespace characters, including spaces, tabs and so on

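That is, with the complicated group in there being repeated twice, it will match the whitespace in this string that is not between two ":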

text that has a "string in it".

When used with String.split, it splits the string into: [text, that, has, a, "string in it".]
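The split described above can be tried directly. The following is a minimal sketch, not part of the original answer; the class and method names (`QuoteAwareSplitDemo`, `splitOutsideQuotes`) are made up for illustration, and only `String.split` and the regex itself come from the thread.

```java
import java.util.Arrays;

class QuoteAwareSplitDemo {
    // The regex from the question, written as a Java string literal.
    static final String OUTSIDE_QUOTES_WHITESPACE =
            "(?s)(?=(([^\"]+\"){2})*[^\"]*$)\\s+";

    // Hypothetical helper: splits on whitespace that has an even
    // number of double quotes between it and the end of the input.
    static String[] splitOutsideQuotes(String input) {
        return input.split(OUTSIDE_QUOTES_WHITESPACE);
    }

    public static void main(String[] args) {
        String input = "text that has a \"string in it\".";
        System.out.println(Arrays.toString(splitOutsideQuotes(input)));
        // prints [text, that, has, a, "string in it".]
    }
}
```

The two spaces inside the quoted part each have exactly one `"` ahead of them, so the lookahead fails there and they survive the split.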

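It only matches whitespace that has an even number of " ahead of it, so in the following only the whitespace after the unpaired quote is matched (the whitespace before it has an odd number of " ahead):

text that nearly has a "string in it.

This splits the string into [text that nearly has a "string, in, it.]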
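The unpaired-quote case is easy to get wrong by eye, so it may be worth running. This is a small sketch with a hypothetical class name (`UnbalancedQuoteDemo`). Tracing the regex by hand, every space before the lone `"` has one quote ahead of it (odd, so the lookahead fails), while every space after it has zero quotes ahead (even, so it matches).

```java
import java.util.Arrays;

class UnbalancedQuoteDemo {
    // Same regex as in the question.
    static final String REGEX = "(?s)(?=(([^\"]+\"){2})*[^\"]*$)\\s+";

    static String[] split(String input) {
        return input.split(REGEX);
    }

    public static void main(String[] args) {
        String input = "text that nearly has a \"string in it.";
        System.out.println(Arrays.toString(split(input)));
        // prints [text that nearly has a "string, in, it.]
    }
}
```

This agrees with nhahtdh's comment on the question: with improperly paired quotes, the pairing is effectively counted from the end of the input.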

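(1) When I say that the lookahead matches "an empty string", I mean that it does not actually capture or consume anything; it only looks ahead from the current point in the regex and checks a condition. The actual matching is done by the \\s+ that follows the lookahead.

Source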

2013-05-27 11:05:00 melwil

Exactly what I was looking for! Thanks for the explanation! – Fr0stBit

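The (?s) part is an embedded flag expression that enables DOTALL mode, which means the following:

In dotall mode, the expression . matches any character, including a line terminator. By default this expression does not match line terminators.

The (?=expr) is a lookahead expression. This means that the regex checks whether expr matches, but then goes back to the same point before continuing with the rest of the evaluation.

In this case, it means that the regex matches any \\s+ occurrence that is followed by any even number of ", then followed by non-" characters until the end ($). In other words, it checks that there is an even number of " ahead.

It can definitely be extended to other quotes. The only problem is the ([^\"]+\"){2} part, which would probably have to use a backreference (\n) instead of {2}.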
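One way the extension to other quotes might look, for the “ ” pair from the question, is sketched below. This is an assumption of mine rather than the backreference approach hinted at above: because the opening and closing characters differ, the lookahead can simply require zero or more complete “...” pairs ahead. The class and method names are hypothetical, and nested or unbalanced curly quotes are not handled.

```java
import java.util.Arrays;

class CurlyQuoteSplitDemo {
    static final String OPEN = "\u201C";   // left double quotation mark
    static final String CLOSE = "\u201D";  // right double quotation mark

    // Zero or more characters that are neither kind of curly quote.
    static final String NOT_QUOTE = "[^" + OPEN + CLOSE + "]*";

    // Whitespace followed only by complete open/close pairs until the end.
    static final String REGEX =
            "\\s+(?=(?:" + NOT_QUOTE + OPEN + NOT_QUOTE + CLOSE + ")*"
            + NOT_QUOTE + "$)";

    // Hypothetical helper: splits on whitespace outside curly-quoted text.
    static String[] splitOutsideCurlyQuotes(String input) {
        return input.split(REGEX);
    }

    public static void main(String[] args) {
        String input = "say " + OPEN + "hello there" + CLOSE + " now";
        System.out.println(Arrays.toString(splitOutsideCurlyQuotes(input)));
    }
}
```

For that input the parts should be `say`, the quoted `hello there` with its curly quotes, and `now`. For a quote character that is the same on both sides, such as `'`, replacing every `"` in the original regex with that character should be enough.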

Source

2013-05-27 10:56:08 Keppil

There is also no need for `(?s)` here.. – Anirudha

It's quite simple..

Concept

It splits at \s+ whenever there is an even number of " ahead.

For example:

Hello hi "Hi World"
     ^  ^   ^
     |  |   |->will not split here since there is an odd number of " ahead
     ----
      |
      |->split here because there is an even number of " ahead

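Grammar

\s matches \n or \r or space or \t (in Java it also matches form feed and vertical tab)

+ is a quantifier that matches the preceding character or group one or more times

[^\"] matches anything except "

(x){2} matches x 2 times

a(?=bc) matches a if a is followed by bc

(?=ab)a first checks for ab from the current position and then returns to that position. It then matches a.

(?=ab)c does not match c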
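The lookahead rules above can be checked with a few lines of Java. This is only an illustrative sketch; `LookaheadDemo` and `firstMatch` are hypothetical names, while `Pattern`, `Matcher.find` and `Matcher.group` are the standard `java.util.regex` calls.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class LookaheadDemo {
    // Returns the text consumed by the first match, or null if there is none.
    static String firstMatch(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        // Only "a" is consumed; "bc" is checked but not part of the match.
        System.out.println(firstMatch("a(?=bc)", "abc")); // a
        // The lookahead sees "ab", steps back, and then "a" is matched.
        System.out.println(firstMatch("(?=ab)a", "abc")); // a
        // Wherever "ab" is ahead, the next character is "a", never "c".
        System.out.println(firstMatch("(?=ab)c", "abc")); // null
        // No "bc" after the "a", so the lookahead fails.
        System.out.println(firstMatch("a(?=bc)", "abd")); // null
    }
}
```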

With (?s) (single-line mode), . would match newlines. So in this case there is no need for (?s), because the regex contains no .

I would use
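To see why the flag makes no difference here, the sketch below first shows what (?s) changes for `.`, and then splits the same multi-line input with and without the flag. The class name `DotallDemo` and the sample input are made up for illustration.

```java
import java.util.Arrays;

class DotallDemo {
    static final String WITH_FLAG = "(?s)(?=(([^\"]+\"){2})*[^\"]*$)\\s+";
    static final String WITHOUT_FLAG = "(?=(([^\"]+\"){2})*[^\"]*$)\\s+";

    public static void main(String[] args) {
        // (?s) only affects the dot.
        System.out.println("a\nb".matches("a.b"));     // false
        System.out.println("a\nb".matches("(?s)a.b")); // true

        // A negated class such as [^"] matches newlines regardless of the
        // flag, so both variants split this input the same way.
        String input = "one \"two\nthree\" four\nfive";
        String[] withFlag = input.split(WITH_FLAG);
        String[] withoutFlag = input.split(WITHOUT_FLAG);
        System.out.println(Arrays.equals(withFlag, withoutFlag)); // true
        System.out.println(withFlag.length);                      // 4
    }
}
```

The four parts are `one`, the quoted `two`/`three` text with its newline intact, `four`, and `five`.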

\s+(?=([^"]*"[^"]*")*[^"]*$)
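As a Java string literal, the regex above needs its backslashes and quotes escaped. The sketch below (hypothetical class name `SimplerRegexDemo`) compares it with the regex from the question. One difference I noticed while tracing both, which the thread itself does not mention: the original uses `[^\"]+`, which needs at least one character before each quote, so an empty pair `""` seems to trip it up, while the `[^"]*` variant copes with it.

```java
import java.util.Arrays;

class SimplerRegexDemo {
    // The regex from the question.
    static final String ORIGINAL = "(?s)(?=(([^\"]+\"){2})*[^\"]*$)\\s+";
    // The suggested regex, escaped for a Java string literal.
    static final String SIMPLER = "\\s+(?=([^\"]*\"[^\"]*\")*[^\"]*$)";

    public static void main(String[] args) {
        String quoted = "text that has a \"string in it\".";
        System.out.println(Arrays.toString(quoted.split(SIMPLER)));
        // prints [text, that, has, a, "string in it".]

        String emptyPair = "a \"\" b";
        System.out.println(Arrays.toString(emptyPair.split(ORIGINAL)));
        // prints [a "", b]
        System.out.println(Arrays.toString(emptyPair.split(SIMPLER)));
        // prints [a, "", b]
    }
}
```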

Source

2013-05-27 10:56:19 Anirudha
