Python split

I have the following string:

(some text) or ((other text) and (some more text)) and (still more text)

I would like a Python regular expression that splits it into:

['(some text)', '((other text) and (some more text))', '(still more text)']

I have tried this, but it doesn't work:

import re

haystack = "(some text) or ((other text) and (some more text)) and (still more text)"
re.split(r'(or|and)(?![^(]*.\))', haystack)  # doesn't work

Any help is appreciated.

Source

2017-08-01 Sid Kwakkel

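Regular expressions do not handle arbitrarily nested content well. Beyond the example you have shown us, there could be more levels of nested parentheses. For that case, a parser will probably get you further than a regular expression.

This may help: https://stackoverflow.com/questions/26633452/how-to-split-by-commas-that-are-not-within-parentheses

This may also be useful: https://stackoverflow.com/questions/4284991/parsing-nested-parentheses-in-python-grab-content-by-level – perigon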
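As a rough illustration of the parser idea suggested in the comments above, here is a minimal sketch that needs no regular expression at all: it scans the string once and tracks the parenthesis depth. The function name top_level_groups is made up for this example, and the sketch assumes parentheses never appear inside quoted text.

```python
def top_level_groups(text):
    """Return each outermost parenthesised group in text, in order."""
    groups = []
    depth = 0
    start = None
    for i, ch in enumerate(text):
        if ch == '(':
            if depth == 0:
                start = i  # an outermost group opens here
            depth += 1
        elif ch == ')':
            if depth == 0:
                raise ValueError(f"unmatched ')' at index {i}")
            depth -= 1
            if depth == 0:
                # the outermost group just closed; slice it from the original text
                groups.append(text[start:i + 1])
    if depth != 0:
        raise ValueError("unmatched '('")
    return groups


haystack = "(some text) or ((other text) and (some more text)) and (still more text)"
print(top_level_groups(haystack))
# ['(some text)', '((other text) and (some more text))', '(still more text)']
```

Because the groups are sliced straight out of the input, the original spacing is preserved, and the nesting depth is not limited.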

I would use re.findall instead of re.split. Also note that this only works for parentheses up to a nesting depth of 2.

>>> import re
>>> s = '(some text) or ((other text) and (some more text)) and (still more text)'
>>> re.findall(r'\((?:\((?:\([^()]*\)|[^()]*)*\)|[^()])*\)', s)
['(some text)', '((other text) and (some more text))', '(still more text)']
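To make the depth limit concrete, here is a small sketch that builds the same kind of pattern for a chosen maximum depth, instead of writing the nesting out by hand. The helper name bounded_paren_pattern is hypothetical, and "depth" here counts levels of parentheses including the outermost pair, which is a convention chosen for this example.

```python
import re


def bounded_paren_pattern(max_depth):
    """Build a regex matching balanced parentheses nested at most max_depth levels."""
    if max_depth < 1:
        raise ValueError("max_depth must be at least 1")
    # Level 1: a pair of parentheses with no parentheses inside.
    pattern = r'\([^()]*\)'
    # Each further level allows the previous pattern to appear inside a new pair.
    for _ in range(max_depth - 1):
        pattern = r'\((?:[^()]|' + pattern + r')*\)'
    return pattern


s = '(some text) or ((other text) and (some more text)) and (still more text)'

print(re.findall(bounded_paren_pattern(2), s))
# ['(some text)', '((other text) and (some more text))', '(still more text)']

# With too small a limit, the nested group is broken into its inner parts.
print(re.findall(bounded_paren_pattern(1), s))
# ['(some text)', '(other text)', '(some more text)', '(still more text)']
```

The pattern grows with every extra level, so this only helps when a sensible upper bound on the nesting is known in advance.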

Source

2017-08-01 05:51:28

Yes, I have added a note.

I tried to simplify my string, and it backfired. Your solution doesn't work for my real string... (substringof('needle',name)) or ((role eq 'needle') and (substringof('needle',email))) or (job eq 'needle') or (office eq 'needle')

@user1571934 Please provide the exact string.

You can try this: re.split('[A-F]+', '0a3B9', flags=re.IGNORECASE)
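For reference, a short sketch of what that call returns, and of what happens when the same flat-delimiter approach is applied to the string from the question. The second pattern, which splits on a standalone or/and surrounded by whitespace, is an assumption made for this example and is not taken from the answer.

```python
import re

# Splits on runs of the letters a-f, ignoring case.
print(re.split('[A-F]+', '0a3B9', flags=re.IGNORECASE))
# ['0', '3', '9']

haystack = "(some text) or ((other text) and (some more text)) and (still more text)"

# A flat split knows nothing about nesting, so it also cuts inside the nested group.
print(re.split(r'\s+(?:or|and)\s+', haystack))
# ['(some text)', '((other text)', '(some more text))', '(still more text)']
```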

Source

2017-08-01 05:47:42

This solution works for arbitrarily nested parentheses, which a single regular expression cannot handle (s is the original string):

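from pyparsing import nestedExpr

s = "(some text) or ((other text) and (some more text)) and (still more text)"

def lst_to_parens(elt):
    # Rebuild a parenthesised string from a nested list of tokens.
    if isinstance(elt, list):
        return '(' + ' '.join(lst_to_parens(e) for e in elt) + ')'
    else:
        return elt

# Wrap s in one extra pair of parentheses so the whole string parses as one nested expression.
split = nestedExpr('(', ')').parseString('(' + s + ')').asList()
# Keep only the top-level groups; bare tokens such as 'or' and 'and' are dropped.
split_lists = [elt for elt in split[0] if isinstance(elt, list)]
print([lst_to_parens(elt) for elt in split_lists])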

Output:

['(some text)', '((other text) and (some more text))', '(still more text)']

For the OP's real test case:

s = "(substringof('needle',name)) or ((role eq 'needle') and (substringof('needle',email))) or (job eq 'needle') or (office eq 'needle')"

Output:

["(substringof ('needle' ,name))", "((role eq 'needle') and (substringof ('needle' ,email)))", "(job eq 'needle')", "(office eq 'needle')"]

Source

2017-08-01 05:56:59 perigon

You can also try this:

import re 
s = '(some text) or ((other text) and (some more text)) and (still more text)' 
find_string = re.findall(r'[(]{2}[a-z\s()]*[)]{2}|[(][a-z\s]*[)]', s) 
print(find_string)

Output:

['(some text)', '((other text) and (some more text))', '(still more text)']

Edit:

find_string = re.findall(r'[(\s]{2}[a-z\s()]*[)\s]{2}|[(][a-z\s]*[)]', s)

Source

2017-08-01 06:02:17

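This is not the right way to match parentheses. What if there is any text between the two opening parentheses?

@AvinashRaj, could you give me a sample string? Thanks.

Check your regex with this string: '(some text) or ((other text) and (some more text)) and (more text)'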
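To illustrate the objection raised in these comments, here is a small sketch that runs the first pattern from this answer against two inputs. Both test strings are made up for this example: one puts text between the two opening parentheses, the other contains two separate nested groups.

```python
import re

# The first pattern from the answer above, unchanged.
pattern = r'[(]{2}[a-z\s()]*[)]{2}|[(][a-z\s]*[)]'

# Hypothetical input with text between the two opening parentheses.
with_text = '(some text) or (foo (other text) and (some more text)) and (still more text)'
print(re.findall(pattern, with_text))
# ['(some text)', '(other text)', '(some more text)', '(still more text)']

# Hypothetical input with two nested groups: the greedy character class,
# which also accepts parentheses, swallows both groups as a single match.
two_nested = '((a) and (b)) or ((c) and (d))'
print(re.findall(pattern, two_nested))
# ['((a) and (b)) or ((c) and (d))']
```

In the first case the group starting with "(foo" is never returned as a whole, and in the second the two groups are not separated, because the pattern matches characters without actually tracking which parentheses pair up.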
