This solution works for arbitrarily nested parentheses, which a regular expression cannot handle (s is the original string):
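from pyparsing import nestedExpr

def lst_to_parens(elt):
    # Rebuild a parenthesized string from a nested list of tokens.
    if isinstance(elt, list):
        return '(' + ' '.join(lst_to_parens(e) for e in elt) + ')'
    else:
        return elt

# Wrap s in an outer pair of parentheses so the whole string parses as one nested expression.
split = nestedExpr('(', ')').parseString('(' + s + ')').asList()
# Keep only the top-level parenthesized groups; bare tokens such as 'or' are dropped.
split_lists = [elt for elt in split[0] if isinstance(elt, list)]
print([lst_to_parens(elt) for elt in split_lists])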
Output:
['(some text)', '((other text) and (some more text))', '(still more text)']
For the OP's real test case:
s = "(substringof('needle',name)) or ((role eq 'needle') and (substringof('needle',email))) or (job eq 'needle') or (office eq 'needle')"
Output:
["(substringof ('needle' ,name))", "((role eq 'needle') and (substringof ('needle' ,email)))", "(job eq 'needle')", "(office eq 'needle')"]
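If installing pyparsing is not an option, the same top-level grouping can be approximated with a plain depth counter. This is only a minimal sketch, not the approach used above: the function name top_level_groups is made up for illustration, and it assumes that parentheses never appear inside quoted strings (a case nestedExpr handles by default but this sketch does not).

```python
def top_level_groups(text):
    """Return every top-level parenthesized group in text, parentheses included."""
    groups = []
    depth = 0      # current nesting depth
    start = None   # index where the current top-level group opened
    for i, ch in enumerate(text):
        if ch == '(':
            if depth == 0:
                start = i  # a new top-level group opens here
            depth += 1
        elif ch == ')':
            if depth == 0:
                raise ValueError(f"unbalanced ')' at index {i}")
            depth -= 1
            if depth == 0:
                # the top-level group just closed; keep it verbatim
                groups.append(text[start:i + 1])
    if depth != 0:
        raise ValueError("unbalanced '(' in input")
    return groups


s = ("(substringof('needle',name)) or ((role eq 'needle') and "
     "(substringof('needle',email))) or (job eq 'needle') or (office eq 'needle')")
print(top_level_groups(s))
```

Because the groups are sliced straight out of the input, the original spacing is preserved, so the first group comes back as "(substringof('needle',name))" rather than the re-joined "(substringof ('needle' ,name))" that the pyparsing version prints.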
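Regular expressions do not handle arbitrarily nested content well. There may be more levels of nested parentheses than in the example you showed us. In that case, a parser will probably get you further than a regular expression.
This may help: https://stackoverflow.com/questions/26633452/how-to-split-by-commas-that-are-not-within-parentheses
This may also be useful: https://stackoverflow.com/questions/4284991/parsing-nested-parentheses-in-python-grab-content-by-level – perigon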