Python: extract all substrings between tags in a string

I have a large string in the following format:

'324/;.ke5 efwef dwe,werwrf <>i want this<> ergy;'56\45,> thu ;lokr<>i want this<> htur ;''\> htur> jur'

I know I can do something along the lines of:

result= text.partition('<>')[-1].rpartition('<>')[0]

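But this only gives me the string between the first <> and the last <>. How can I iterate over the whole string and extract the content between each corresponding pair of tags?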

Source

2016-03-29 Mustard Tiger

You can use a regular expression with findall():

>>> import re 
>>> s = "324/;.ke5 efwef dwe,werwrf <>i want this<> ergy;'56\45,> thu ;lokr<>i want this<> htur ;''\> htur> jur" 
>>> re.findall(r"<>(.*?)<>", s) 
['i want this', 'i want this']

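Here (.*?) is a capturing group that matches any character (except a newline, by default) any number of times in non-greedy mode, so each match stops at the nearest closing <>.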
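
As a minimal sketch of the same idea wrapped in a reusable function: the helper name extract_between and its tag parameter are hypothetical and not part of the answer above. It assumes the opening and closing tags are the same string, escapes the tag with re.escape so that tags containing regex metacharacters are matched literally, and passes re.DOTALL so that a match may span newlines.

```python
import re


def extract_between(text, tag="<>"):
    """Return every substring found between consecutive pairs of `tag`."""
    # re.escape makes characters such as \ or . in the tag match literally
    # instead of being interpreted as regex syntax.
    delim = re.escape(tag)
    # (.*?) is non-greedy, so each match ends at the nearest closing tag;
    # re.DOTALL lets '.' match newlines as well.
    pattern = re.compile(delim + r"(.*?)" + delim, re.DOTALL)
    return pattern.findall(text)


# Raw string, so the backslashes in the sample are kept as-is.
s = r"324/;.ke5 efwef dwe,werwrf <>i want this<> ergy;'56\45,> thu ;lokr<>i want this<> htur ;''\> htur> jur"
print(extract_between(s))  # ['i want this', 'i want this']
```

Because each match consumes both its opening and its closing tag, the text lying between two tag pairs is skipped rather than captured.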

Source

2016-03-29 21:17:53 alecxe

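Hi, I tried using your method and it worked at first, but then I tried to use it to find everything inside '\/\ /' tags and it stopped working. Do you know why? @alecxe –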

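@abcla I think this can and should be asked as a separate question. If you need help, consider posting it, and make sure to provide all the details. To close this topic, consider accepting an answer. Thanks. – alecxe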

I think string.split() is what you want:

>>> text = """'324/;.ke5 efwef dwe,werwrf <>i want this<> ergy;'56\45,> thu ;lokr<>i want this<> htur ;''\> htur> jur'""" 
>>> print(text.split('<>')[1:-1])
['i want this', " ergy;'56%,> thu ;lokr", 'i want this']

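The split() method gives you a list of strings, using the argument as the delimiter (https://docs.python.org/2/library/string.html#string.split). Then [1:-1] gives you that list without its first and last elements. Note that the result still includes the text lying between the two tag pairs.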
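
If only the text inside each tag pair is wanted, one possible variation is to keep every second piece of the split result. The sketch below assumes the opening and closing tags are identical and that tags do not nest; the helper name between_tags is hypothetical.

```python
def between_tags(text, tag="<>"):
    """Return the pieces of `text` that sit inside matched pairs of `tag`."""
    parts = text.split(tag)
    # An even number of pieces means an odd number of tags, so the final
    # piece follows an unmatched opening tag; drop it.
    if len(parts) % 2 == 0:
        parts = parts[:-1]
    # Pieces at odd indexes lie between an opening and a closing tag.
    return parts[1::2]


# Raw string, so the backslashes in the sample are kept as-is.
text = r"324/;.ke5 efwef dwe,werwrf <>i want this<> ergy;'56\45,> thu ;lokr<>i want this<> htur ;''\> htur> jur"
print(between_tags(text))  # ['i want this', 'i want this']
```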

Source

2016-03-29 21:42:23 David
