Split a dynamic list into sublists based on a special character in Python


I have a basic Python problem that I have been trying to solve for a long time, but I cannot get the correct output.

textvalues=[['1 of 2 DOCUMENTS', 'The New York Times', 'March 17, 2016 Thursday\xa0\xa0Late Edition - Final', 'Paid Notice: Deaths THORNTON, ROBERT', 'SECTION: Section A; Column 0; Classified; Pg. 19', 'LENGTH: 176 words', 'LOAD-DATE: March 17, 2016', 'Copyright 2016 The New York Times Company', '', '2 of 2 DOCUMENTS', 'The New York Times', 'March 16, 2016 Wednesday\xa0\xa0Late Edition - Final', 'Paid Notice: Deaths THORNTON, ROBERT', 'SECTION: Section B; Column 0; Classified; Pg. 16', 'LENGTH: 176 words', 'LOAD-DATE: March 16, 2016', 'Copyright 2016 The New York Times Company']]

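I need to split the list above into sublists at the "special character". The list above is only a sample; the real list is dynamic and its length varies. In every case the list needs to be split at the '' element (the empty string).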

What I have tried:

MainText = str(textvalues) 
split_index = MainText.index('',) 
l2 = MainText[:split_index] 
print(l2)

Expected output:

[['1 of 2 DOCUMENTS', 'The New York Times', 'March 17, 2016 Thursday\xa0\xa0Late Edition - Final', 'Paid Notice: Deaths THORNTON, ROBERT', 'SECTION: Section A; Column 0; Classified; Pg. 19', 'LENGTH: 176 words', 'LOAD-DATE: March 17, 2016', 'Copyright 2016 The New York Times Company'] ,['2 of 2 DOCUMENTS', 'The New York Times', 'March 16, 2016 Wednesday\xa0\xa0Late Edition - Final', 'Paid Notice: Deaths THORNTON, ROBERT', 'SECTION: Section B; Column 0; Classified; Pg. 16', 'LENGTH: 176 words', 'LOAD-DATE: March 16, 2016', 'Copyright 2016 The New York Times Company']]

Please help me solve this problem. Thanks.

Source

2016-12-07 Mho

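Check Right leg's solution. It works with some modifications. See my code in the comments on his answer. – MYGz

Check my solution and see whether it works for you. – MYGz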

import itertools 

textvalues=[['1 of 2 DOCUMENTS', 'The New York Times', 'March 17, 2016 Thursday\xa0\xa0Late Edition - Final', 'Paid Notice: Deaths THORNTON, ROBERT', 'SECTION: Section A; Column 0; Classified; Pg. 19', 'LENGTH: 176 words', 'LOAD-DATE: March 17, 2016', 'Copyright 2016 The New York Times Company', '', '2 of 2 DOCUMENTS', 'The New York Times', 'March 16, 2016 Wednesday\xa0\xa0Late Edition - Final', 'Paid Notice: Deaths THORNTON, ROBERT', 'SECTION: Section B; Column 0; Classified; Pg. 16', 'LENGTH: 176 words', 'LOAD-DATE: March 16, 2016', 'Copyright 2016 The New York Times Company']] 
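groups = []
# groupby yields runs of consecutive items that share a key; the key is True for non-empty strings.
for a, b in itertools.groupby(textvalues[0], lambda x: x != ''):
    if a:
        # Keep only the runs of non-empty strings; the '' separators are dropped.
        groups.append(list(b))
print(groups)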

Output:

[['1 of 2 DOCUMENTS', 'The New York Times', 'March 17, 2016 Thursday\xa0\xa0Late Edition - Final', 'Paid Notice: Deaths THORNTON, ROBERT', 'SECTION: Section A; Column 0; Classified; Pg. 19', 'LENGTH: 176 words', 'LOAD-DATE: March 17, 2016', 'Copyright 2016 The New York Times Company'], ['2 of 2 DOCUMENTS', 'The New York Times', 'March 16, 2016 Wednesday\xa0\xa0Late Edition - Final', 'Paid Notice: Deaths THORNTON, ROBERT', 'SECTION: Section B; Column 0; Classified; Pg. 16', 'LENGTH: 176 words', 'LOAD-DATE: March 16, 2016', 'Copyright 2016 The New York Times Company']]
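
The same grouping can also be written as a single list comprehension. This is only a sketch: the short `sample` list is made up for illustration and does not come from the question. Because `groupby` merges runs of consecutive items with the same key, leading, trailing, or repeated '' separators do not produce empty sublists.

```python
import itertools

# Hypothetical sample data, chosen to include leading, repeated and trailing separators.
sample = ['', 'a', 'b', '', '', 'c', '']

# groupby yields (key, run) pairs; keep only the runs whose key is True,
# i.e. the runs of non-empty strings.
groups = [list(run) for is_text, run in itertools.groupby(sample, lambda x: x != '') if is_text]
print(groups)  # [['a', 'b'], ['c']]
```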

Source

2016-12-07 04:03:09 MYGz

Nice solution. Very clever. Thanks for sharing it.

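Basically, you can iterate over the contents, store the strings in a buffer, and dump the buffer into the main list whenever the '' separator comes up: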

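result = list()
line = list()
for element in textvalues[0]:
    if element != '':
        # Collect the elements of the current group in the buffer.
        line.append(element)
    else:
        # Separator reached: store the buffer and start a new one.
        result.append(line)
        line = list()
# Store the last group, which is not followed by a separator.
if line:
    result.append(line)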
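
The buffer idea can be wrapped in a small generator so that it is reusable for lists of any length. This is a sketch rather than the answerer's code: the name `split_on` and its `sep` parameter are made up here, and it assumes that empty groups (from leading, repeated, or trailing separators) should be skipped.

```python
def split_on(items, sep=''):
    """Yield sublists of `items`, cut at every element equal to `sep`."""
    buffer = []
    for element in items:
        if element != sep:
            buffer.append(element)
        elif buffer:
            # Separator reached: emit the collected group and start a new one.
            yield buffer
            buffer = []
    if buffer:
        # Emit the last group, which has no separator after it.
        yield buffer


print(list(split_on(['1 of 2', 'x', '', '2 of 2', 'y'])))
# [['1 of 2', 'x'], ['2 of 2', 'y']]
```

With the data from the question it would be called as `list(split_on(textvalues[0]))`.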

Source

2016-12-07 04:43:53

Fix your solution. Check and edit your answer. `textvalues = [['asd', '', 'asd d', '', 'c as d', '', 'asd f', '', 'lskd']] result = [] line = [] for element in textvalues[0]: if element != '': line.append(element) else: result.append(line) line = [] else: result.append(line) print(result)` – MYGz

Output of the code above: `[['asd'], ['asd d'], ['c as d'], ['asd f'], ['lskd']]` – MYGz

It raises an error because there are multiple else clauses. – Mho
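
Laid out over several lines, the snippet from MYGz's comment would look roughly like this (a reconstruction, since the comment lost its line breaks). The second `else` is indented to the level of `for`, so it belongs to the loop rather than to the `if`; Python runs it once the loop finishes without a `break`. Putting both `else` clauses under the same `if` is a syntax error, which may be what the asker ran into.

```python
textvalues = [['asd', '', 'asd d', '', 'c as d', '', 'asd f', '', 'lskd']]

result = []
line = []
for element in textvalues[0]:
    if element != '':
        line.append(element)
    else:
        # Separator: store the finished group and start a new one.
        result.append(line)
        line = []
else:
    # for/else: runs after the loop ends normally, storing the last group.
    result.append(line)

print(result)
# [['asd'], ['asd d'], ['c as d'], ['asd f'], ['lskd']]
```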

textvalues=[['1 of 2 DOCUMENTS', 'The New York Times', 'March 17, 2016 Thursday\xa0\xa0Late Edition - Final', 'Paid Notice: Deaths THORNTON, ROBERT', 'SECTION: Section A; Column 0; Classified; Pg. 19', 'LENGTH: 176 words', 'LOAD-DATE: March 17, 2016', 'Copyright 2016 The New York Times Company', '', '2 of 2 DOCUMENTS', 'The New York Times', 'March 16, 2016 Wednesday\xa0\xa0Late Edition - Final', 'Paid Notice: Deaths THORNTON, ROBERT', 'SECTION: Section B; Column 0; Classified; Pg. 16', 'LENGTH: 176 words', 'LOAD-DATE: March 16, 2016', 'Copyright 2016 The New York Times Company']] 

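textvalues2 = []

# Join on a separator that is assumed never to occur in the data (a NUL character here).
# Joining on ',' would also split elements such as 'March 17, 2016 Thursday...'.
sep = '\x00'
for chunk in sep.join(textvalues[0]).split(sep + sep):
    textvalues2.append(chunk.split(sep))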

Source

2016-12-07 06:33:02 jwdasdk
