2012-06-27 35 views
0

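Splitting a list using indices

I'm struggling to split a list at certain indices. Although I can do it one piece at a time, I haven't arrived at an expression that lets me skip writing out each section by hand.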

import re 

# Creating list to split 

list = ['Leading', 'text', 'of', 'no', 'interest', '1.', 'Here', 'begins', 'section', '1', '2.', 'This', 'is', 'section', '2', '3.', 'Now', 'we', `enter code here`'have', 'section', '3'] 


# Identifying where sections begin and end 

section_ids = [i for i, item in enumerate(list) if re.search('[0-9]+\.(?![0-9])', item)] 


# Simple creation of a new list for each section, piece by piece 

section1 = list[section_ids[0]:section_ids[1]] 
section2 = list[section_ids[1]:section_ids[2]] 
section3 = list[section_ids[2]:] 


# Iterative creation of a new list for each claim - DOES NOT WORK 

for i in range(len(section_ids)): 
    if i < max(range(len(section_ids))): 
      section[i] = list[section_ids[i] : list[section_ids[i + 1]] 
    else: 
      section[i] = list[section_ids[i] : ] 
    print section[i] 

# This is what I'd like to get 

# ['1.', 'Here', 'begins', 'section', '1'] 
# ['2.', 'This', 'is', 'section', '2'] 
# ['3.', 'Now', 'we', 'have', 'section', '3'] 
+0

Did you mean to have `enter code here` in backticks on line 3? – jedwards

+0

Will your section numbers always start at 1, increase by exactly 1, and keep increasing as you go from left to right? – jedwards

+2

It's not a good idea to shadow the built-in `list`. –

Answers

0
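# Python 2 only: map(None, a, b) pairs items and pads the shorter sequence with None.
# my_list is the question's list, renamed to avoid shadowing the built-in `list`.
for i, j in map(None, section_ids, section_ids[1:]):
    print my_list[i:j]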

The itertools version will be more efficient if section_ids is large:

from itertools import izip_longest, islice 
for i,j in izip_longest(section_ids, islice(section_ids, 1, None)): 
    print my_list[i:j] 
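
For anyone on Python 3: `map(None, ...)` no longer works there, and `izip_longest` was renamed to `zip_longest`. A rough equivalent of the loop above might look like the sketch below. The hard-coded `section_ids` and the name `sections` are only assumptions for illustration; in the question the indices come from the regex search.

```python
from itertools import islice, zip_longest

my_list = ['Leading', 'text', 'of', 'no', 'interest',
           '1.', 'Here', 'begins', 'section', '1',
           '2.', 'This', 'is', 'section', '2',
           '3.', 'Now', 'we', 'have', 'section', '3']

# Assumed for illustration: positions of '1.', '2.' and '3.' in my_list.
section_ids = [5, 10, 15]

# Pair each start index with the next one; the last start is paired with
# None, and my_list[i:None] slices through to the end of the list.
sections = [my_list[i:j]
            for i, j in zip_longest(section_ids, islice(section_ids, 1, None))]

for section in sections:
    print(section)
```

The `None` padding is what makes the last slice run to the end, so no special case is needed for the final section.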
0

I was able to produce the desired output with the following code:

section=[] 
for i,v in enumerate(section_ids+[len(list)]): 
    if i==0:continue 
    section.append(list[section_ids[i-1]:v]) 
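
The same idea (append `len(...)` as an end marker, then slice between neighbouring boundaries) can probably be written without the `i==0` skip by zipping the boundary list with itself shifted by one. This is only a sketch in Python 3 syntax; the names `values` and `bounds` and the hard-coded `section_ids` are assumptions for illustration.

```python
values = ['Leading', 'text', 'of', 'no', 'interest',
          '1.', 'Here', 'begins', 'section', '1',
          '2.', 'This', 'is', 'section', '2',
          '3.', 'Now', 'we', 'have', 'section', '3']

# Assumed for illustration: positions of '1.', '2.' and '3.' in values.
section_ids = [5, 10, 15]

# Add the list length as a sentinel so the last section has an end boundary.
bounds = section_ids + [len(values)]

# zip stops at the shorter argument, so this yields (5, 10), (10, 15), (15, 21).
section = [values[start:end] for start, end in zip(bounds, bounds[1:])]

for part in section:
    print(part)
```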
0

Were you trying to achieve something like this:

>>> section = [] # list to hold sublists .... 
>>> for index, location in enumerate(section_ids): 
...  if location != section_ids[-1]: # assume its not the last one 
...   section.append(list[location:section_ids[index + 1]]) 
...  else: 
...   section.append(list[location:]) 
...  print section[-1] 
... 
['1.', 'Here', 'begins', 'section', '1'] 
['2.', 'This', 'is', 'section', '2'] 
['3.', 'Now', 'we', 'have', 'section', '3'] 
>>> 

Or:

>>> import re 
>>> from pprint import pprint 
>>> values = ['Leading', 'text', 'of', 'no', 'interest', '1.', 'Here', 'begins', 'section', '1', '2.', 'This', 'is', 'section', '2', '3.', 'Now', 'we', 'have', 'section', '3'] 
>>> section_ids = [i for i, item in enumerate(values) if re.search('[0-9]+\.(?![0-9])', item)] + [len(values)] 
>>> section = [values[location:section_ids[index + 1]] for index, location in enumerate(section_ids) if location != section_ids[-1]] 
>>> pprint(section) 
[['1.', 'Here', 'begins', 'section', '1'], 
['2.', 'This', 'is', 'section', '2'], 
['3.', 'Now', 'we', 'have', 'section', '3']]
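
If you need this in more than one place, the steps above could be wrapped in a small helper. The sketch below is written for Python 3 and makes a few assumptions: the function name `split_sections` and its parameters are made up for illustration, the pattern is the one from the question written as a raw string, and anything before the first marker is dropped.

```python
import re


def split_sections(tokens, pattern=r'[0-9]+\.(?![0-9])'):
    """Split tokens into sublists, each starting at a token matching pattern.

    Hypothetical helper: tokens before the first match are discarded, and an
    empty list is returned when nothing matches.
    """
    marker = re.compile(pattern)
    # Indices of the tokens that open a section, plus the end of the list.
    bounds = [i for i, token in enumerate(tokens) if marker.search(token)]
    bounds.append(len(tokens))
    # Slice between each pair of neighbouring boundaries.
    return [tokens[start:end] for start, end in zip(bounds, bounds[1:])]


values = ['Leading', 'text', 'of', 'no', 'interest',
          '1.', 'Here', 'begins', 'section', '1',
          '2.', 'This', 'is', 'section', '2',
          '3.', 'Now', 'we', 'have', 'section', '3']

for section in split_sections(values):
    print(section)
```

Passing the list in as an argument also sidesteps the problem of shadowing the built-in `list`.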