2016-03-19 43 views
2

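I have a list of strings, and I use a regex to remove the extra spaces from them, which works perfectly - what is the difference between the two lists?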

newrooms = re.sub(r'\s+', " ", str(newrooms)) 

The original list looks like -

[['4  11-12pm', 'MR252 (30)'], ['5  10.30-12pm', 'MR252 (30)'], ['8  10-11am', 'MR252 (30)'], ['9  11-12pm', 'MR252 (30)'], ['10  10-11am', 'MR252 (30)'], ['10  11-12pm', 'MR251 (22)'], ['12  10-11am', 'MR107 (63)'], ['12  11-12pm', 'MR252 (30)'], ['17  10-11am', 'MR252 (30)'], ['18  11-12pm', 'MR252 (30)'], ['19  10-11am', 'MR252 (30)'], ['19  11-12pm', 'MR265 (24)'], ['20  10-11am', 'CB203 (26)'], ['20  11-12pm', 'MR252 (30)'], ['27  10-11am', 'MR252 (30)'], ['28  11-12pm', 'MR252 (30)'], ['29  10-11am', 'MR252 (30)'], ['42  11-12pm', 'MR252 (30)'], ['42  2-4pm    MA ONLY', 'MR252 (30)'], ['43  10-11am', 'MR252 (30)'], ['44  10-11am', ''], ['44  11-12pm', 'MR252 (30)']] 

print newrooms[3] prints ... "['9  11-12pm', 'MR252 (30)']"

After using re.sub to remove the extra spaces, the list looks like

[['4 11-12pm', 'MR252 (30)'], ['5 10.30-12pm', 'MR252 (30)'], ['8 10-11am', 'MR252 (30)'], ['9 11-12pm', 'MR252 (30)'], ['10 10-11am', 'MR252 (30)'], ['10 11-12pm', 'MR251 (22)'], ['12 10-11am', 'MR107 (63)'], ['12 11-12pm', 'MR252 (30)'], ['17 10-11am', 'MR252 (30)'], ['18 11-12pm', 'MR252 (30)'], ['19 10-11am', 'MR252 (30)'], ['19 11-12pm', 'MR265 (24)'], ['20 10-11am', 'CB203 (26)'], ['20 11-12pm', 'MR252 (30)'], ['27 10-11am', 'MR252 (30)'], ['28 11-12pm', 'MR252 (30)'], ['29 10-11am', 'MR252 (30)'], ['42 11-12pm', 'MR252 (30)'], ['42 2-4pm MA ONLY', 'MR252 (30)'], ['43 10-11am', 'MR252 (30)'], ['44 10-11am', ''], ['44 11-12pm', 'MR252 (30)']] 

It is exactly the same (minus the extra spaces), but now

print newrooms[3] prints ... "4"

Here is all the code:

print newrooms[3] 
print newrooms 
newrooms = re.sub(r'\s+', " ", str(newrooms)) 
print newrooms[3] 
print newrooms 

Why does the list no longer behave like a list?

OK guys, I get it: I was converting my whole list to a string with str(newrooms). What I should be doing is..

print newrooms[3]
print newrooms
for obj in newrooms:
    obj[0] = re.sub(r'\s+', " ", obj[0])
print newrooms[3]
print newrooms
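The loop above only cleans the first string of each sublist (obj[0]). If the second string could also contain repeated whitespace, a minimal sketch along the same lines, written for Python 3 (an assumption, since the question uses Python 2 print statements), could update every element in place. The doubled space in 'MR252  (30)' is made up for illustration and is not in the original data.

```python
import re

# Sample data; the double space inside 'MR252  (30)' is a made-up example.
newrooms = [['4  11-12pm', 'MR252  (30)'], ['42  2-4pm    MA ONLY', 'MR252 (30)']]

# Walk every sublist and overwrite each string in place,
# so newrooms stays a list of lists the whole time.
for sublist in newrooms:
    for i, text in enumerate(sublist):
        sublist[i] = re.sub(r'\s+', " ", text)

print(newrooms[1])
```

Because nothing is rebound to the result of str(), indexing newrooms afterwards still returns sublists.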
+0

When you run into this kind of situation, use 'type' to check whether anything has changed. When you applied your regex, the list became a string. – Deusdeorum

+0

...because it **isn't a list** – jonrsharpe

+2

@Hugo More specifically, it became a string when the OP applied str(newrooms)... –

Answers

4

newrooms = re.sub(r'\s+', " ", str(newrooms)) 

newrooms, which was previously a list, becomes a string.

print newrooms[3] 

prints the 4th character of that string. Python is dynamically typed, so a variable simply takes on the type of whatever you assign to it.
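To see the type change described here, a small sketch (Python 3 syntax, which is an assumption; the list is a shortened copy of the one in the question, and as_text is a name introduced only for this example) can print the type before and after the substitution:

```python
import re

# Shortened copy of the list from the question.
newrooms = [['4  11-12pm', 'MR252 (30)'], ['5  10.30-12pm', 'MR252 (30)']]
print(type(newrooms).__name__)   # the name refers to a list here
print(newrooms[1])               # indexing yields a whole sublist

# str() produces the text representation of the list, and re.sub returns a str.
as_text = re.sub(r'\s+', " ", str(newrooms))
print(type(as_text).__name__)    # now a str
print(as_text[3])                # indexing yields one character of that text
```

The characters at indexes 0 to 2 of the text are the two opening brackets and a quote, which is why index 3 lands on the first digit.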

1

You convert the list newrooms to a string on this line:

newrooms = re.sub(r'\s+', " ", str(newrooms)) 

So it is just a string, not a list any more. What you want to do is apply the substitution to the individual elements of the list:

newrooms = [ 
    [re.sub(r'\s+', " ", elem) for elem in sublist] 
    for sublist in newrooms 
] 

This results in:

>>> newrooms[3] 
['9 11-12pm', 'MR252 (30)'] 
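If the same cleanup is needed in several places, the nested comprehension could be wrapped in a small helper. This is only a sketch in Python 3 (an assumption); the names collapse_spaces and _WHITESPACE are hypothetical and do not come from the thread.

```python
import re

# Compiled once and reused for every string (hypothetical module-level name).
_WHITESPACE = re.compile(r'\s+')

def collapse_spaces(rows):
    """Return a new list of lists with each whitespace run replaced by one space."""
    return [[_WHITESPACE.sub(" ", text) for text in row] for row in rows]

rooms = [['9  11-12pm', 'MR252 (30)'], ['44  10-11am', '']]
cleaned = collapse_spaces(rooms)

print(cleaned[0])   # cleaned copy
print(rooms[0])     # the input list is left unchanged
```

Returning a new list rather than mutating the argument keeps the original data available for comparison, at the cost of a second copy in memory.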
0

It returns an unexpected result because you convert the list to a string before doing the replacement. Try this:

import re 
newrooms = [['4  11-12pm', 'MR252 (30)'], ['5  10.30-12pm', 'MR252 (30)'], ['8  10-11am', 'MR252 (30)'], ['9  11-12pm', 'MR252 (30)'], ['10  10-11am', 'MR252 (30)'], ['10  11-12pm', 'MR251 (22)'], ['12  10-11am', 'MR107 (63)'], ['12  11-12pm', 'MR252 (30)'], ['17  10-11am', 'MR252 (30)'], ['18  11-12pm', 'MR252 (30)'], ['19  10-11am', 'MR252 (30)'], ['19  11-12pm', 'MR265 (24)'], ['20  10-11am', 'CB203 (26)'], ['20  11-12pm', 'MR252 (30)'], ['27  10-11am', 'MR252 (30)'], ['28  11-12pm', 'MR252 (30)'], ['29  10-11am', 'MR252 (30)'], ['42  11-12pm', 'MR252 (30)'], ['42  2-4pm    MA ONLY', 'MR252 (30)'], ['43  10-11am', 'MR252 (30)'], ['44  10-11am', ''], ['44  11-12pm', 'MR252 (30)']] 

newrooms = [[re.sub(r'\s+', " ", room) for room in rooms] for rooms in newrooms] 
print newrooms[3] 
1

What you want is to replace each run of repeated whitespace with a single space in every string in the list of lists.

What you are actually doing is converting the list to a string and then performing the replacement on that string.

Here is what happens - I'll use a shortened version of the original list for readability:

>>> import re 
>>> newrooms = [['4  11-12pm', 'MR252 (30)'], ['5  10.30-12pm', 'MR252 (30)']] 
>>> newrooms_str = str(newrooms) 
>>> newrooms_str 
"[['4  11-12pm', 'MR252 (30)'], ['5  10.30-12pm', 'MR252 (30)']]" 
>>> newrooms_str = re.sub(r'\s+', " ", newrooms_str) 
>>> newrooms_str 
"[['4 11-12pm', 'MR252 (30)'], ['5 10.30-12pm', 'MR252 (30)']]" 
>>> newrooms_str[3] 
'4' 

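As you can see, you are passing a string to re.sub, and it returns a string. The fourth character of that string is the character '4', which is what you get when you look at newrooms_str[3].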

To get the result you want, you need to operate on the individual strings in your list of lists:

>>> newrooms 
[['4  11-12pm', 'MR252 (30)'], ['5  10.30-12pm', 'MR252 (30)']] 
>>> newrooms = [[re.sub(r'\s+', " ", string) for string in sublist] for sublist in newrooms] 
>>> newrooms 
[['4 11-12pm', 'MR252 (30)'], ['5 10.30-12pm', 'MR252 (30)']] 
>>> newrooms[1] 
['5 10.30-12pm', 'MR252 (30)'] 
1

You can use str.join and str.split on each string in each sublist, without converting the list to a string:

l = [['4  11-12pm', 'MR252 (30)'], ['5  10.30-12pm', 'MR252 (30)'], ['8  10-11am', 'MR252 (30)'], ['9  11-12pm', 'MR252 (30)'], ['10  10-11am', 'MR252 (30)'], ['10  11-12pm', 'MR251 (22)'], ['12  10-11am', 'MR107 (63)'], ['12  11-12pm', 'MR252 (30)'], ['17  10-11am', 'MR252 (30)'], ['18  11-12pm', 'MR252 (30)'], ['19  10-11am', 'MR252 (30)'], ['19  11-12pm', 'MR265 (24)'], ['20  10-11am', 'CB203 (26)'], ['20  11-12pm', 'MR252 (30)'], ['27  10-11am', 'MR252 (30)'], ['28  11-12pm', 'MR252 (30)'], ['29  10-11am', 'MR252 (30)'], ['42  11-12pm', 'MR252 (30)'], ['42  2-4pm    MA ONLY', 'MR252 (30)'], ['43  10-11am', 'MR252 (30)'], ['44  10-11am', ''], ['44  11-12pm', 'MR252 (30)']] 

l[:] = [[" ".join(s.split()) for s in sub] for sub in l] 

from pprint import pprint as pp
pp(l)

The output will be a list:

[['4 11-12pm', 'MR252 (30)'], 
['5 10.30-12pm', 'MR252 (30)'], 
['8 10-11am', 'MR252 (30)'], 
['9 11-12pm', 'MR252 (30)'], 
['10 10-11am', 'MR252 (30)'], 
['10 11-12pm', 'MR251 (22)'], 
['12 10-11am', 'MR107 (63)'], 
['12 11-12pm', 'MR252 (30)'], 
['17 10-11am', 'MR252 (30)'], 
['18 11-12pm', 'MR252 (30)'], 
['19 10-11am', 'MR252 (30)'], 
['19 11-12pm', 'MR265 (24)'], 
['20 10-11am', 'CB203 (26)'], 
['20 11-12pm', 'MR252 (30)'], 
['27 10-11am', 'MR252 (30)'], 
['28 11-12pm', 'MR252 (30)'], 
['29 10-11am', 'MR252 (30)'], 
['42 11-12pm', 'MR252 (30)'], 
['42 2-4pm MA ONLY', 'MR252 (30)'], 
['43 10-11am', 'MR252 (30)'], 
['44 10-11am', ''], 
['44 11-12pm', 'MR252 (30)']]
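One difference between the two approaches is worth knowing: str.split() with no arguments also drops leading and trailing whitespace, while re.sub(r'\s+', " ", ...) reduces it to a single space. The sketch below (Python 3, an assumption) shows this on a made-up padded string, and also shows that the slice assignment l[:] = ... updates the existing list object instead of creating a new one. The names sample, via_regex, via_split and alias are introduced only for this example.

```python
import re

# Made-up string with leading and trailing padding, for illustration only.
sample = '  42  2-4pm    MA ONLY '

via_regex = re.sub(r'\s+', " ", sample)   # whitespace at the ends becomes one space
via_split = " ".join(sample.split())      # whitespace at the ends is removed

print(repr(via_regex))
print(repr(via_split))

# Slice assignment replaces the contents of the same list object,
# so any other reference to it sees the cleaned data.
l = [['4  11-12pm', 'MR252 (30)']]
alias = l
l[:] = [[" ".join(s.split()) for s in sub] for sub in l]
print(alias is l, alias)
```

For the data in the question there is no padding at the ends of the strings, so both approaches give the same result.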