2015-09-03 43 views

Combining multiple values in Python

I have data in the following format (a CSV file):

id, review 
1, the service was great! 
1, staff was friendly. 
2, nice location 
2, but the place was not clean 
2, the motel was okay 
3, i wouldn't stay there next time 
3, do not stay there 

I would like to change the data to the following format:

1, the service was great! staff was friendly. 
2, nice location but the place was not clean the motel was okay 
3, i wouldn't stay there next time do not stay there 

Any help would be greatly appreciated.

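What have you done so far? Do you have code that reads the file, assuming it is a real CSV file with , as the delimiter? And what is the matching criterion, given that the last line does not start with '1' but was appended to the line before it? – albert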

Take a look at 'itertools.groupby'. – Kevin

@albert I corrected the output. – kevin

Answers

You can use itertools.groupby to group consecutive entries that have the same number.

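import csv
import itertools
import operator

with open("test.csv", newline="") as f:
    # skipinitialspace drops the blank that follows each comma in the sample data
    reader = csv.reader(f, delimiter=",", skipinitialspace=True)
    next(reader)  # skip header line
    # groupby only merges consecutive rows that share the same id
    for key, group in itertools.groupby(reader, key=operator.itemgetter(0)):
        print(key, ' '.join(g[1].strip() for g in group))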

Output:

1 the service was great! staff was friendly. 
2 nice location but the place was not clean the motel was okay 
3 i wouldn't stay there next time do not stay there 

Note:

id, review 
1, the service was great! 
... 
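
Because itertools.groupby only merges rows that sit next to each other, a file whose ids are not already sorted would yield several groups for the same id. As a rough sketch of one alternative (not part of the original answer), the rows can be collected in a dictionary instead. The function name merge_reviews and the unsorted sample data are hypothetical, and re-joining extra fields with ", " is only an approximation for reviews that contain commas themselves.

```python
import csv
import io
from collections import defaultdict


def merge_reviews(lines):
    """Join all reviews that share an id, even if the rows are not adjacent."""
    reader = csv.reader(lines, skipinitialspace=True)
    next(reader)  # skip the "id, review" header
    merged = defaultdict(list)  # dicts keep insertion order in Python 3.7+
    for row in reader:
        if len(row) < 2:
            continue  # ignore blank or malformed lines
        # Re-join the remaining fields in case the review text contained a comma.
        merged[row[0]].append(", ".join(row[1:]).strip())
    return {key: " ".join(parts) for key, parts in merged.items()}


# Hypothetical, unsorted sample standing in for test.csv
sample = io.StringIO(
    "id, review\n"
    "1, the service was great!\n"
    "2, nice location\n"
    "1, staff was friendly.\n"
)
for key, text in merge_reviews(sample).items():
    print(f"{key}, {text}")
```

To run this against a real file, pass the open file object instead of the StringIO sample, for example merge_reviews(open("test.csv", newline="")).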

This is exactly what I was looking for. – kevin