2014-02-20 25 views
0

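Get values from a file when a match is found among rows with the same ID in Python

I have a file containing rows of data. Each line starts with an id, followed by a fixed set of comma-separated attributes.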

123,2,kent,..., 
123,2,bob,..., 
123,2,sarah,..., 
123,8,may,..., 

154,4,sheila,..., 
154,4,jeff,..., 

175,3,bob,..., 

249,2,jack,..., 
249,5,bob,..., 
249,3,rose,..., 

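I want to get an attribute when a condition is met. The condition: if 'bob' appears among the rows of an id, get the value of the second attribute from the rows that follow it within that id.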

For example: 

id: 123 
values returned: 2, 8 

id: 249 
values returned: 3 

In Java I could use a nested loop, but I would like to try this in Python. Any suggestions would be great.

Why is the value returned for ID 249 '3' instead of '2, 5, 3'? – aIKid

Ah wait, I see.. – aIKid

Answers

1

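I came up with a (perhaps) more Pythonic solution using groupby and dropwhile. It produces the same result as the approach further down, but I think it is prettier. :) Flags, "curr_id" and the like are not very Pythonic and should be avoided where possible!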

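import csv
from itertools import groupby, dropwhile

goal = 'bob'
ids = {}

with open('my_data.csv') as ifile:
    reader = csv.reader(ifile)
    # groupby assumes the rows for each id are contiguous in the file
    for key, rows in groupby(reader, key=lambda r: r[0]):
        # discard rows before the first match; the match itself stays at index 0
        matched_rows = list(dropwhile(lambda r: r[2] != goal, rows))
        if len(matched_rows) > 1:
            # second attribute of every row after the match
            ids[key] = [row[1] for row in matched_rows[1:]]

print(ids)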

(My first solution is below.)

from collections import defaultdict
import csv

curr_id = None
found = False
goal = 'bob'
ids = defaultdict(list)

with open('my_data.csv') as ifile:
    for row in csv.reader(ifile):
        if row[0] != curr_id:
            # a new id starts: reset the flag
            found = False
            curr_id = row[0]
        if found:
            # rows after the match: collect the second attribute
            ids[curr_id].append(row[1])
        elif row[2] == goal:
            found = True

print(dict(ids))

Output:

{'123': ['2', '8'], '249': ['3']} 
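
For anyone who wants to try the groupby/dropwhile idea without a file on disk, here is a minimal sketch that wraps it in a function. The function name `values_after_match` and its column parameters are hypothetical, the sample rows are a shortened version of the question's data (the elided attributes are left out), and the blank-line filter is an assumption in case the real file separates groups with empty lines.

```python
import csv
import io
from itertools import dropwhile, groupby


def values_after_match(lines, goal, id_col=0, value_col=1, name_col=2):
    """Map each id to the values of the rows that follow its first `goal` row."""
    # csv.reader yields an empty list for a blank line, so filter those out.
    rows = (row for row in csv.reader(lines) if row)
    result = {}
    # groupby only groups adjacent rows, so rows of one id must be contiguous.
    for key, group in groupby(rows, key=lambda r: r[id_col]):
        # Drop everything before the first match; the match stays at index 0.
        matched = list(dropwhile(lambda r: r[name_col] != goal, group))
        if len(matched) > 1:
            result[key] = [r[value_col] for r in matched[1:]]
    return result


# Shortened sample data (assumption: three columns, blank lines between groups).
sample = io.StringIO(
    "123,2,kent\n123,2,bob\n123,2,sarah\n123,8,may\n"
    "\n"
    "154,4,sheila\n154,4,jeff\n"
    "\n"
    "175,3,bob\n"
    "\n"
    "249,2,jack\n249,5,bob\n249,3,rose\n"
)
print(values_after_match(sample, "bob"))
```

Because the function accepts any iterable of lines, an open file object can be passed in place of the `StringIO` sample.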
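
+1 Ah, this is better than my answer, but it has an empty list for 175 – bernie

@bernie Thanks man, but I'm not satisfied with this solution - I don't like using flags, curr_id and such... :) –

@bernie I found another solution, if you're interested :) –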

0

Just set a flag or something as you iterate:

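name = 'bob'
target_id = '123'
found = False

# 'my_data.csv' is the file name used in the other answer
with open('my_data.csv') as f:
    for line in f:
        fields = line.split(',')
        if fields[0] == target_id:
            if found:
                # rows after the match: print the second attribute
                print(fields[1])
            elif fields[2] == name:
                # do not print the matching row itself
                found = True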
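
The same flag idea can also be sketched as a small generator, which makes it easy to check against the expected values from the question. The function name `values_after_name` and the shortened three-column rows are assumptions made for illustration only.

```python
def values_after_name(lines, target_id, name):
    """Yield the second field of each `target_id` row that comes after `name`."""
    found = False
    for line in lines:
        fields = line.strip().split(',')
        if len(fields) < 3 or fields[0] != target_id:
            continue  # blank line, short line, or a different id
        if found:
            yield fields[1]
        elif fields[2] == name:
            # The matching row itself is not reported.
            found = True


# Shortened rows from the question (elided attributes left out).
rows = ["123,2,kent", "123,2,bob", "123,2,sarah", "123,8,may",
        "249,2,jack", "249,5,bob", "249,3,rose"]
print(list(values_after_name(rows, "123", "bob")))
print(list(values_after_name(rows, "249", "bob")))
```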
0
import csv, collections as co, io

s = '''123,2,kent,..., 
123,2,bob,..., 
123,2,sarah,..., 
123,8,may,..., 
154,4,sheila,..., 
154,4,jeff,..., 
175,3,bob,..., 
249,2,jack,..., 
249,5,bob,..., 
249,3,rose,...,''' 

# io.StringIO stands in for cStringIO, which only exists in Python 2
filelikeobject = io.StringIO(s)
dd = co.defaultdict(list)
cr = csv.reader(filelikeobject)
for line in cr:
    if line[2] == 'bob':
        # touching the key creates an empty list for this id
        dd[line[0]]
        continue
    if line[0] in dd:
        # 'bob' was already seen for this id: collect the second attribute
        dd[line[0]].append(line[1])

Result (Python 3):

>>> dd
defaultdict(<class 'list'>, {'123': ['2', '8'], '175': [], '249': ['3']})
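
The result above keeps an empty list for 175, where 'bob' is the last row of the id. If only ids with at least one value are wanted, as in the question's example, one possible way is to filter the mapping afterwards. This is a small sketch; the dictionary is written out by hand here so that the snippet runs on its own.

```python
from collections import defaultdict

# The mapping from the result above, written out by hand for illustration.
dd = defaultdict(list, {'123': ['2', '8'], '175': [], '249': ['3']})

# Keep only the ids that have at least one value after the match.
result = {key: values for key, values in dd.items() if values}
print(result)
```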