Matching one set of values to another in a text file

I have a text file with this information:

1961 - Roger (Male) 
1962 - Roger (Male) 
1963 - Roger (Male) 
1963 - Jessica (Female) 
1964 - Jessica (Female) 
1965 - Jessica (Female) 
1966 - Jessica (Female)

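If I search the file for the name "Roger", I want it to print the years that correspond to that name, i.e. 1961, 1962, 1963. What is the best way to do this?

I was doing this with a dictionary, but then realized that a dictionary can't have duplicate keys, and 1963 is mentioned twice in the text file, so it didn't work.

I'm using Python 3, thanks.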
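
For illustration, the overwriting problem described above can be reproduced in a few lines. This is a minimal sketch, assuming the years were used as keys and the names as values; the variable names `lines` and `by_year` are made up for the example.

```python
# Sample lines from the file, inlined so the sketch runs on its own.
lines = [
    "1962 - Roger (Male)",
    "1963 - Roger (Male)",
    "1963 - Jessica (Female)",
]

by_year = {}
for line in lines:
    year, person = line.split(" - ")
    # A key can hold only one value, so the second 1963 line
    # replaces the first one instead of being stored next to it.
    by_year[year] = person

print(by_year)
```

Only two entries survive, and the 1963 entry for Roger is gone, which matches the behaviour described in the question.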

Source

2012-11-17 Goose

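Have you tried anything else? – martineau

Use a 'collections.defaultdict(list)' where the key is the name (and possibly the gender), and the years are appended to the corresponding value, which automatically starts out as an empty list. – martineau

Use a dictionary with the names as keys and store the years in a list: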

In [1]: with open("data1.txt") as f:
   ...:     dic = {}
   ...:     for line in f:
   ...:         spl = line.split()
   ...:         # spl[0] is the year, spl[2] is the name
   ...:         dic.setdefault(spl[2], []).append(int(spl[0]))
   ...:     for name in dic:
   ...:         print(name, dic[name])
   ...:

Roger [1961, 1962, 1963] 
Jessica [1963, 1964, 1965, 1966]

Or you can use collections.defaultdict:

In [2]: from collections import defaultdict

In [3]: with open("data1.txt") as f:
   ...:     dic = defaultdict(list)
   ...:     for line in f:
   ...:         spl = line.split()
   ...:         # Missing keys start out as an empty list automatically
   ...:         dic[spl[2]].append(int(spl[0]))
   ...:     for name in dic:
   ...:         print(name, dic[name])
   ...:
Roger [1961, 1962, 1963] 
Jessica [1963, 1964, 1965, 1966]

Source

2012-11-17 04:26:44

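You've been a great help once again, Ashwini. It works; however, some of the names in my actual file have middle names, so spl[2] won't always work. I used line.split(' - ') to get around that, but it always leaves a "\n" at the end of each line. Why? – Goose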

@Goose You can use 'strip()', or simply 'line.strip('\n').split(' - ')', to get rid of the '\n'. –
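
As a sketch of the suggestion above: iterating over a file yields each line with its trailing newline still attached, which is where the "\n" comes from. The example below assumes every line has the form `year - name (gender)`; the sample line with a middle name and the helper name `parse_line` are made up for illustration.

```python
def parse_line(line):
    """Split one 'year - name (gender)' line into (year, name, gender)."""
    # strip() drops the trailing newline and spaces; maxsplit=1 keeps
    # any later " - " inside the name part intact.
    year, rest = line.strip().split(" - ", 1)
    # rpartition splits at the last " (", so names containing spaces
    # (middle names) stay in one piece.
    name, _, gender = rest.rpartition(" (")
    return int(year), name, gender.rstrip(")")


print(parse_line("1961 - Roger (Male)\n"))
print(parse_line("1964 - Mary Ann Smith (Female)\n"))
```

Because the name is taken as everything between the first " - " and the last " (", it no longer matters how many words the name contains.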

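Why can't you use a dictionary indexed on the name (e.g. Roger) as the key, with the list of years as the value (here [1961, 1962, 1963])? Wouldn't that work for you?

At the end of the loop you get all the names uniquified, with the years as values, which is seemingly what you want.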
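
A minimal sketch of that idea, applied to the original "search for Roger" question. It assumes the file format shown in the question; the lines are inlined here instead of being read from a file, and `years_by_name` is a name chosen for the example.

```python
# Lines as they appear in the question, inlined so the sketch is self-contained.
lines = [
    "1961 - Roger (Male)",
    "1962 - Roger (Male)",
    "1963 - Roger (Male)",
    "1963 - Jessica (Female)",
    "1964 - Jessica (Female)",
    "1965 - Jessica (Female)",
    "1966 - Jessica (Female)",
]

years_by_name = {}
for line in lines:
    year, person = line.strip().split(" - ", 1)
    name = person.rpartition(" (")[0]  # drop the "(Male)" / "(Female)" suffix
    # Each name appears once as a key; its years accumulate in a list.
    years_by_name.setdefault(name, []).append(int(year))

# Look up a name; get() returns an empty list for names not in the file.
print(years_by_name.get("Roger", []))
```

Since 1963 is now a value rather than a key, it can appear in both Roger's and Jessica's lists without one entry overwriting the other.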

Source

2012-11-17 04:16:23

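I tried the dictionary approach with the years as keys and the names as values. When I searched the dictionary for values matching "Roger", it came up with 1961 and 1962 but not 1963, because Jessica shares that year as well. – Goose

Have "Roger" as the key and the years as the value. Then it will be fine. –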

You can use tuples, store them in a list, and iterate over them.

Say your list looks like this:

data = [(1961, 'Roger', 'Male'),
        (1962, 'Roger', 'Male'),
        (1963, 'Roger', 'Male'),
        (1963, 'Jessica', 'Female')]

You can run queries on it like this, unpacking each tuple for more Pythonic code:

for year, name, sex in data:
    if year >= 1962:
        print("In {}, {} was {}".format(year, name, sex))

In 1962, Roger was Male
In 1963, Roger was Male
In 1963, Jessica was Female
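
The same kind of list can also answer the original question directly. A small sketch, assuming records laid out as (year, name, sex) tuples; the list is repeated so the snippet runs on its own, and `roger_years` is a name made up for the example.

```python
data = [(1961, 'Roger', 'Male'),
        (1962, 'Roger', 'Male'),
        (1963, 'Roger', 'Male'),
        (1963, 'Jessica', 'Female')]

# Collect the year of every record whose name matches.
roger_years = [year for year, name, sex in data if name == 'Roger']
print(roger_years)
```

This scans the whole list on every query, which is fine for a small file; for repeated lookups a dictionary keyed by name avoids the rescan.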

Source

2012-11-17 04:21:19 FakeRainBrigand

You can always use a regular expression.

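import re

name = 'Roger'

# Escape the name so any regex metacharacters in it are matched literally
pattern = re.compile(r'([0-9]+) - %s' % re.escape(name))

with open('names.txt') as f:
    for line in f:
        match = pattern.search(line)
        if match:
            # group(1) is the year captured before the " - "
            print(match.group(1))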

Source

2012-11-17 04:27:33 austin

As I suggested in a comment:

from collections import defaultdict

result = defaultdict(list)
with open('data.txt', 'rt') as infile:  # avoid shadowing the built-in input()
    for line in infile:
        # Split on the first hyphen only, in case a name contains a hyphen
        year, person = [item.strip() for item in line.split('-', 1)]
        result[person].append(year)

for person, years in result.items():
    print(person, years, sep=': ')

Output:

Roger (Male): ['1961', '1962', '1963'] 
Jessica (Female): ['1963', '1964', '1965', '1966']

Source

2012-11-17 04:33:43 martineau
