Comparing lists

I have 8 tables (January, February, March, April, May, June, July, August), each containing names in list format, e.g.

['John Smith', 'Cat Stevens', 'Andrew Alexander', 'El Gordo Baba', 'Louis le Roy']

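etc.

How do I compare these lists in order and see when a name appears (i.e. subscribes) and when a name disappears (i.e. unsubscribes)?

So, say John Smith doesn't show up until February: I want that information. Say he unsubscribes in July: I need that information too (this is FAR more important).

Source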

2011-09-09 Andrew Alexander

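I think your data structure isn't suited to this. Maybe put these into a database, where you can link each name with the time it joined and the time it left? – Daenyth

Can a name have more than one subscription? If so, which one do you want? –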

data = { 
'jan': ['John Smith', 'Cat Stevens', 'Andrew Alexander', 'El Gordo Baba'], 
'feb': ['Louis le Roy', 'John Smith'], 
'mar': ['Cat Stevens', 'Louis le Roy'] 
} 

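# Python 3: the built-in zip replaces itertools.izip
keys = 'jan feb mar'.split()
for m1, m2 in zip(keys, keys[1:]):
    a = set(data[m1])
    b = set(data[m2])
    # b - a: names new in the later month; a - b: names that dropped out
    print(m1, '\n\tsubscribed:', ','.join(sorted(b - a)), '\n\tquit:', ','.join(sorted(a - b)))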

Result:

jan 
    subscribed: Louis le Roy 
    quit: Andrew Alexander,Cat Stevens,El Gordo Baba 
feb 
    subscribed: Cat Stevens 
    quit: John Smith

Source

2011-09-09 20:23:14 fabrizioM

Don't use lists; use sets instead.

You can find who (un)subscribed between jan and feb simply by taking set differences:

subs = feb - jan 
unsubs = jan - feb

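That said, you'd be better off following Daenyth's suggestion. Put this data in a database and add joined and left date fields: you will get finer granularity than whole months, and you won't need to store duplicated data.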
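A minimal sketch of that database approach, using Python's built-in sqlite3 module. The table name `subscriptions`, the column names `joined_on` and `left_on` (chosen because LEFT is an SQL keyword), and the sample dates are all assumptions made for illustration; none of them come from the question.

```python
import sqlite3

# In-memory database; table and column names are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE subscriptions ("
    " name TEXT NOT NULL,"
    " joined_on TEXT NOT NULL,"  # ISO date the subscription started
    " left_on TEXT)"             # ISO date it ended, NULL while still active
)

# Made-up sample rows: one finished subscription, one still active.
rows = [
    ("John Smith", "2011-02-01", "2011-07-01"),
    ("Cat Stevens", "2011-01-01", None),
]
conn.executemany("INSERT INTO subscriptions VALUES (?, ?, ?)", rows)

# Everyone who has unsubscribed, with the date they left.
unsubscribed = conn.execute(
    "SELECT name, left_on FROM subscriptions "
    "WHERE left_on IS NOT NULL ORDER BY left_on"
).fetchall()
print(unsubscribed)
```

With one row per subscription, a name that re-subscribes simply gets a second row, so nothing has to be inferred from monthly snapshots.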

Source

2011-09-09 18:34:20 NullUserException

data = { 
'jan': ['John Smith', 'Cat Stevens', 'Andrew Alexander', 'El Gordo Baba'], 
'feb': ['Louis le Roy', 'John Smith'], 
'mar': ['Cat Stevens', 'Louis le Roy'] 
} 

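months = ['jan', 'feb', 'mar']  # explicit order; do not rely on dict ordering
subs = {}    # name -> first month in which the name appears
unsubs = {}  # name -> latest later month in which the name is seen again
for mon in months:
    for name in data[mon]:
        if name not in subs:
            subs[name] = mon
        else:
            unsubs[name] = mon

>>> subs
{'John Smith': 'jan', 'Cat Stevens': 'jan', 'Andrew Alexander': 'jan', 'El Gordo Baba': 'jan', 'Louis le Roy': 'feb'}
>>> unsubs
{'John Smith': 'feb', 'Cat Stevens': 'mar', 'Louis le Roy': 'mar'}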

Source

2011-09-09 18:36:46

As a starter:

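from collections import defaultdict

# jan, feb, ... are the monthly name lists from the question
dd = dict(jan=(0, jan), feb=(1, feb))  # ... and so on for the remaining months

appearances = defaultdict(list)

for k, (i, li) in dd.items():
    for name in li:
        appearances[name].append((i, k))

for name in appearances:
    # sort by month index; 'month' avoids shadowing the outer 'name'
    months = [(month, i) for i, month in sorted(appearances[name])]
    print(name, months)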

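For each name you get a sorted list of (month, index) pairs for the months in which the name appears, where index is the index of the month. Now you can check for gaps and get the minimum and maximum index.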
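The gap check and the minimum/maximum index lookup could be sketched as follows. This is an illustration only: the three-month sample data is borrowed from the other answers, and the names `months`, `data`, and `summary` are assumptions, not part of the answer above.

```python
from collections import defaultdict

# Sample data; the order of `months` defines the month index.
months = ["jan", "feb", "mar"]
data = {
    "jan": ["John Smith", "Cat Stevens", "Andrew Alexander", "El Gordo Baba"],
    "feb": ["Louis le Roy", "John Smith"],
    "mar": ["Cat Stevens", "Louis le Roy"],
}

# Map each name to the indices of the months it appears in (ascending).
appearances = defaultdict(list)
for i, month in enumerate(months):
    for name in data[month]:
        appearances[name].append(i)

summary = {}
for name, idxs in appearances.items():
    first, last = idxs[0], idxs[-1]
    # A gap is a month between first and last in which the name is missing.
    gaps = [months[i] for i in range(first, last + 1) if i not in idxs]
    summary[name] = (months[first], months[last], gaps)

for name in sorted(summary):
    print(name, summary[name])
```

Under this reading, a name whose last month is not the final month in `months` unsubscribed in the month that follows, and a non-empty gap list marks someone who left and came back.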

Source

2011-09-09 18:37:46 rocksportrocker

Here is a simple example:

jan,feb,mar,apr,may,jun,jul,aug = [1],[1,2],[1,2,3],[1,2,3,4],[2,3,4],[3,4],[4],[] 
months = [set(m) for m in [jan,feb,mar,apr,may,jun,jul,aug]] 
changes = [(list(b-a), list(a-b)) for a, b in zip(months, months[1:])] 

>>> changes 
[([2], []), ([3], []), ([4], []), ([], [1]), ([], [2]), ([], [3]), ([], [4])]

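Each element of changes is the transition from one month to the next: the first item of the tuple is a list of everything that was added, and the second item is a list of everything that left.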
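To make the transitions easier to read, each one can be keyed by the month in which the change shows up. The sketch below reuses the sample data from this answer; the `names` list, the `report` dictionary, and the "joined"/"left" keys are hypothetical additions.

```python
names = ["jan", "feb", "mar", "apr", "may", "jun", "jul", "aug"]
jan, feb, mar, apr, may, jun, jul, aug = [1], [1, 2], [1, 2, 3], [1, 2, 3, 4], [2, 3, 4], [3, 4], [4], []
months = [set(m) for m in [jan, feb, mar, apr, may, jun, jul, aug]]

# Key each transition by the later month of the pair being compared.
report = {}
for i in range(1, len(months)):
    prev, cur = months[i - 1], months[i]
    report[names[i]] = {
        "joined": sorted(cur - prev),  # present now, absent the month before
        "left": sorted(prev - cur),    # present the month before, absent now
    }

for month, change in report.items():
    print(month, change)
```

Here `report["may"]` shows that item 1 is gone in May, which means it was last seen in April.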

Source

2011-09-09 18:38:47
