2014-01-21 52 views

Answers

5

The best approach (algorithm) is not to do it yourself!

>>> from collections import Counter 
>>> L=['d','f','d','c','c','f','d','f'] 
>>> Counter(L) 
Counter({'d': 3, 'f': 3, 'c': 2}) 

If you insist on a list:

>>> Counter(L).items() 
[('c', 2), ('d', 3), ('f', 3)] 
2

I think a dictionary is a good fit for this:

>>> from collections import Counter 
>>> L = ['d','f','d','c','c','f','d','f'] 
>>> Counter(L) 
Counter({'d': 3, 'f': 3, 'c': 2}) 

However, if you are adamant about a list of lists:

>>> L=['d','f','d','c','c','f','d','f'] 
>>> from collections import Counter 
>>> var = Counter(L) 
>>> [[key, value] for key, value in var.items()] 
[['c', 2], ['d', 3], ['f', 3]] 
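
For comparison, here is a rough sketch of the bookkeeping that Counter saves you from writing, using a plain dict. The function name count_items is hypothetical and only for illustration; Counter remains the simpler choice.

```python
def count_items(items):
    """Count occurrences with a plain dict and return sorted [key, count] pairs."""
    counts = {}
    for item in items:
        # dict.get supplies 0 the first time a key is seen
        counts[item] = counts.get(item, 0) + 1
    # Sort by key so the output order is deterministic
    return [[key, value] for key, value in sorted(counts.items())]


L = ['d', 'f', 'd', 'c', 'c', 'f', 'd', 'f']
print(count_items(L))  # [['c', 2], ['d', 3], ['f', 3]]
```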
1
L=['d','f','d','c','c','f','d','f'] 
from collections import Counter 
print Counter(L) 

Output

Counter({'d': 3, 'f': 3, 'c': 2}) 

You can use the Counter.most_common method to get the result as a list sorted from most to least common:

print Counter(L).most_common() 

Output

[('d', 3), ('f', 3), ('c', 2)] 
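
most_common() orders by count only, and how ties such as 'd' and 'f' are ordered depends on the Python version. As a sketch, assuming Python 3, the tie order can be made explicit by sorting on count (descending) and then on key; the variable names are illustrative:

```python
from collections import Counter

L = ['d', 'f', 'd', 'c', 'c', 'f', 'd', 'f']
counts = Counter(L)

# most_common(n) keeps only the n entries with the highest counts
top_one = counts.most_common(1)

# Explicit ordering: highest count first, ties broken alphabetically
ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
print(ranked)  # [('d', 3), ('f', 3), ('c', 2)]
```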
1

Using itertools.groupby is one possible solution to this.

Implementation

from itertools import groupby 
[[k, len(list(v))] for k, v in groupby(sorted(L))] 

Output

[['c', 2], ['d', 3], ['f', 3]] 
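
The sorted() call matters because groupby only groups consecutive equal elements. A small sketch (Python 3 assumed, variable names illustrative) showing what happens with and without sorting:

```python
from itertools import groupby

L = ['d', 'f', 'd', 'c', 'c', 'f', 'd', 'f']

# Without sorting, groupby reports runs of adjacent equal items, not totals
runs = [[k, len(list(v))] for k, v in groupby(L)]

# Sorting first places equal items next to each other, so each key appears once
grouped = [[k, len(list(v))] for k, v in groupby(sorted(L))]

print(runs)
print(grouped)  # [['c', 2], ['d', 3], ['f', 3]]
```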

Performance comparison

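# Assumes choice (random), ascii_letters (string), groupby (itertools) and Counter (collections) were imported earlier in the session
In [9]: L = [choice(ascii_letters) for _ in range(1000)] 

In [10]: %timeit [[k, len(list(v))] for k, v in groupby(sorted(L))] 
1000 loops, best of 3: 271 us per loop 

In [11]: %timeit Counter(L).items() 
1000 loops, best of 3: 306 us per loop 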
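
To repeat the comparison outside IPython, something along these lines should work with the standard timeit module. This is a sketch, not the original benchmark: the helper names with_groupby and with_counter and the loop counts are assumptions, and the timings will differ by machine and Python version.

```python
import timeit
from collections import Counter
from itertools import groupby
from random import choice
from string import ascii_letters

# Same kind of input as in the session above: 1000 random letters
L = [choice(ascii_letters) for _ in range(1000)]


def with_groupby(data):
    # Sort first, then count the length of each run of equal items
    return [[k, len(list(v))] for k, v in groupby(sorted(data))]


def with_counter(data):
    # list() forces materialisation of the items view on Python 3
    return list(Counter(data).items())


NUMBER = 200  # calls per timing run (illustrative value)
t_groupby = min(timeit.repeat(lambda: with_groupby(L), number=NUMBER, repeat=3))
t_counter = min(timeit.repeat(lambda: with_counter(L), number=NUMBER, repeat=3))

print("groupby: %.1f us per loop" % (t_groupby / NUMBER * 1e6))
print("Counter: %.1f us per loop" % (t_counter / NUMBER * 1e6))
```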

Note

It should be noted that the overhead of hashing the data in the Counter solution here exceeds the sorting cost of Timsort.

+0

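The first sample is wrong - you are measuring the string creation time. On my machine: list comprehension - 250 us; __Counter__ without __items()__ - 232 us; with __items__ - 239 us. The list comprehension is the slowest – volcano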

+0

@volcano: I don't see any string creation in the first sample :-) – Abhijit

+0

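Well, you cheated and changed your answer :). Anyway, the results seem to depend on the implementation. I have run it several times - pure __Counter__ always wins. I think the rule of thumb is: if there is a specific API that does the job for you, use it; in most cases it will be more efficient – volcano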