I have a list of 5 million string elements, stored as a pickle object. Does Python's list(set(a)) change its order every time?
a = ['https://en.wikipedia.org/wiki/Data_structure','https://en.wikipedia.org/wiki/Data_mining','https://en.wikipedia.org/wiki/Statistical_learning_theory','https://en.wikipedia.org/wiki/Machine_learning','https://en.wikipedia.org/wiki/Computer_science','https://en.wikipedia.org/wiki/Information_theory','https://en.wikipedia.org/wiki/Statistics','https://en.wikipedia.org/wiki/Mathematics','https://en.wikipedia.org/wiki/Signal_processing','https://en.wikipedia.org/wiki/Sorting_algorithm','https://en.wikipedia.org/wiki/Data_structure','https://en.wikipedia.org/wiki/Quicksort','https://en.wikipedia.org/wiki/Merge_sort','https://en.wikipedia.org/wiki/Heapsort','https://en.wikipedia.org/wiki/Insertion_sort','https://en.wikipedia.org/wiki/Introsort','https://en.wikipedia.org/wiki/Selection_sort','https://en.wikipedia.org/wiki/Timsort','https://en.wikipedia.org/wiki/Cubesort','https://en.wikipedia.org/wiki/Shellsort']
To remove duplicates, I use set(a), and then I turn the result back into a list with list(set(a)).
My question is:
Even if I restart Python and read the list back from the pickle file, will list(set(a)) come out in the same order every time?
I would like to understand how this hash -> list ordering works.
I tested with a small dataset, and the ordering appears to be consistent.
In [50]: a = ['x','y','z','k']
In [51]: a
['x', 'y', 'z', 'k']
In [52]: list(set(a))
['y', 'x', 'k', 'z']
In [53]: b=list(set(a))
In [54]: list(set(b))
['y', 'x', 'k', 'z']
In [55]: del b
In [56]: b=list(set(a))
In [57]: b
['y', 'x', 'k', 'z']
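For starters, the iteration order of a hash-based set is not guaranteed, so the order of the resulting list is not guaranteed either. – Makoto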
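To illustrate the comment above: the test in the question ran inside a single interpreter session, where string hashes stay fixed. Since Python 3.3, string hashing is randomized per process by default, so a restart can produce a different order. The sketch below is only an illustration, not a statement about how any particular build behaves: it launches fresh interpreters with different values of the `PYTHONHASHSEED` environment variable and prints the order each one produces. The helper name `order_with_seed` and the four-element sample list are assumptions made for the example.

```python
import os
import subprocess
import sys

# Code executed in each fresh interpreter (sample data is illustrative).
SNIPPET = "print(list(set(['x', 'y', 'z', 'k'])))"


def order_with_seed(seed):
    """Run SNIPPET in a new Python process with PYTHONHASHSEED set to seed.

    Hypothetical helper for this example; returns the printed list as text.
    """
    env = dict(os.environ, PYTHONHASHSEED=str(seed))
    result = subprocess.run(
        [sys.executable, "-c", SNIPPET],
        env=env,
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()


if __name__ == "__main__":
    # Different seeds may (but need not) give different orders;
    # the same seed reproduces the same order on the same build.
    for seed in (1, 2, 3):
        print(seed, order_with_seed(seed))
```

Pinning `PYTHONHASHSEED` makes the order repeatable for a given interpreter build, but that is still an implementation detail rather than a documented guarantee, so it is safer not to rely on it across Python versions.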
I think you could use [ordered-set](https://pypi.python.org/pypi/ordered-set) instead of 'set' – MaxU
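If a third-party package is not wanted, a reproducible order can also be had with the standard library alone. The following is a minimal sketch, assuming Python 3.7 or later (where `dict` is guaranteed to preserve insertion order); the function names `dedupe_keep_order` and `dedupe_sorted` are made up for this example.

```python
def dedupe_keep_order(items):
    """Drop duplicates, keeping the first occurrence of each item.

    Relies on dict preserving insertion order (guaranteed in Python 3.7+).
    """
    return list(dict.fromkeys(items))


def dedupe_sorted(items):
    """Drop duplicates and return the items in sorted order."""
    return sorted(set(items))


a = ['x', 'y', 'z', 'k', 'x', 'z']
print(dedupe_keep_order(a))  # ['x', 'y', 'z', 'k']
print(dedupe_sorted(a))      # ['k', 'x', 'y', 'z']
```

Both results depend only on the input list, not on hash values, so they should be the same after restarting Python and reloading the pickle file.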