Sets are (usually) about an order of magnitude faster, even if you don't build the index sets ahead of time:
r100 = range(100)
r2 = range(3, 40, 3)
# Find indices in r100 that aren't in r2.
# This is a set difference (-). Symmetric difference (^) gives the same
# result here only because every element of r2 is also in r100.
## Set methods
# Precalculated is fastest:
sr100 = set(r100)
sr2 = set(r2)
%timeit sr100 - sr2
100000 loops, best of 3: 3.84 us per loop
# Building the sets on the fly is still faster than the list comprehensions below:
%timeit set(range(100))^set(range(3,40,3))
100000 loops, best of 3: 9.76 us per loop
%timeit set(xrange(100))^set(xrange(3,40,3))
100000 loops, best of 3: 8.84 us per loop
# Precalculating the original indices still helps, if you can hold it in memory:
%timeit sr100^set(xrange(3,40,3))
100000 loops, best of 3: 4.87 us per loop
# This is true even including converting back to list, and sorting (if necessary):
%timeit [x for x in sr100^set(xrange(3,40,3))]
100000 loops, best of 3: 9.02 us per loop
%timeit sorted(x for x in sr100^set(xrange(3,40,3)))
100000 loops, best of 3: 15 us per loop
## List comprehension:
# Precalculated indices
%timeit [x for x in r100 if x not in r2]
10000 loops, best of 3: 30.5 us per loop
# Non-precalculated indices, using xrange
%timeit [x for x in xrange(100) if x not in xrange(3, 40, 3)]
10000 loops, best of 3: 65.8 us per loop
# The cost appears to be in the second xrange?
%timeit [x for x in r100 if x not in xrange(3, 40, 3)]
10000 loops, best of 3: 64.3 us per loop
%timeit [x for x in xrange(100) if x not in r2]
10000 loops, best of 3: 29.9 us per loop
# xrange is not really any faster than range here: it uses less memory, but each
# membership test still has to walk through the entire sequence
%timeit [x for x in range(100) if x not in range(3, 40, 3)]
10000 loops, best of 3: 63.5 us per loop
I see, so with 'xrange' I'm not computing the whole set of indices. That was supposed to be faster. – Acorbe
@Acorbe: Yes, membership testing on 'xrange()' has a constant cost. –