Faster way to compare two lists of point tuples?

I have two lists (which may or may not be the same length). Each list contains a series of two-element tuples (basically X, Y values).

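I am comparing the two lists against each other to find pairs of points with similar values. I tried a list comprehension, but it got really confusing with the tuples nested inside the lists, and I couldn't get it to work.

Is this the best (fastest) way of doing this? I feel like there might be a more Pythonic way.

Say I have two lists: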

pointPairA = [(2,1), (4,8)] 
pointPairB = [(3,2), (10,2), (4,2)]

Then an empty list to store the matched pairs, and a tolerance value so that only points that are close enough get stored:

matchedPairs = [] 
tolerance = 2

Then this loop, which unpacks the tuples, compares the differences, and appends the points to the matchedPairs list to indicate a match:

for pointA in pointPairA:
    for pointB in pointPairB:
        ## Assign the current X,Y values for each point
        pointA_x, pointA_y = pointA
        pointB_x, pointB_y = pointB

        ## Get the absolute difference along each axis
        xDiff = abs(pointA_x - pointB_x)
        yDiff = abs(pointA_y - pointB_y)

        ## Inclusive test: a difference equal to the tolerance counts as a match
        if xDiff <= tolerance and yDiff <= tolerance:
            matchedPairs.append((pointA, pointB))

This would leave matchedPairs looking like this, with each entry being a tuple of the two matching point tuples:

[((2,1), (3,2)), ((2,1), (4,2))]

2011-06-08 STH

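If you can use a "distance" rather than a square for the tolerance, you can use complex numbers instead of tuples, e.g. '[2+1j, 4+8j]'. Then you can compare 'abs(pt1-pt2)' with the tolerance. – 2011-06-08 02:01:06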
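A minimal sketch of the complex-number idea from the comment above, assuming a circular (Euclidean) tolerance is acceptable. The names points_a, points_b and matched are made up for illustration and are not from the original post.

```python
# Points stored as complex numbers: real part is X, imaginary part is Y.
# (points_a / points_b are illustrative names, not from the question.)
points_a = [2+1j, 4+8j]
points_b = [3+2j, 10+2j, 4+2j]
tolerance = 2

# abs() of the difference of two complex numbers is their Euclidean distance.
matched = [(pa, pb) for pa in points_a for pb in points_b
           if abs(pa - pb) <= tolerance]
print(matched)
```

Note that this is not the same test as the square one in the question: (2,1) and (4,2) are about 2.24 apart, so they pass the per-axis check with a tolerance of 2 but fail the distance check.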

Here pointPairA is the single list, and pointPairB would be one of the 20k lists:

from collections import defaultdict 
from itertools import product 

pointPairA = [(2,1), (4,8)] 
pointPairB = [(3,2), (10,2), (4,2)] 
tolerance = 2 

# Map every grid cell within tolerance of a point in pointPairA back to that point
dA = defaultdict(list)
tolrange = range(-tolerance, tolerance+1)
for pA, dx, dy in product(pointPairA, tolrange, tolrange):
    dA[pA[0]+dx,pA[1]+dy].append(pA)

# you would have a loop here through the 20k lists
matchedPairs = [(pA, pB) for pB in pointPairB for pA in dA[pB]]

print(matchedPairs)

2011-06-08 04:08:13

+1: gnibbler got there first :) – tzot 2011-06-10 08:41:40

With a list comprehension:

[(pa, pb) for pa in pointPairA for pb in pointPairB
      if abs(pa[0]-pb[0]) <= tolerance and abs(pa[1]-pb[1]) <= tolerance]

It is slightly faster than your loop:

(for 1 million executions) 

>>> (list comprehension).timeit() 
2.1963138580322266 s 

>>> (your method).timeit() 
2.454944133758545 s
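The timings above are shown in shorthand, so here is one way such a comparison could be set up with the standard timeit module. This is only a sketch: the function names with_loop and with_comprehension are made up for illustration, and the run count is kept small, so the numbers it prints will not match the ones quoted.

```python
import timeit

pointPairA = [(2, 1), (4, 8)]
pointPairB = [(3, 2), (10, 2), (4, 2)]
tolerance = 2

def with_loop():
    # Explicit nested loops, appending each match.
    matched = []
    for pa in pointPairA:
        for pb in pointPairB:
            if abs(pa[0] - pb[0]) <= tolerance and abs(pa[1] - pb[1]) <= tolerance:
                matched.append((pa, pb))
    return matched

def with_comprehension():
    # Same test written as a single list comprehension.
    return [(pa, pb) for pa in pointPairA for pb in pointPairB
            if abs(pa[0] - pb[0]) <= tolerance and abs(pa[1] - pb[1]) <= tolerance]

# The answer quotes 1 million executions; a smaller number keeps this quick.
loop_seconds = timeit.timeit(with_loop, number=10000)
comp_seconds = timeit.timeit(with_comprehension, number=10000)
print(loop_seconds, comp_seconds)
```

Results depend on the machine and the Python version, so treat any difference as a rough indication only.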

2011-06-08 01:14:16 manji

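I see what I was doing wrong, thanks for the example. This is exactly the one-liner I needed. Being slightly faster will definitely add up: I have one list that I compare against 20k other lists. – STH 2011-06-08 01:45:54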

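@STH, since you are comparing one list against 20k other lists, it may be worth spending some time up front building a dictionary or set from the one list, so that lookups against the other 20k lists are fast. Are the values always integers? For a tolerance of 2, the dictionary will be 25 times the size of the list, but the 20k comparisons will be O(N) – 2011-06-08 02:06:28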

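@gnibbler You mean turn the first list into a dictionary or set, not the 20k others, right? The values will always be integers. The 20k lists are pickled and stored in a MySQL database. – STH 2011-06-08 02:09:54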

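If these lists are large, I would suggest finding a faster algorithm...

I would start by sorting both lists of pairs by the sum x + y of each pair. (Two points can only be close if their sums are close.)

For any point in the first list, this severely limits the range you need to search in the second list. Keep track of a "sliding window" on the second list, covering the elements whose sum is within 2*tolerance of the sum of the current element of the first list. (Actually, you only need to keep track of the start of the sliding window...)

Assuming tolerance is reasonably small, this should turn your O(n^2) operation into O(n log n).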
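One possible reading of the sorted sliding-window idea, as a rough sketch rather than a tuned implementation. The function name match_points and its variable names are hypothetical, and the inclusive <= test is an assumption carried over from the list comprehension answer.

```python
def match_points(list_a, list_b, tolerance):
    """Pair up points whose X and Y each differ by at most tolerance."""
    # Sort both lists by x + y; points that are close must have close sums.
    sorted_a = sorted(list_a, key=sum)
    sorted_b = sorted(list_b, key=sum)
    window = 2 * tolerance  # |dx| <= t and |dy| <= t implies |sum diff| <= 2t

    matched = []
    start = 0  # start of the sliding window in sorted_b; it never moves back
    for pa in sorted_a:
        sum_a = sum(pa)
        # Advance the window start past points whose sum is too small.
        while start < len(sorted_b) and sum(sorted_b[start]) < sum_a - window:
            start += 1
        # Scan forward from the start until the sums become too large.
        j = start
        while j < len(sorted_b) and sum(sorted_b[j]) <= sum_a + window:
            pb = sorted_b[j]
            if abs(pa[0] - pb[0]) <= tolerance and abs(pa[1] - pb[1]) <= tolerance:
                matched.append((pa, pb))
            j += 1
    return matched

pointPairA = [(2, 1), (4, 8)]
pointPairB = [(3, 2), (10, 2), (4, 2)]
print(match_points(pointPairA, pointPairB, 2))
```

The matches come back ordered by coordinate sum rather than by original list position. The sum test is only a prefilter, so the per-axis check is still applied to every candidate inside the window.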

2011-06-08 01:18:46 Nemo

Sorry, I didn't mention this: the lists aren't big at all. In fact, at the moment they are never longer than 15 tuples, and most of them are 14 long. – STH 2011-06-08 01:44:30
