2016-03-23 250 views
9

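Python: check whether a NumPy array contains any element of another array

What is the best way to check whether a NumPy array contains any element of another array?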

For example:

array1 = [10,5,4,13,10,1,1,22,7,3,15,9] 
array2 = [3,4,9,10,13,15,16,18,19,20,21,22,23]

I would like to get True if array1 contains any value from array2, and False otherwise.

+2

You can use 'set'. – Nilesh

+2

Use 'np.any(np.in1d(array2, array1))'. – Norman

Answers

14

With pandas, you can use isin:

import numpy as np
import pandas as pd

a1 = np.array([10,5,4,13,10,1,1,22,7,3,15,9])
a2 = np.array([3,4,9,10,13,15,16,18,19,20,21,22,23])

>>> pd.Series(a1).isin(a2).any() 
True 

And with NumPy's in1d function (per the comment from @Norman):

>>> np.any(np.in1d(a1, a2)) 
True 
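
In more recent NumPy releases, np.isin is the recommended replacement for np.in1d, which is deprecated as of NumPy 2.0. A minimal sketch of the same check follows; the helper name shares_any is hypothetical and not part of NumPy.

```python
import numpy as np

def shares_any(x, y):
    """Return True if any element of x also appears in y (hypothetical helper)."""
    # np.isin returns a boolean mask shaped like x; .any() collapses it to one value
    return bool(np.isin(x, y).any())

a1 = np.array([10, 5, 4, 13, 10, 1, 1, 22, 7, 3, 15, 9])
a2 = np.array([3, 4, 9, 10, 13, 15, 16, 18, 19, 20, 21, 22, 23])

print(shares_any(a1, a2))                    # True
print(shares_any(a1, np.array([100, 200])))  # False
```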

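For small arrays, such as those in this example, the set-based solution is the clear winner. For larger arrays with no overlap, the pandas and NumPy solutions are faster than the set and generator approaches, and np.intersect1d is the fastest of them in the timings below.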

Small arrays (12-13 elements)

%timeit set(array1) & set(array2) 
The slowest run took 4.22 times longer than the fastest. This could mean that an intermediate result is being cached 
1000000 loops, best of 3: 1.69 µs per loop 

%timeit any(i in a1 for i in a2) 
The slowest run took 12.29 times longer than the fastest. This could mean that an intermediate result is being cached 
100000 loops, best of 3: 1.88 µs per loop 

%timeit np.intersect1d(a1, a2) 
The slowest run took 10.29 times longer than the fastest. This could mean that an intermediate result is being cached 
100000 loops, best of 3: 15.6 µs per loop 

%timeit np.any(np.in1d(a1, a2)) 
10000 loops, best of 3: 27.1 µs per loop 

%timeit pd.Series(a1).isin(a2).any() 
10000 loops, best of 3: 135 µs per loop 

Arrays with 100k elements (no overlap)

a3 = np.random.randint(0, 100000, 100000) 
a4 = a3 + 100000 

%timeit np.intersect1d(a3, a4) 
100 loops, best of 3: 13.8 ms per loop  

%timeit pd.Series(a3).isin(a4).any() 
100 loops, best of 3: 18.3 ms per loop 

%timeit np.any(np.in1d(a3, a4)) 
100 loops, best of 3: 18.4 ms per loop 

%timeit set(a3) & set(a4) 
10 loops, best of 3: 23.6 ms per loop 

%timeit any(i in a3 for i in a4) 
1 loops, best of 3: 34.5 s per loop 
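
The timings above depend on hardware, library versions and array contents, so it may be worth re-running them. Below is a rough sketch that uses the standard timeit module instead of IPython's %timeit. The size n, the repeat counts and the dictionary name candidates are arbitrary choices for illustration; np.isin stands in for np.in1d, and the set variant converts with .tolist() first, so the numbers are not directly comparable to those above.

```python
import timeit
import numpy as np

n = 10_000  # arbitrary size; raise it to approach the 100k case
rng = np.random.default_rng(0)
a3 = rng.integers(0, n, n)
a4 = a3 + n  # shifted by n, so the two arrays cannot overlap

# Each candidate returns a plain bool: do the arrays share any element?
candidates = {
    "np.intersect1d": lambda: np.intersect1d(a3, a4).size > 0,
    "np.isin": lambda: bool(np.isin(a3, a4).any()),
    "set": lambda: bool(set(a3.tolist()) & set(a4.tolist())),
}

results = {}
for name, fn in candidates.items():
    results[name] = fn()
    # Best of 3 repeats, 10 calls each, reported per call
    best = min(timeit.repeat(fn, number=10, repeat=3)) / 10
    print(f"{name}: {best * 1e3:.3f} ms per call")
```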
+0

I had the arrays swapped in my comment. I have corrected it. – Norman

+0

@Norman Does the order matter? If we are only testing whether they share a single value, I don't think so. – Alexander

+0

Oh yes, it was late :-) However, for performance reasons, one might put the shorter array first. – Norman

6

You can try this:

>>> array1 = [10,5,4,13,10,1,1,22,7,3,15,9] 
>>> array2 = [3,4,9,10,13,15,16,18,19,20,21,22,23] 
>>> set(array1) & set(array2) 
set([3, 4, 9, 10, 13, 15, 22]) 

If the result is non-empty, the two arrays have elements in common.

If the result is empty, there are no common elements.
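
If only a True/False answer is needed, as in the question, the intersection can be wrapped in bool(), or set.isdisjoint can be used so that the intersection is never built. A small sketch (the variable names are just for illustration):

```python
array1 = [10, 5, 4, 13, 10, 1, 1, 22, 7, 3, 15, 9]
array2 = [3, 4, 9, 10, 13, 15, 16, 18, 19, 20, 21, 22, 23]

# An empty set is falsy, so bool() turns the intersection into True/False
has_common = bool(set(array1) & set(array2))

# isdisjoint returns True when there are NO shared elements, so negate it
has_common_alt = not set(array1).isdisjoint(array2)

print(has_common, has_common_alt)  # True True
```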

1

You can use the built-in any function with a generator expression:

>>> array1 = [10,5,4,13,10,1,1,22,7,3,15,9] 
>>> array2 = [3,4,9,10,13,15,16,18,19,20,21,22,23] 
>>> any(i in array2 for i in array1) 
True 
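
Each `i in array2` test on a list scans the list, which is likely why this approach was so slow in the 100k-element timing. A possible variation, assuming the values are hashable, is to build a set once and test membership against it:

```python
array1 = [10, 5, 4, 13, 10, 1, 1, 22, 7, 3, 15, 9]
array2 = [3, 4, 9, 10, 13, 15, 16, 18, 19, 20, 21, 22, 23]

# Build the set once; each membership test is then O(1) on average
lookup = set(array2)

# any() stops at the first element of array1 that is found in lookup
found = any(i in lookup for i in array1)
print(found)  # True
```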