numpy - counting equal arrays

I want to count the number of equal matrices that I get after splitting a large matrix.

mat1 = np.zeros((4, 8)) 

split4x4 = np.split(mat1, 4)

Now I want to know how many equal matrices there are in split4x4, but collections.Counter(split4x4) throws an error. Is there a built-in way to do this?

Source

2016-08-22 andandandand

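Enlighten me, I am an amateur so this may sound silly, but np.split() by default splits the array into the specified number of equal pieces (e.g. 4 in the example above), and raises an error if it cannot. So why do you need to find this out? Isn't the answer just 4? –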

This can be done in a fully vectorized manner using the numpy_indexed package (disclaimer: I am its author):

import numpy_indexed as npi 
unique_rows, row_counts = npi.count(mat1)

This should also be considerably faster than using collections.Counter.
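For comparison, here is a minimal sketch of why collections.Counter fails on the split list and one possible workaround. NumPy arrays are not hashable, so Counter cannot use them as keys. The idea of keying each block by its shape and raw bytes is an assumption added here for illustration, not part of the answer above. Note that the comparison is byte-wise, so 0.0 and -0.0 count as different, and all blocks should share one dtype.

```python
from collections import Counter

import numpy as np

mat1 = np.zeros((4, 8))
split4x4 = np.split(mat1, 4)  # four blocks, each of shape (1, 8)

# Counter(split4x4) raises TypeError because ndarray is unhashable.
# Illustrative workaround: key each block by (shape, raw bytes).
block_counts = Counter((block.shape, block.tobytes()) for block in split4x4)

n_distinct = len(block_counts)             # number of distinct blocks
largest_group = max(block_counts.values()) # size of the largest group of equal blocks
print(n_distinct, largest_group)
```

This stays in pure Python for the counting step, so it is likely to be slower than a vectorized approach on many blocks.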

Source

2016-08-22 08:03:22

Perhaps the simplest way is to use np.unique, flattening the split arrays so that they can be compared as tuples:

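import numpy as np
# Generate some sample data:
a = np.random.uniform(size=(8,3))
# With repetition:
a = np.r_[a,a]
# Split a in 4 arrays
s = np.asarray(np.split(a, 4))
# Flatten each sub-array into one tuple
s = [tuple(e.flatten()) for e in s]
# axis=0 (NumPy >= 1.13) compares whole rows; without it np.unique
# flattens the input and counts individual elements instead
np.unique(s, axis=0, return_counts=True)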

Note: the return_counts argument of np.unique is new in version 1.9.0.
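As a rough sketch of how the np.unique approach can also return the distinct blocks in their original 2-D shape, the helper below uses return_index to pick the first occurrence of each block. The function name count_equal_blocks and the sample data are hypothetical, and the axis argument assumes NumPy 1.13 or later.

```python
import numpy as np


def count_equal_blocks(mat, sections):
    """Split `mat` into `sections` blocks along axis 0 and count equal blocks.

    Hypothetical helper for illustration; needs NumPy >= 1.13 for `axis`.
    """
    blocks = np.asarray(np.split(mat, sections))
    flat = blocks.reshape(sections, -1)  # one row per block
    _, first_idx, counts = np.unique(
        flat, axis=0, return_index=True, return_counts=True
    )
    # Index the original blocks so each distinct block keeps its 2-D shape.
    return blocks[first_idx], counts


base = np.arange(12).reshape(4, 3)
a = np.r_[base, base]  # shape (8, 3), the four rows repeated once
unique_blocks, counts = count_equal_blocks(a, 4)
print(unique_blocks.shape, counts)
```

Here the four blocks of shape (2, 3) fall into two groups of two equal blocks each.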

Another pure NumPy solution, from that post:

# Generate some sample data: 
In: a = np.random.uniform(size=(8,3)) 
# With some repetition 
In: a = np.r_[a,a]
In: a.shape 
Out: (16,3) 
# Split a in 4 arrays 
In: s = np.asarray(np.split(a, 4)) 
In: print(s)
Out: [[[ 0.78284847 0.28883662 0.53369866] 
     [ 0.48249722 0.02922249 0.0355066 ] 
     [ 0.05346797 0.35640319 0.91879326] 
     [ 0.1645498 0.15131476 0.1717498 ]] 

     [[ 0.98696629 0.8102581 0.84696276] 
     [ 0.12612661 0.45144896 0.34802173] 
     [ 0.33667377 0.79371788 0.81511075] 
     [ 0.81892789 0.41917167 0.81450135]] 

     [[ 0.78284847 0.28883662 0.53369866] 
     [ 0.48249722 0.02922249 0.0355066 ] 
     [ 0.05346797 0.35640319 0.91879326] 
     [ 0.1645498 0.15131476 0.1717498 ]] 

     [[ 0.98696629 0.8102581 0.84696276] 
     [ 0.12612661 0.45144896 0.34802173] 
     [ 0.33667377 0.79371788 0.81511075] 
     [ 0.81892789 0.41917167 0.81450135]]] 
In: s.shape 
Out: (4, 4, 3) 
# Flatten the array: 
In: s = np.asarray([e.flatten() for e in s])
In: s.shape 
Out: (4, 12) 
# Sort the rows using lexsort: 
In: idx = np.lexsort(s.T) 
In: s_sorted = s[idx] 
# Create a mask to get unique rows 
In: row_mask = np.append([True],np.any(np.diff(s_sorted,axis=0),1)) 
# Get unique rows: 
In: out = s_sorted[row_mask] 
# and count: 
In: for e in out:
     count = (e == s).all(axis=1).sum()
     print(e.reshape(4,3), count)
Out:[[ 0.78284847 0.28883662 0.53369866] 
    [ 0.48249722 0.02922249 0.0355066 ] 
    [ 0.05346797 0.35640319 0.91879326] 
    [ 0.1645498 0.15131476 0.1717498 ]] 2 
    [[ 0.98696629 0.8102581 0.84696276] 
    [ 0.12612661 0.45144896 0.34802173] 
    [ 0.33667377 0.79371788 0.81511075] 
    [ 0.81892789 0.41917167 0.81450135]] 2
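The counting loop above compares every unique row against all rows. As a sketch of an alternative, the counts can be read directly from the sorted order: each True in row_mask starts a new run of equal rows, so the distances between run starts are the counts. The names starts and counts and the deterministic sample data are assumptions added for illustration.

```python
import numpy as np

base = np.arange(12.0).reshape(4, 3)
a = np.r_[base, base, base[:2]]  # 10 rows, so 5 blocks of 2 rows each

# One flattened row per block, shape (5, 6)
s = np.asarray(np.split(a, 5)).reshape(5, -1)

# Sort the rows so that equal rows become adjacent
idx = np.lexsort(s.T)
s_sorted = s[idx]

# True wherever a row differs from the previous one (start of a new run)
row_mask = np.append([True], np.any(np.diff(s_sorted, axis=0), axis=1))

# Run lengths: distance from each run start to the next start (or the end)
starts = np.flatnonzero(row_mask)
counts = np.diff(np.append(starts, len(s_sorted)))

out = s_sorted[row_mask]
for e, count in zip(out, counts):
    print(e.reshape(2, 3), count)
```

This avoids the Python-level comparison of each unique row against the full array, which may matter when there are many blocks.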

Source

2016-08-22 09:48:02 bougui

Are you using Python 3 in the first example? I get 'NameError: name 'r_' is not defined' from 'a = r_[a,a]'. – andandandand

@andandandand No, I am not. That was my mistake: I forgot the 'np.' before 'r_', which is a simple way to build arrays quickly (see: http://docs.scipy.org/doc/numpy/reference/generated/numpy.r_.html). I have just corrected my answer. – bougui
