Vectorization in NumPy

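I am trying to do the following operation in NumPy without using a loop:

I have a matrix X of dimensions N * d and a vector y of size N. y holds integers ranging from 1 to K.
I want to obtain a matrix M of size K * d, where M[i, :] = np.mean(X[y == i, :], 0).

Can I achieve this without using a loop?

With a loop, it would look like this.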

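import numpy as np

N = 3
d = 3
K = 2

X = np.eye(N)
y = np.random.randint(1, K + 1, N)  # labels in 1..K
M = np.zeros((K, d))
for i in range(K):
    line = X[y == i + 1, :]  # rows of X whose label is i + 1
    if line.size == 0:
        M[i, :] = np.zeros(d)  # no rows for this class: keep zeros
    else:
        M[i, :] = np.mean(line, 0)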

Thanks in advance.

Source

2016-05-15 popuban

Is K == N? Are the values of y unique? –

It would be cool if you showed some code. – Bonifacio2

No and no. For example, if K = 2, X = np.eye(3), y = [1 2 1], I want M to be [[1/2 0 1/2], [0 1 0]]. – popuban
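As a quick check of that example, here is a minimal sketch that runs the loop from the question on the values given in the comment. Since d = 3, each row of M has three entries.

```python
import numpy as np

# Values taken from the comment above: K = 2, X = np.eye(3), y = [1, 2, 1]
K = 2
X = np.eye(3)
y = np.array([1, 2, 1])

# Reference loop, same logic as in the question
M = np.zeros((K, X.shape[1]))
for i in range(K):
    rows = X[y == i + 1]          # rows whose label is i + 1
    if rows.size:                 # leave zeros when a class has no rows
        M[i] = rows.mean(axis=0)

# Class 1 averages rows 0 and 2, giving [0.5, 0, 0.5];
# class 2 contains only row 1, giving [0, 1, 0].
print(M)
```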

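This solves the problem, but it creates an intermediate K×N boolean matrix and does not use the built-in mean function. In some cases this may lead to worse performance or worse numerical stability. I let the class labels range from 0 to K-1 rather than 1 to K.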

import numpy as np

# Define constants
K, N, d = 10, 1000, 3

# Sample data
Y = np.random.randint(0, K - 1, N)  # upper bound K-1 omits one class to test the no-examples case
X = np.random.randn(N, d)

# Calculate means for each class, vectorized

# Map samples to labels by taking a logical "outer product"
mark = Y[None, :] == np.arange(0, K)[:, None]

# Count number of examples in each class
count = np.sum(mark, 1)

# Avoid divide by zero if no examples
count += count == 0

# Sum within each class and normalize
M = (np.dot(mark, X).T / count).T

print(M, np.shape(M), np.shape(mark))
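For the 1-to-K labels used in the question, the same mask idea could be wrapped up roughly as follows. This is only a sketch: the function name class_means_onehot is made up for illustration, and the boolean mask is cast to the dtype of X before the matrix product.

```python
import numpy as np

def class_means_onehot(X, y, K):
    """Hypothetical helper: per-class row means for labels 1..K."""
    # K x N boolean mask: mark[i, n] is True when sample n has label i + 1
    mark = y[None, :] == np.arange(1, K + 1)[:, None]
    # Samples per class; empty classes are bumped to 1 to avoid 0/0
    count = np.maximum(mark.sum(axis=1), 1)
    # Sum the rows of each class with a matrix product, then normalize
    return (mark.astype(X.dtype) @ X) / count[:, None]

X = np.eye(3)
y = np.array([1, 2, 1])
M = class_means_onehot(X, y, K=3)  # class 3 has no samples, so its row stays zero
print(M)
```

The mask still costs K * N booleans of memory, so this form is probably best suited to a modest number of classes.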

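Source

2016-05-15 11:42:18 MRule

The code is basically collecting specific rows off X and adding them, for which NumPy has a built-in in np.add.reduceat. So, with that in focus, the steps to solve it in a vectorized way could be as listed next: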

# Get sort indices of y 
sidx = y.argsort() 

# Collect rows off X based on their IDs so that they come in consecutive order
Xr = X[sidx]

# Get unique row IDs, start positions of each unique ID 
# and their counts to be used for average calculations 
unq,startidx,counts = np.unique((y-1)[sidx],return_index=True,return_counts=True) 

# Add rows off Xr based on the slices signified by the start positions 
vals = np.true_divide(np.add.reduceat(Xr,startidx,axis=0),counts[:,None]) 

# Setup output array and set row summed values into it at unique IDs row positions 
out = np.zeros((K,d)) 
out[unq] = vals
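The steps above assume X, y, K and d from the question are already defined. A self-contained sketch that wraps them in a function and compares the result against the loop version might look like this. The function names, the random test data, and the seed are illustrative assumptions, not part of the original answer.

```python
import numpy as np

def class_means_reduceat(X, y, K):
    """Hypothetical helper: per-class row means for labels 1..K via reduceat."""
    sidx = np.argsort(y, kind="stable")      # group equal labels together
    ys = y[sidx] - 1                         # sorted labels, shifted to 0..K-1
    # Labels present, where each run starts in the sorted order, and run lengths
    unq, start, counts = np.unique(ys, return_index=True, return_counts=True)
    sums = np.add.reduceat(X[sidx], start, axis=0)  # row sums for each run
    out = np.zeros((K, X.shape[1]))          # absent classes stay all-zero
    out[unq] = sums / counts[:, None]
    return out

def class_means_loop(X, y, K):
    """Reference implementation following the loop in the question."""
    out = np.zeros((K, X.shape[1]))
    for i in range(K):
        rows = X[y == i + 1]
        if rows.size:
            out[i] = rows.mean(axis=0)
    return out

rng = np.random.default_rng(0)
N, d, K = 50, 4, 6
X = rng.standard_normal((N, d))
y = rng.integers(1, K, size=N)  # labels 1..K-1, so class K is always empty
agree = np.allclose(class_means_reduceat(X, y, K), class_means_loop(X, y, K))
print(agree)
```

Unlike the mask-based answer, this variant needs only O(N) extra memory, at the cost of a sort.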

Source

2016-05-15 11:45:02 Divakar
