Building a matrix of arrays by iteration in Python

from numpy import genfromtxt, linalg, array, append, hstack, vstack 

#Euclidean distance function 
def euclidean(v1, v2): 
    dist = linalg.norm(v1 - v2) 
    return dist 

#get the .csv files and eliminate heading and unused columns from test 
BMUs = genfromtxt('BMU3.csv', delimiter=',') 
data = genfromtxt('test.csv', delimiter=',') 
data = data[1:, :-2] 

i = 0 
for obj in data: 
    D = 0 
    for BMU in BMUs: 
        Dist = append(euclidean(obj, BMU[:-2]), BMU[-2:])
    D = hstack(Dist) 

Map = vstack(D) 

#iteration counter 
i += 1 
if not i % 1000: 
    print (i, ' of ', len(data)) 

print (Map)

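What I'm trying to do is:

take one object from data
compute its distance from a BMU (euclidean(obj, BMU[:-2]))
append the last two elements of the BMU array to that distance
create an array containing all the distances, plus the appended values, for the object from data (D = hstack(Dist))
create a matrix of those arrays whose length equals the number of objects in data (Map = vstack(D))

The problem here, or at least what I think the problem is, is that hstack and vstack take a tuple of arrays as input rather than a single array. I'm trying to use them the way I would use list.append() on a list. Sadly I'm a beginner and I don't know how to do it differently.

Any help would be awesome, thanks in advance :)

Source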

2016-12-12 Bradipo Eremita

First, a note on usage:

Instead of:

from numpy import genfromtxt, linalg, array, append, hstack, vstack

use

import numpy as np 
.... 
data = np.genfromtxt(....) 
.... 
    np.hstack...

Second, stay away from np.append. It is too easy to misuse. Use np.concatenate instead, so you get the full flavor of what it is doing.
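To make the difference concrete, here is a small sketch (the arrays `a` and `b` are made up for illustration). Without an `axis` argument, `np.append` flattens both inputs, whereas `np.concatenate` takes a sequence of arrays and joins them along an existing axis:

```python
import numpy as np

a = np.array([[1, 2], [3, 4]])   # made-up sample arrays
b = np.array([[5, 6]])

# np.append without axis flattens everything into 1-D
flat = np.append(a, b)

# np.concatenate takes a tuple (or list) of arrays and keeps the 2-D shape
stacked = np.concatenate((a, b), axis=0)

print(flat.shape)      # (6,)
print(stacked.shape)   # (3, 2)
```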

List append is better for incremental work:

alist = [] 
for .... 
    alist.append(....) 
arr = np.array(alist)
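The same idea applies to hstack and vstack: they accept a whole list of arrays, so the arrays can be collected in a list inside the loop and stacked once afterwards. A minimal sketch, assuming made-up stand-ins for the CSV contents (`data`, `bmus` and their values are hypothetical, with the last two BMU columns treated as extra values to carry along):

```python
import numpy as np

# Made-up stand-ins for the real CSV data (not the asker's files)
data = np.array([[0.0, 0.0], [3.0, 4.0]])
bmus = np.array([[0.0, 0.0, 10.0, 11.0],
                 [3.0, 4.0, 20.0, 21.0],
                 [6.0, 8.0, 30.0, 31.0]])

rows = []
for obj in data:
    parts = []
    for bmu in bmus:
        d = np.linalg.norm(obj - bmu[:-2])
        # [distance, extra1, extra2] for this BMU
        parts.append(np.concatenate(([d], bmu[-2:])))
    rows.append(np.hstack(parts))   # hstack takes the whole list at once

result = np.vstack(rows)            # likewise vstack: one row per object
print(result.shape)                 # (2, 9)
```

Each call to hstack or vstack happens once per collected list, rather than once per element as with list.append().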

==================

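Without sample arrays (or at least their shapes) I have to guess. But an (n,2) array sounds reasonable. Taking the distance of every pair of 'points' with each other, I can collect the values in a nested list comprehension: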

In [121]: data = np.arange(6).reshape(3,2) 
In [122]: [[euclidean(d,b) for b in data] for d in data] 
Out[122]: 
[[0.0, 2.8284271247461903, 5.6568542494923806], 
 [2.8284271247461903, 0.0, 2.8284271247461903],
 [5.6568542494923806, 2.8284271247461903, 0.0]]

and make an array from that:

In [123]: np.array([[euclidean(d,b) for b in data] for d in data]) 
Out[123]: 
array([[ 0.        ,  2.82842712,  5.65685425],
       [ 2.82842712,  0.        ,  2.82842712],
       [ 5.65685425,  2.82842712,  0.        ]])

The equivalent with nested loops:

alist = [] 
for d in data: 
    sublist=[] 
    for b in data: 
        sublist.append(euclidean(d,b))
    alist.append(sublist) 
arr = np.array(alist)

There are ways of doing this without loops, but let's make sure the basic Python looping approach works first.

===============

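If I want the difference (along the last axis) between every element (row) of data and every element of bmu (or here, data itself), I can use array broadcasting. The result is a (3,3,2) array: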

In [130]: data[None,:,:]-data[:,None,:] 
Out[130]: 
array([[[ 0,  0],
        [ 2,  2],
        [ 4,  4]],

       [[-2, -2],
        [ 0,  0],
        [ 2,  2]],

       [[-4, -4],
        [-2, -2],
        [ 0,  0]]])

norm can handle higher-dimensional arrays and accepts an axis parameter.

In [132]: np.linalg.norm(data[None,:,:]-data[:,None,:],axis=-1) 
Out[132]: 
array([[ 0.        ,  2.82842712,  5.65685425],
       [ 2.82842712,  0.        ,  2.82842712],
       [ 5.65685425,  2.82842712,  0.        ]])
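The same broadcasting works when the two arrays have different numbers of rows, as long as the last axis matches. A sketch with random made-up arrays (`data` and `bmus` here are placeholders, not the real files); note the index order is chosen so that `dist[i, j]` is the distance from `data[i]` to `bmus[j]`:

```python
import numpy as np

rng = np.random.default_rng(0)
data = rng.random((5, 3))   # 5 made-up objects, 3 features each
bmus = rng.random((4, 3))   # 4 made-up reference points

# (5, 1, 3) - (1, 4, 3) broadcasts to (5, 4, 3); norm collapses the last axis
dist = np.linalg.norm(data[:, None, :] - bmus[None, :, :], axis=-1)

# same values as the nested list comprehension
loop = np.array([[np.linalg.norm(d - b) for b in bmus] for d in data])
print(dist.shape, np.allclose(dist, loop))   # (5, 4) True
```

For large inputs the intermediate (n, m, features) array can get big, so the loop version may still be the safer choice when memory is tight.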

Source

2016-12-12 20:02:10 hpaulj

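Thank you very much, I'll wait for your suggestions :) –

What are the 'shape' (and 'dtype') of 'BMU' and 'data'? It is easier to copy and test code with samples. Otherwise I have to guess and make up sample arrays (like 'data = np.arange(24).reshape(12,2)'). – hpaulj

BMUs.shape is (243, 7); data.shape is (19219, 5) –

Thanks for your help, I managed to implement the pseudocode. Here is the final solution: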

import numpy as np 


def euclidean(v1, v2): 
    dist = np.linalg.norm(v1 - v2) 
    return dist 


def makeKNN(dataSet, BMUSet, k, fileOut, test=False): 
    # take input files 
    BMUs = np.genfromtxt(BMUSet, delimiter=',') 
    data = np.genfromtxt(dataSet, delimiter=',') 

    # drop the header row; for test data also drop the last two columns
    final = data[1:, :]
    if not test:
        data = data[1:, :]
    else:
        data = data[1:, :-2]

    # Calculate all the distances between data and BMUs, then reorder the BMUs by distance

    dist = np.array([[euclidean(d, b[:-2]) for b in BMUs] for d in data]) 
    BMU_K = np.array([BMUs[np.argsort(d)] for d in dist]) 

    # mean over the closest k BMUs
    Z = np.array([[np.sum(b[:k].T[5])/k] for b in BMU_K]) 

    # error propagation 
    Z_err = np.array([[np.sqrt(np.sum(np.power(b[:k].T[5], 2)))] for b in BMU_K]) 

    # Adding z estimates and errors to the data 
    final = np.concatenate((final, Z, Z_err), axis=1) 

    # print output file 
    np.savetxt(fileOut, final, delimiter=',') 
    print('So long, and thanks for all the fish')
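As a side note, the sort-and-average step can probably also be written without the per-row list comprehensions, using `np.argsort` along axis 1 and integer-array indexing. This is only a sketch on made-up data: `dist`, `bmus`, `k` and the column index `col` are stand-ins chosen for illustration, not the real inputs.

```python
import numpy as np

rng = np.random.default_rng(1)
dist = rng.random((6, 4))        # made-up distances: 6 objects x 4 BMUs
bmus = rng.random((4, 7))        # made-up BMU table
k, col = 2, 5                    # neighbours to keep, column to average

# indices of the k closest BMUs for every object, shape (6, k)
nearest = np.argsort(dist, axis=1)[:, :k]

# integer-array indexing pulls that column for the chosen BMUs; mean over k
z = bmus[nearest, col].mean(axis=1, keepdims=True)   # shape (6, 1)

# loop version for comparison
z_loop = np.array([[bmus[np.argsort(d)][:k, col].sum() / k] for d in dist])
print(np.allclose(z, z_loop))
```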

Thank you very much; I hope this code will help someone else in the future :)

Source

2016-12-13 12:53:15
