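This code finds the smallest distance for each data point across the per-centroid lists. I want to add each data point to a list based on which list its minimum came from. I would also like to be able to find the mean of each cluster. How do I add elements to a list that sits inside another list?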
import numpy as np

centroids = np.array([[3,44],[5,15],[99,12]])
dataPoints = np.array([[2,4],[17,4],[45,2],[45,7],[16,32],[32,14],[20,56],[68,33]])

def size(vector):
    # Euclidean length of a vector
    return np.sqrt(sum(x**2 for x in vector))

def distance(vector1, vector2):
    return size(vector1 - vector2)

def distances(array1, array2):
    # lists[i][j] is the distance from centroid i to data point j
    lists = [[distance(vector1, vector2) for vector2 in array2] for vector1 in array1]
    x = 1
    for i in lists:
        print('Distance from Centroid {}:{}\n'.format(x, i))
        x = x + 1
    # Smallest distance for each data point, taken across all centroids
    print(list(map(min, zip(*lists))))

distances(centroids, dataPoints)
My output:
Distance from Centroid 1:[40.01249804748511, 42.379240200834182, 59.396969619669989, 55.97320787662612, 17.691806012954132, 41.725292090050132, 20.808652046684813, 65.924198895398035]
Distance from Centroid 2:[11.401754250991379, 16.278820596099706, 42.059481689626182, 40.792156108742276, 20.248456731316587, 27.018512172212592, 43.657759905886145, 65.520989003524662]
Distance from Centroid 3:[97.329337817535773, 82.389319696183918, 54.918120870983927, 54.230987451824994, 85.37564055396598, 67.029844099475568, 90.426765949026404, 37.443290453698111]
[11.401754250991379, 16.278820596099706, 42.059481689626182, 40.792156108742276, 17.691806012954132, 27.018512172212592, 20.808652046684813, 37.443290453698111]
Additional desired output:
Cluster 1: [[16,32],[20,56]]
Cluster 2: [[2,4],[17,4],[45,2],[45,7],[32,14]]
Cluster 3: [[68,33]]
List of means :[[18,44],[28.2,6.2],[68,33]]
In this example, the number of centroids/clusters is fixed. What if it were dynamic and the cluster lists had to be created dynamically?
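One possible way to get the desired output is sketched below. It is only a sketch: the helper name `assign_clusters` is made up for illustration, and `np.linalg.norm` is swapped in for the hand-written `size` function. The outer list gets one empty inner list per centroid, so the number of clusters follows whatever centroid array is passed in, and each point is appended to the inner list at the index of its nearest centroid.

```python
import numpy as np

centroids = np.array([[3, 44], [5, 15], [99, 12]])
dataPoints = np.array([[2, 4], [17, 4], [45, 2], [45, 7],
                       [16, 32], [32, 14], [20, 56], [68, 33]])

def assign_clusters(centroids, points):
    # Hypothetical helper: one empty list per centroid, created dynamically.
    clusters = [[] for _ in centroids]
    for point in points:
        # Distance from this point to every centroid.
        dists = [np.linalg.norm(point - c) for c in centroids]
        # Index of the nearest centroid selects the inner list to append to.
        nearest = dists.index(min(dists))
        clusters[nearest].append(point.tolist())
    return clusters

clusters = assign_clusters(centroids, dataPoints)

# Mean of each cluster; an empty cluster has no mean, so it gets None.
means = [np.mean(c, axis=0).tolist() if c else None for c in clusters]

for i, c in enumerate(clusters, 1):
    print('Cluster {}: {}'.format(i, c))
print('List of means: {}'.format(means))
```

The key step is `clusters[nearest].append(...)`: indexing the outer list returns the inner list, which can then be appended to like any other list.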
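These are *arrays*, not *lists*. Or at least, you seem to be mixing the two. Why not stick to lists? –
@juanpa.arrivillaga If I check type(lists), it returns 'list'. I know I started with numpy arrays. For a solution to my problem, does using one have an advantage over the other? – cparks10
Yes, 'lists' will be a list, because it is assigned the result of a *list comprehension*. However, I'm not sure what you are trying to do. –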
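On the arrays-versus-lists point, here is a hedged sketch of what staying entirely in NumPy might look like for the same data. It is an alternative to the list-based approach, not the asker's method, and the names `dist_matrix` and `labels` are chosen here for illustration. Broadcasting computes all centroid-to-point distances at once, `argmin` picks the nearest centroid for each point, and boolean masks split the points into clusters.

```python
import numpy as np

centroids = np.array([[3, 44], [5, 15], [99, 12]])
dataPoints = np.array([[2, 4], [17, 4], [45, 2], [45, 7],
                       [16, 32], [32, 14], [20, 56], [68, 33]])

# Pairwise differences via broadcasting: shape (n_centroids, n_points, 2).
diffs = centroids[:, np.newaxis, :] - dataPoints[np.newaxis, :, :]

# Euclidean distances: dist_matrix[i, j] is centroid i to data point j.
dist_matrix = np.sqrt((diffs ** 2).sum(axis=2))

# For each data point, the index of its nearest centroid.
labels = dist_matrix.argmin(axis=0)

# One sub-array per centroid, so the cluster count follows len(centroids).
clusters = [dataPoints[labels == k] for k in range(len(centroids))]

# Mean of each cluster; None marks a cluster that received no points.
means = [c.mean(axis=0) if len(c) else None for c in clusters]

for k, (c, m) in enumerate(zip(clusters, means), 1):
    print('Cluster {}: {} mean: {}'.format(k, c.tolist(), m))
```

The trade-off is roughly this: lists make the append-to-inner-list step explicit, while arrays avoid the Python-level loops and keep each cluster ready for further numeric work.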