2012-08-09 80 views
6

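I am considering using the K-means implementation in OpenCV from Python (cv2.kmeans), since it is said to be faster.

I am now using the cv2 package and its kmeans function,

but I cannot understand the parameter descriptions in its reference: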

Python: cv2.kmeans(data, K, criteria, attempts, flags[, bestLabels[, centers]]) → retval, bestLabels, centers 
samples – Floating-point matrix of input samples, one row per sample. 
clusterCount – Number of clusters to split the set by. 
labels – Input/output integer array that stores the cluster indices for every sample. 
criteria – The algorithm termination criteria, that is, the maximum number of iterations and/or the desired accuracy. The accuracy is specified as criteria.epsilon. As soon as each of the cluster centers moves by less than criteria.epsilon on some iteration, the algorithm stops. 
attempts – Flag to specify the number of times the algorithm is executed using different initial labelings. The algorithm returns the labels that yield the best compactness (see the last function parameter). 
flags – 
Flag that can take the following values: 
KMEANS_RANDOM_CENTERS Select random initial centers in each attempt. 
KMEANS_PP_CENTERS Use kmeans++ center initialization by Arthur and Vassilvitskii [Arthur2007]. 
KMEANS_USE_INITIAL_LABELS During the first (and possibly the only) attempt, use the user-supplied labels instead of computing them from the initial centers. For the second and further attempts, use the random or semi-random centers. Use one of KMEANS_*_CENTERS flag to specify the exact method. 
centers – Output matrix of the cluster centers, one row per each cluster center. 

What does flags[, bestLabels[, centers]]) mean? And what about this part: → retval, bestLabels, centers

Here is my code:

import cv, cv2 
import scipy.io 
import numpy 

# read data from .mat file 
mat = scipy.io.loadmat('...') 
keys = mat.keys() 
values = mat.viewvalues() 

data_1 = mat[keys[0]] 
nRows = data_1.shape[1] 
nCols = data_1.shape[0] 
samples = cv.CreateMat(nRows, nCols, cv.CV_32FC1) 
labels = cv.CreateMat(nRows, 1, cv.CV_32SC1) 
centers = cv.CreateMat(nRows, 100, cv.CV_32FC1) 
#centers = numpy. 

for i in range(0, nCols): 
    for j in range(0, nRows): 
     samples[j, i] = data_1[i, j] 


cv2.kmeans(data_1.transpose, 
           100, 
           criteria=(cv2.TERM_CRITERIA_EPS | cv2.TERM_CRITERIA_MAX_ITER, 0.1, 10), 
           attempts=cv2.KMEANS_PP_CENTERS, 
           flags=cv2.KMEANS_PP_CENTERS, 
) 

I get this error:

flags=cv2.KMEANS_PP_CENTERS, 
TypeError: <unknown> is not a numpy array 

How should I read the parameter list, and how is cv2.kmeans meant to be used? Thanks.

Answers

13

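Documentation for this function is almost impossible to find. I wrote the following Python code in a bit of a hurry, but it works on my machine. It generates two multivariate Gaussian distributions with different means and then classifies them using cv2.kmeans(). You can refer to this blog post for an explanation of some of the parameters.

Handle the imports: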

import cv 
import cv2 
import numpy as np 
import numpy.random as r 

Generate some random points and shape them appropriately:

samples = cv.CreateMat(50, 2, cv.CV_32FC1) 
random_points = r.multivariate_normal((100,100), np.array([[150,400],[150,150]]), size=(25)) 
random_points_2 = r.multivariate_normal((300,300), np.array([[150,400],[150,150]]), size=(25)) 
samples_list = np.append(random_points, random_points_2).reshape(50,2) 
random_points_list = np.array(samples_list, np.float32) 
samples = cv.fromarray(random_points_list) 

Plot the points before and after classification:

blank_image = np.zeros((400,400,3)) 
blank_image_classified = np.zeros((400,400,3)) 

for point in random_points_list: 
    cv2.circle(blank_image, (int(point[0]),int(point[1])), 1, (0,255,0),-1) 

# criteria is (type, max_iter, epsilon): stop after 10 iterations or once the centers move by less than 1.0
temp, classified_points, means = cv2.kmeans(data=np.asarray(samples), K=2, bestLabels=None,
    criteria=(cv2.TERM_CRITERIA_EPS | cv2.TERM_CRITERIA_MAX_ITER, 10, 1.0), attempts=1,
    flags=cv2.KMEANS_RANDOM_CENTERS)  # Let OpenCV choose random centers for the clusters

for point, allocation in zip(random_points_list, classified_points): 
    if allocation == 0:
        color = (255,0,0)
    elif allocation == 1:
        color = (0,0,255)
    cv2.circle(blank_image_classified, (int(point[0]),int(point[1])), 1, color,-1) 

cv2.imshow("Points", blank_image) 
cv2.imshow("Points Classified", blank_image_classified) 
cv2.waitKey() 

Here you can see the original points:

Points before classification

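Here are the points after they have been classified: Points after classification

I hope this answer helps you. It is not a complete guide to k-means, but it will at least show you how to pass the parameters to OpenCV.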

+0

It's worth noting that this example appears to work better than the Python example provided in the OpenCV documentation – Chris 2014-03-28 15:31:04

1

The problem here is that your data_1.transpose is not a numpy array.

The Python bindings for OpenCV 2.3.1 and later accept nothing but numpy arrays for image/array parameters. So data_1.transpose must be a numpy array.
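In the question's code, data_1.transpose is written without parentheses, so it refers to the method itself rather than to a transposed array. A minimal sketch of the difference, using a small made-up matrix as a stand-in for the data loaded from the .mat file:

```python
import numpy as np

# Hypothetical stand-in for the matrix loaded from the .mat file.
data_1 = np.arange(12, dtype=np.float64).reshape(3, 4)

# Without parentheses this is a bound method, not an array.
not_an_array = data_1.transpose

# Calling the method (or using data_1.T) gives an array; cv2.kmeans also
# wants float32 data, so convert and make the result contiguous.
samples = np.ascontiguousarray(data_1.transpose(), dtype=np.float32)

print(isinstance(not_an_array, np.ndarray))  # False
print(samples.shape, samples.dtype)          # (4, 3) float32
```

Passing samples instead of data_1.transpose should get past the "is not a numpy array" TypeError.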

In general, all points in OpenCV are of type numpy.ndarray.

For example:

array([[[100., 433.]],
       [[157., 377.]],
       .
       .
       [[147., 247.]]], dtype=float32)

where each element of the array is

array([[100., 433.]], dtype=float32) 

and each element of that array is

array([100., 433.], dtype=float32)
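As a rough illustration of that nesting, the sketch below builds an array of shape (N, 1, 2) from the three points shown above and indexes into it. The variable names are made up for the example; the last line flattens the points back to one row per sample, which is the layout the kmeans reference quoted in the question asks for.

```python
import numpy as np

# The three points visible in the example above, one row per point.
points = np.array([[100, 433], [157, 377], [147, 247]], dtype=np.float32)

# Nested layout shown above: N points, each wrapped in a 1 x 2 array.
nested = points.reshape(-1, 1, 2)

print(nested.shape)   # (3, 1, 2)
print(nested[0])      # [[100. 433.]]
print(nested[0][0])   # [100. 433.]

# One row per sample, as the kmeans reference describes its input.
flat = nested.reshape(-1, 2)
print(flat.shape)     # (3, 2)
```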