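jaccard_similarity_score raises ValueError: continuous-multioutput is not supported

I want to compute the similarity between the clusters produced by KMeans using the Jaccard index (jaccard_similarity_score, imported from sklearn.metrics). The result should be a matrix in which entry [i, j] is the similarity between cluster i and cluster j. My current code: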
import numpy as np
from sklearn import datasets
from sklearn.cluster import KMeans
from sklearn.metrics import jaccard_similarity_score

iris = datasets.load_iris()
X = iris.data
kmeans = KMeans(n_clusters=3).fit(X)
labels = kmeans.labels_

# Attempt to compare every pair of clusters (this call raises the error below)
for i in range(3):
    for j in range(3):
        print(jaccard_similarity_score(X[np.where(labels==i)], X[np.where(labels==j)]))
However, I get the following error:
Traceback (most recent call last):
  File "<ipython-input-15-e7b8e4471987>", line 3, in <module>
    print(jaccard_similarity_score(X[np.where(labels==i)], X[np.where(labels==j)]))
  File "C:\Anaconda3\envs\p3\lib\site-packages\sklearn\metrics\classification.py", line 383, in jaccard_similarity_score
    y_type, y_true, y_pred = _check_targets(y_true, y_pred)
  File "C:\Anaconda3\envs\p3\lib\site-packages\sklearn\metrics\classification.py", line 89, in _check_targets
    raise ValueError("{0} is not supported".format(y_type))
ValueError: continuous-multioutput is not supported
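What are the two loops over i and j supposed to do? Why are you calling jaccard_similarity_score inside the loops? –
Because I want to compute the Jaccard index for each pair of clusters. The results should go into the matrix entries [i][j]. – user6808217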
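
For reference, the error arises because jaccard_similarity_score compares two label vectors of equal length (y_true and y_pred), while the loops pass it slices of the continuous feature matrix X. What follows is only a minimal sketch of the pairwise matrix described in the comment, under the assumption that each cluster can be represented as a set of discrete, hashable items. The helper names jaccard and cluster_jaccard_matrix and the toy data are hypothetical and not part of scikit-learn.

```python
def jaccard(a, b):
    """Set-based Jaccard index: |a & b| / |a | b|.

    Returns 1.0 when both sets are empty (a convention chosen for this sketch).
    """
    a, b = set(a), set(b)
    union = a | b
    if not union:
        return 1.0
    return len(a & b) / len(union)


def cluster_jaccard_matrix(items, labels, n_clusters):
    """Build an n_clusters x n_clusters matrix of pairwise Jaccard indices.

    items  : one hashable value per sample (assumption: already discretized)
    labels : the cluster label of each sample, in the range 0..n_clusters-1
    """
    # Collect the distinct items that occur in each cluster.
    clusters = [set() for _ in range(n_clusters)]
    for item, label in zip(items, labels):
        clusters[label].add(item)
    # matrix[i][j] holds the similarity between cluster i and cluster j.
    return [
        [jaccard(clusters[i], clusters[j]) for j in range(n_clusters)]
        for i in range(n_clusters)
    ]


# Toy data, invented for illustration: seven samples in three clusters.
items = [1, 2, 3, 2, 3, 4, 5]
labels = [0, 0, 0, 1, 1, 1, 2]
matrix = cluster_jaccard_matrix(items, labels, 3)
for row in matrix:
    print(row)
```

With the iris example, one could pass a discretized feature such as [round(float(v), 1) for v in X[:, 0]] as items and kmeans.labels_ as labels; which discretization is meaningful depends on the data and is an assumption here. If each cluster were represented by its complete rows instead, the off-diagonal entries would all be 0, because identical rows are always assigned to the same cluster.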