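Creating the data frame is straightforward, but I think the "X1_Center" column is actually 4 columns, since you have 4 different features. Is this the best answer to your actual need? I don't think it is.
However, here is the code that generates it: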
results <- kmeans(iris[,c("Sepal.Length","Sepal.Width","Petal.Length","Petal.Width")], 3)
library(data.table)
data <- iris
setDT(data)
# creating cluster_ID
data[,cluster_ID:=results$cluster]
# creating the X1, X2, X3 columns
data[,':='(X1=0,X2=0,X3=0)]
data[cluster_ID==1,X1:=1]
data[cluster_ID==2,X2:=1]
data[cluster_ID==3,X3:=1]
# add the duplicated center coordinates (each center repeated on every row)
data <- cbind(data,rep(1,nrow(data)) %*% t.default(results$centers[1,]))
data <- cbind(data,rep(1,nrow(data)) %*% t.default(results$centers[2,]))
data <- cbind(data,rep(1,nrow(data)) %*% t.default(results$centers[3,]))
# setnames for the added columns
setnames(data,c(names(data)[1:9],
paste0("X1_center_",names(data)[1:4]),
paste0("X2_center_",names(data)[1:4]),
paste0("X3_center_",names(data)[1:4])))
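If what you actually need is the center of the cluster each row belongs to, rather than all three centers repeated on every row, indexing `results$centers` by the cluster assignment may be enough. This is only a sketch under that assumption; the names `features`, `own_center` and the `center_` prefix are mine, not part of your question, and `set.seed(42)` is there only so the clustering is reproducible.

```r
library(data.table)

set.seed(42)  # assumption: fixed seed, only for reproducibility
features <- c("Sepal.Length", "Sepal.Width", "Petal.Length", "Petal.Width")
results <- kmeans(iris[, features], centers = 3)

data <- as.data.table(iris)
data[, cluster_ID := results$cluster]

# Row i of own_center is the center of the cluster that row i was assigned to
own_center <- results$centers[results$cluster, , drop = FALSE]
colnames(own_center) <- paste0("center_", features)

# Attach the 4 center coordinates as new columns
data <- cbind(data, as.data.table(own_center))
```

This gives 4 extra columns instead of 12, and the indicator columns X1, X2, X3 are no longer needed because `cluster_ID` already identifies the cluster.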
Adding `clusterID` to `iris` makes sense: `iris <- cbind(iris, clusterID = results$cluster)`, but I'm not sure adding the centers to `iris` makes sense. – zx8754