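Error training with the caret method rpart in R

2016-10-27

I get an error when I try to train on a dataset with the caret package. The error is as follows.... I also get warnings(), and they are all identical, because I create the tuneGrid object with the following code... grid <- expand.grid(cp = seq(0, 0.05, 0.005)). This code creates a data.frame with 11 rows, which matches the 11 warnings I get.

Here is the warning... In eval(expr, envir, enclos) : model fit failed for Fold01: cp=0 Error in `[.data.frame`(m, labs) : undefined columns selected.

It looks as though cp is empty, yet I can go to my environment and see the grid object with all 11 rows. I have searched Stack Overflow and found similar questions, but because these functions can be tuned in so many ways, I have not found one that solves my problem. Here is my code...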

require(rpart) 
require(rattle) 
require(rpart.plot) 
require(caret) 


setwd('~/Documents/Lipscomb/predictive_analytics/class4/') 
data <- read.csv(file = 'data.csv', 
       head = FALSE) 

data <- subset(data, select = -V1) 

colnames(data) <- c('diagnostic', 'm.radius', 'm.texture', 'm. perimeter', 'm.area', 'm.smoothness', 'm.compactness', 'm.concavity', 'm.concave.points', 'm.symmetry', 'm.fractal.dimension', 
        'se.radius', 'se.texture', 'se. perimeter', 'se.area', 'se.smoothness', 'se.copactness', 'se.concavity', 'se.concave.points', 'se.symmetry', 'se.fractal.dimension', 
        'w.radius', 'w.texture', 'w. perimeter', 'w.area', 'w.smoothness', 'w.copactness', 'w.concavity', 'w.concave.points', 'w.symmetry', 'w.fractal.dimension') 

str(data) 

set.seed(7) 
sample.train <- sample(1:nrow(data), nrow(data) * .8) 
sample.test <- setdiff(1:nrow(data), sample.train) 


data.train <- data[sample.train, ] 
data.test <- subset(data[sample.test, ], select = -diagnostic) 

rpart.tree <- rpart(diagnostic ~ ., data = data.train) 
out <- predict(rpart.tree, data.test, type = 'class') 
table(out, data[sample.test, ]$diagnostic) 

fancyRpartPlot(rpart.tree) 

temp <- rpart.control(xval = 10, minbucket = 2, minsplit = 4, cp = 0) 
dfit <- rpart(diagnostic ~ ., data = data.train, control = temp) 
fancyRpartPlot(dfit) 

fit.control <- trainControl(method = 'cv', number = 10) 
grid <- expand.grid(cp = seq(0, 0.05, 0.005)) 
trained.tree <- train(diagnostic ~ ., method = 'rpart', data = data.train, 
         metric = 'Accuracy', maximize = TRUE, 
         trControl = fit.control, tuneGrid = grid) 

Answer


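I found a solution to this problem: I changed the way I name my columns. For some reason, the original code that set the colnames caused the error in the train function. This code fixed it.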

colnames(data) <- c('diagnostic', 'radius', 'texture', 'perimeter', 'area', 'smoothness', 'compactness', 'concavity', 'concavePoints', 'symmetry', 'fractalDimension', 
        'SeRadius', 'SeTexture', 'SePerimeter', 'SeArea', 'SeSmoothness', 'SeCopactness', 'SeConcavity', 'SeConcavePoints', 'SeSymmetry', 'SeFractalDimension', 
        'Wradius', 'Wtexture', 'Wperimeter', 'Warea', 'Wsmoothness', 'Wcopactness', 'Wconcavity', 'WconcavePoints', 'Wsymmetry', 'WfractalDimension')
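
The thread does not say why the renaming helped. One visible difference is that three of the original names contain a space ('m. perimeter', 'se. perimeter', 'w. perimeter'), so they are not syntactically valid R names. It is an assumption, not something confirmed here, that those names are what tripped up the formula interface of train. Under that assumption, a minimal sketch of cleaning the names with base R's make.names() rather than retyping them (the shortened old.names vector is for illustration only):

```r
# A few of the original column names from the question (shortened for illustration).
old.names <- c('diagnostic', 'm.radius', 'm. perimeter', 'se. perimeter', 'w. perimeter')

# Names that are not syntactically valid in R (here: the ones containing a space).
bad <- old.names[old.names != make.names(old.names)]
print(bad)

# make.names() replaces invalid characters with '.';
# unique = TRUE also guards against accidental duplicates.
new.names <- make.names(old.names, unique = TRUE)
print(new.names)

# Applied to the full data frame from the question, this would be:
# colnames(data) <- make.names(colnames(data), unique = TRUE)
```

Checking colnames(data) this way before calling train makes it easy to see which names changed, so a typo in a hand-written name vector is caught early.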