2017-08-02 54 views

Prediction error with h2o.deeplearning in R

I trained a deep neural network (DNN) using the h2o.deeplearning function.

Then I used the h2o.predict function to generate predictions for the test data.

But when I try to display the actual values against the predicted values, I get an error. Here is my code:

library("h2o") 
h2o.init(nthreads = -1, max_mem_size = "5G") 

credit<-read.csv("http://freakonometrics.free.fr/german_credit.csv", header=TRUE) 
F=c(1,2,4,5,7,8,9,10,11,12,13,15,16,17,18,19,20,21) 
for(i in F) credit[,i]=as.factor(credit[,i]) 
str(credit) 

library(caret) 
set.seed(1000) 
intrain<-createDataPartition(y=credit$Creditability, p=0.7, list=FALSE) 
train<-credit[intrain, ] 
test<-credit[-intrain, ] 


deep_train<-as.h2o(train,destination_frame = "deep_train") 
deep_test<-as.h2o(test,destination_frame = "deep_test") 


h2o.str(deep_train) 
h2o.str(deep_test) 

x<-names(train[,-1]) 
y<-"Creditability" 

deep_model<-h2o.deeplearning(x=x, y=y, 
          training_frame = deep_train, 
          activation = "RectifierWithDropout", 
          hidden=c(30,40,50), 
          epochs = 10, 
          input_dropout_ratio = 0.2, 
          hidden_dropout_ratios = c(0.5,0.5,0.5), 
          l1=1e-5 ,l2= 0, 
          rho = 0.99, epsilon = 1e-08, 
          loss = "CrossEntropy", 
          variable_importances = TRUE) 



pred<-h2o.predict(deep_model, newdata=deep_test) 

confusionMatrix(pred$predict, test$Creditability) 
Error in unique.default(x, nmax = nmax) : 
    invalid type/length (environment/0) in vector allocation 

How can I display the prediction table?

Answer


The pred object is an H2OFrame.

> class(pred) 
[1] "H2OFrame" 
> head(pred) 
  predict        p0        p1
1       1 0.1776320 0.8223680
2       1 0.1959193 0.8040807
3       1 0.2143592 0.7856408
4       1 0.1561238 0.8438762
5       1 0.1461881 0.8538119
6       0 0.2978314 0.7021686

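The confusionMatrix() function comes from the caret package and does not know what to do with an H2OFrame object; that is the cause of the error. caret::confusionMatrix() expects its first argument to be an ordinary in-memory R object (a factor of predicted classes), not a reference to data held in the H2O cluster.

If you want to use caret::confusionMatrix(), you just need to convert the pred object to the right format: pull it from the H2O cluster into R memory, then convert the predicted column to a factor.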

> confusionMatrix(as.factor(as.data.frame(pred$predict)[,1]), test$Creditability) 

Confusion Matrix and Statistics

          Reference
Prediction   0   1
         0  13  10
         1  77 200

               Accuracy : 0.71
                 95% CI : (0.6551, 0.7607)
    No Information Rate : 0.7
    P-Value [Acc > NIR] : 0.3793

                  Kappa : 0.123
 Mcnemar's Test P-Value : 1.484e-12

            Sensitivity : 0.14444
            Specificity : 0.95238
         Pos Pred Value : 0.56522
         Neg Pred Value : 0.72202
             Prevalence : 0.30000
         Detection Rate : 0.04333
   Detection Prevalence : 0.07667
      Balanced Accuracy : 0.54841

       'Positive' Class : 0
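
The conversion step can also be tried without a running H2O cluster. The sketch below is a minimal illustration in base R, not the code from the question: the data frame pred_df stands in for as.data.frame(pred) and reuses the six rows printed by head(pred) above, while the actual vector is invented purely for illustration (the true test labels for those rows are not shown in the post). It assumes the classes are coded 0 and 1, and it uses base table() rather than caret so that it has no package dependencies.

```r
# Stand-in for as.data.frame(pred): the six rows shown by head(pred) above.
pred_df <- data.frame(
  predict = c(1, 1, 1, 1, 1, 0),
  p0 = c(0.1776320, 0.1959193, 0.2143592, 0.1561238, 0.1461881, 0.2978314),
  p1 = c(0.8223680, 0.8040807, 0.7856408, 0.8438762, 0.8538119, 0.7021686)
)

# Hypothetical true labels, invented for this illustration only.
actual <- factor(c(1, 1, 0, 1, 1, 0), levels = c(0, 1))

# Convert the predicted column to a factor with the same levels as the
# reference, so both classes appear even if one is never predicted.
predicted <- factor(pred_df$predict, levels = levels(actual))

# Cross-tabulate predictions against the reference labels.
cm <- table(Prediction = predicted, Reference = actual)
print(cm)

# Overall accuracy: share of cases on the diagonal of the table.
accuracy <- sum(diag(cm)) / sum(cm)
print(accuracy)
```

Giving both factors the same levels is the design choice that matters here: a comparison of two factors with different level sets is where this kind of call most often fails. With caret loaded, the same two factors could be passed as confusionMatrix(predicted, actual).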

Alternatively, you can call the h2o.confusionMatrix() function directly on the deep_model object.


Thanks. But how do I use h2o.confusionMatrix?


See the R documentation for that function: 'h2o.confusionMatrix(deep_model, newdata = deep_test)'