2016-04-27

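R xgboost predict with early.stop.round

I have the following code. Suppose training stops after round 600, with the best round being round 450. Which model is used for prediction: the one after round 450 or the one after round 600?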

watchlist <- list(val=dval,train=dtrain) 

param <- list(objective        = "binary:logistic",
              booster          = "gbtree",
              eval_metric      = "auc",
              eta              = 0.02,
              max_depth        = 7,
              subsample        = 0.6,
              colsample_bytree = 0.7)

clf <- xgb.train(params           = param,
                 data             = dtrain,
                 nrounds          = 2000,
                 verbose          = 0,
                 early.stop.round = 150,
                 watchlist        = watchlist,
                 maximize         = TRUE)

preds <- predict(clf, test) 

Answer


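After some research, I found the answer myself. The prediction uses the model after round 600. To predict with the best-performing model instead, use preds <- predict(clf, test, ntreelimit=clf$bestInd)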
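To see why the last round and the best round differ by exactly the early-stopping window, here is a minimal base-R sketch of the stopping rule. It does not call xgboost at all: the AUC curve is made up so that it peaks at round 450, and `patience`, `best_round` and `last_round` are hypothetical names used only for this illustration.

```r
# Toy illustration only: a made-up validation AUC curve that peaks at round 450.
# Nothing here calls xgboost; all variable names are hypothetical.
nrounds  <- 2000
patience <- 150                                    # plays the role of early.stop.round
auc <- 0.9 - ((seq_len(nrounds) - 450) / 1000)^2   # highest value at round 450

best_score <- -Inf
best_round <- 0
last_round <- 0
for (i in seq_len(nrounds)) {
  last_round <- i
  if (auc[i] > best_score) {             # maximize = TRUE: higher is better
    best_score <- auc[i]
    best_round <- i
  }
  if (i - best_round >= patience) break  # no improvement for `patience` rounds
}

cat("rounds trained:", last_round, " best round:", best_round, "\n")
```

With this curve the loop stops at round 600 while the best round is 450. The 150 rounds in between were still trained, which is why predicting without a tree limit uses all of them and why the answer passes ntreelimit=clf$bestInd.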