Classification of new instances in Weka

On our training set, we performed feature selection (e.g. CfsSubsetEval with GreedyStepwise) and then classified the instances with a classifier (e.g. J48). We saved the model that Weka created.

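Now we want to classify new [unlabeled] instances, which still have the original number of attributes that the training set had before feature selection. Are we right to assume that we should apply feature selection to this set of new [unlabeled] instances so that we can re-evaluate it with the saved model (i.e. to make the training and test sets compatible)? If so, how do we filter the test set?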

Thanks for your help!

2013-05-18 Dids

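Yes, the test and training sets must have the same number of attributes, and each attribute must correspond to the same thing. So before classifying, you should remove from the test set the same attributes that you removed from the training set.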
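To make the point concrete, here is a minimal plain-Java sketch of the idea. It deliberately does not use the Weka API: the class `AttributeSubsetDemo`, its methods, and the sample values are hypothetical and only illustrate that the list of kept column indices is decided once, on the training data, and then reused unchanged on the test data.

```java
import java.util.Arrays;

// Plain-Java illustration only (not the Weka API); all names are hypothetical.
final class AttributeSubsetDemo {

    // Returns a copy of `row` holding only the columns listed in `keptIndices`, in that order.
    static double[] keep(double[] row, int[] keptIndices) {
        double[] out = new double[keptIndices.length];
        for (int i = 0; i < keptIndices.length; i++) {
            out[i] = row[keptIndices[i]];
        }
        return out;
    }

    // Applies the same column subset to every row of a data set.
    static double[][] keepAll(double[][] rows, int[] keptIndices) {
        double[][] out = new double[rows.length][];
        for (int r = 0; r < rows.length; r++) {
            out[r] = keep(rows[r], keptIndices);
        }
        return out;
    }

    public static void main(String[] args) {
        // Assumed result of feature selection on the training set: keep columns 0, 2 and 3.
        int[] kept = {0, 2, 3};
        double[][] train = {{1.0, 9.0, 0.5, 1.0}, {2.0, 8.0, 0.7, 0.0}};
        double[][] test = {{3.0, 7.0, 0.9, 0.0}};

        // The same index list is used for both sets, so their columns stay aligned.
        System.out.println(Arrays.deepToString(keepAll(train, kept)));
        System.out.println(Arrays.deepToString(keepAll(test, kept)));
    }
}
```

Running feature selection separately on the test set could pick a different subset, which is why the indices come from the training set only.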

2013-05-20 08:23:22

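I don't think you have to run feature selection on the test set. If the test set still has the original number of attributes, load it and then, in the Preprocess tab, manually remove all the attributes that were removed from the training set file during feature selection.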

2013-07-29 09:29:36 PGreen

You must apply to the test set the same filter that you previously applied to the training set. You can also do this with the WEKA API:

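import weka.attributeSelection.AttributeSelection;
import weka.attributeSelection.CfsSubsetEval;
import weka.attributeSelection.GreedyStepwise;
import weka.core.Instances;
import weka.filters.Filter;
import weka.filters.unsupervised.attribute.Remove;

Instances trainSet = // get training set (class index must be set)
Instances testSet = // get testing set (same original attributes and class index)

// Run feature selection on the training data only
AttributeSelection attsel = new AttributeSelection();
CfsSubsetEval eval = new CfsSubsetEval();
GreedyStepwise search = new GreedyStepwise();
attsel.setEvaluator(eval);
attsel.setSearch(search);
attsel.SelectAttributes(trainSet);

// Indices of the selected attributes (the class index is included when a class attribute is set)
int[] retArr = attsel.selectedAttributes();

// Set up a filter that retains the selected attributes and removes all others
Remove remove = new Remove();
remove.setAttributeIndicesArray(retArr);
remove.setInvertSelection(true);
remove.setInputFormat(trainSet);
trainSet = Filter.useFilter(trainSet, remove);

// Now apply the same, already initialised filter to the testing set as well
testSet = Filter.useFilter(testSet, remove);

// Both sets now have the same attributes, so you are good to go!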

2014-03-03 06:06:07 abhinna11
