WEKA: likelihood of the classification classes

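I would like to know whether WEKA has a way to output several "best guesses" for a classification.

My scenario is this: I classify my data with, for example, cross-validation, and then get something like this in Weka's output: these are the 3 best guesses for the classification of this instance. What I want is that, even when an instance is not classified correctly, I still get an output of the 3 or 5 best guesses for it.

Example:

Classes: A, B, C, D, E. Instances: 1 ... 10

And the output would be: instance 1 is 90% likely to be class A, 75% likely to be class B, 60% likely to be class C.

Thanks.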

Source

2012-08-14 user1454263

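I don't know whether you can do this natively, but you can get the probability of each class, sort them, and take the top three.

The function you want is distributionForInstance(Instance instance), which returns a double[] holding the probability of each class.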
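The sort-and-take-the-top-three step could look roughly like the sketch below. It is plain Java with no Weka dependency: the class name TopGuesses, the helper topN, and the sample numbers are made up for illustration, and the double[] is assumed to be whatever distributionForInstance returned for one instance.

```java
import java.util.Arrays;

class TopGuesses {
    // Returns the indices of the n largest entries of distribution,
    // ordered from most to least probable. Ties keep their index order.
    static int[] topN(double[] distribution, int n) {
        Integer[] order = new Integer[distribution.length];
        for (int i = 0; i < order.length; i++) {
            order[i] = i;
        }
        // Sort the class indices by descending probability.
        Arrays.sort(order, (a, b) -> Double.compare(distribution[b], distribution[a]));
        int k = Math.min(n, order.length);
        int[] top = new int[k];
        for (int i = 0; i < k; i++) {
            top[i] = order[i];
        }
        return top;
    }

    public static void main(String[] args) {
        // Hypothetical values standing in for a real distributionForInstance() result.
        String[] labels = {"A", "B", "C", "D", "E"};
        double[] distribution = {0.05, 0.50, 0.10, 0.30, 0.05};
        for (int idx : topN(distribution, 3)) {
            System.out.printf("%s: %.2f%n", labels[idx], distribution[idx]);
        }
    }
}
```

The index positions in the array correspond to the values of the class attribute, so the label lookup in a real Weka program would go through the dataset's class attribute rather than a hard-coded array.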

Source

2012-08-14 20:57:19 Antimony

OK, thanks, I gave it a try. – user1454263 2012-08-17 12:36:21

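Not in general. The information you are after is not available with all classifiers: in most cases (e.g. for decision trees), the decision is clear-cut (though possibly incorrect) and comes without a confidence value. Your task needs a classifier that can handle uncertainty (such as the naive Bayes classifier).

Technically, the easiest thing to do is probably to train the model and then classify a single instance, for which Weka should give you the output you want. In general you can also do this for a set of instances, but I don't think Weka provides that out of the box. You would probably have to write custom code or use it through an API (for example in R).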
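To illustrate why a classifier such as naive Bayes naturally yields a confidence for every class, here is a minimal sketch of the underlying arithmetic: multiply each class prior by the likelihoods of the observed attribute values, then normalise. This is not Weka's implementation; the class name PosteriorSketch, the method posterior, and all numbers are hypothetical.

```java
class PosteriorSketch {
    // priors[c] is P(class c).
    // likelihoods[c][j] is P(observed value of attribute j | class c).
    // Assumes at least one class ends up with a non-zero score.
    static double[] posterior(double[] priors, double[][] likelihoods) {
        double[] scores = new double[priors.length];
        double total = 0.0;
        for (int c = 0; c < priors.length; c++) {
            double score = priors[c];
            for (double p : likelihoods[c]) {
                score *= p;
            }
            scores[c] = score;
            total += score;
        }
        // Normalise so that the per-class scores sum to 1.
        for (int c = 0; c < scores.length; c++) {
            scores[c] /= total;
        }
        return scores;
    }

    public static void main(String[] args) {
        // Two classes, two attributes; made-up probabilities.
        double[] priors = {0.5, 0.5};
        double[][] likelihoods = {{0.8, 0.5}, {0.2, 0.5}};
        for (double p : posterior(priors, likelihoods)) {
            System.out.println(p);
        }
    }
}
```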

Source

2012-08-14 21:00:13

I intend to use it through the API – user1454263 2012-08-17 12:37:04

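When you calculate the probabilities for an instance, how exactly do you do it?

I have posted my PART rules and the data for the new instance here, but as far as calculating it manually goes, I'm not sure how to do this! Thanks

EDIT: now calculated: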

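private float[] getProbDist(String split) {

    // Takes in something such as "(52/2)". In Weka's tree/rule output this
    // means 52 instances are covered and 2 of them are misclassified.
    // Assumption: the counts come from splitting the string on "/" after
    // stripping the parentheses; the original snippet did not show this step.
    String[] prob_dis = split.replaceAll("[()\\s]", "").split("/");

    if (prob_dis.length > 2)
        return null;

    // A single number such as "(52)" is treated as zero misclassified instances.
    if (prob_dis.length == 1) {
        String temp = prob_dis[0];
        prob_dis = new String[2];
        prob_dis[0] = temp;
        prob_dis[1] = "0";
    }

    float p1 = Float.parseFloat(prob_dis[0]); // instances covered
    float p2 = Float.parseFloat(prob_dis[1]); // instances misclassified
    // assumes two tags
    float[] tag_prob = new float[2];

    // Set the error rate first, then derive its complement from it.
    tag_prob[0] = p2 / p1;
    tag_prob[1] = 1 - tag_prob[0];

    // returns float[] as being the probabilities
    return tag_prob;
}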

Source

2012-08-20 21:45:01 redrubia

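Weka's API has a method called Classifier.distributionForInstance() that can be used to get the classification prediction distribution. You can then sort the distribution by decreasing probability to get your top-N predictions.

Below is a function that prints out: (1) the test instance's ground-truth label; (2) the predicted label from classifyInstance(); and (3) the prediction distribution from distributionForInstance(). I have used this with J48, but it should work with other classifiers.

The input parameters are the serialized model file (which you can create during the model training phase by applying the -d option) and the test file in ARFF format.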

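// Imports needed by this method (they belong at the top of the enclosing class file).
import weka.classifiers.Classifier;
import weka.core.Instances;
import weka.core.converters.ConverterUtils.DataSource;

public void test(String modelFileSerialized, String testFileARFF)
    throws Exception
{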
    // Deserialize the classifier. 
    Classifier classifier = 
     (Classifier) weka.core.SerializationHelper.read(
      modelFileSerialized); 

    // Load the test instances. 
    Instances testInstances = DataSource.read(testFileARFF); 

    // Mark the last attribute in each instance as the true class. 
    testInstances.setClassIndex(testInstances.numAttributes()-1); 

    int numTestInstances = testInstances.numInstances(); 
    System.out.printf("There are %d test instances\n", numTestInstances); 

    // Loop over each test instance. 
    for (int i = 0; i < numTestInstances; i++) 
    { 
     // Get the true class label from the instance's own classIndex. 
     String trueClassLabel = 
      testInstances.instance(i).toString(testInstances.classIndex()); 

     // Make the prediction here. 
     double predictionIndex = 
      classifier.classifyInstance(testInstances.instance(i)); 

     // Get the predicted class label from the predictionIndex. 
     String predictedClassLabel = 
      testInstances.classAttribute().value((int) predictionIndex); 

     // Get the prediction probability distribution. 
     double[] predictionDistribution = 
      classifier.distributionForInstance(testInstances.instance(i)); 

     // Print out the true label, predicted label, and the distribution. 
     System.out.printf("%5d: true=%-10s, predicted=%-10s, distribution=", 
          i, trueClassLabel, predictedClassLabel); 

     // Loop over all the prediction labels in the distribution. 
     for (int predictionDistributionIndex = 0; 
      predictionDistributionIndex < predictionDistribution.length; 
      predictionDistributionIndex++) 
     { 
      // Get this distribution index's class label. 
      String predictionDistributionIndexAsClassLabel = 
       testInstances.classAttribute().value(
        predictionDistributionIndex); 

      // Get the probability. 
      double predictionProbability = 
       predictionDistribution[predictionDistributionIndex]; 

      System.out.printf("[%10s : %6.3f]", 
           predictionDistributionIndexAsClassLabel, 
           predictionProbability); 
     } 

     System.out.printf("\n");
    } 
}
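Since the question asks about cases where the top prediction is wrong, it may also help to check whether the true class is among the N most probable ones. The sketch below is plain Java and only an illustration: the class name TopNHit and the method inTopN are made up. In the function above, distribution would be predictionDistribution and trueIndex would be (int) testInstances.instance(i).classValue().

```java
class TopNHit {
    // Returns true if the class at trueIndex is among the n most probable
    // classes. Ties are counted in favour of the true class.
    static boolean inTopN(double[] distribution, int trueIndex, int n) {
        int strictlyHigher = 0;
        for (double p : distribution) {
            if (p > distribution[trueIndex]) {
                strictlyHigher++;
            }
        }
        return strictlyHigher < n;
    }

    public static void main(String[] args) {
        // Hypothetical distribution over three classes; the true class is index 1.
        double[] distribution = {0.5, 0.3, 0.2};
        System.out.println(inTopN(distribution, 1, 1)); // top prediction only
        System.out.println(inTopN(distribution, 1, 2)); // top two guesses
    }
}
```

Counting how often this returns true over all test instances gives a top-N hit rate, which is one way to judge whether the "best guesses" are useful even when the single predicted label is wrong.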

Source

2012-08-25 16:11:15 stackoverflowuser2010
