About RandomTree in Weka

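I was playing around with Weka when I noticed the minNum field in the RandomTree configuration. Its description reads "The minimum total weight of the instances in a leaf", but I could not really understand what that means.

I experimented with the value and found that the size of the tree decreases as I increase it. I cannot work out why this happens.

Any help or references would be appreciated.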

2011-01-30 Chander Shivdasani

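This refers to the minimum number of instances at a leaf node (in decision trees the default is often 2, as in J48). With unweighted data every instance has weight 1, so the "total weight" is simply the instance count. The higher you set this parameter, the more general the tree: nodes that hold few instances are no longer split further, whereas many leaves with only a few instances each yield an overly fine-grained tree structure.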

Here are two examples on the iris dataset that show how the -M option can affect the size of the resulting tree:

$ weka weka.classifiers.trees.RandomTree -t iris.arff -i 

petallength < 2.45 : Iris-setosa (50/0) 
petallength >= 2.45 
| petalwidth < 1.75 
| | petallength < 4.95 
| | | petalwidth < 1.65 : Iris-versicolor (47/0) 
| | | petalwidth >= 1.65 : Iris-virginica (1/0) 
| | petallength >= 4.95 
| | | petalwidth < 1.55 : Iris-virginica (3/0) 
| | | petalwidth >= 1.55 
| | | | sepallength < 6.95 : Iris-versicolor (2/0) 
| | | | sepallength >= 6.95 : Iris-virginica (1/0) 
| petalwidth >= 1.75 
| | petallength < 4.85 
| | | sepallength < 5.95 : Iris-versicolor (1/0) 
| | | sepallength >= 5.95 : Iris-virginica (2/0) 
| | petallength >= 4.85 : Iris-virginica (43/0) 

Size of the tree : 17 

$ weka weka.classifiers.trees.RandomTree -M 6 -t iris.arff -i 

petallength < 2.45 : Iris-setosa (50/0) 
petallength >= 2.45 
| petalwidth < 1.75 
| | petallength < 4.95 
| | | petalwidth < 1.65 : Iris-versicolor (47/0) 
| | | petalwidth >= 1.65 : Iris-virginica (1/0) 
| | petallength >= 4.95 : Iris-virginica (6/2) 
| petalwidth >= 1.75 
| | petallength < 4.85 : Iris-virginica (3/1) 
| | petallength >= 4.85 : Iris-virginica (43/0) 

Size of the tree : 11
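
To make the effect concrete, here is a minimal Python sketch of a toy one-attribute tree learner, not Weka's code. The stopping rule (a node with fewer than 2 * min_num instances becomes a leaf) is an assumption chosen because it is consistent with the two runs above; the names gini, tree_size and rows are hypothetical, and the data is made up for illustration.

```python
from collections import Counter

def gini(labels):
    """Gini impurity of a non-empty list of class labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def tree_size(rows, min_num):
    """Grow a tree on (x, label) rows and return its number of nodes."""
    labels = [y for _, y in rows]
    # Assumed stopping rule: the node is pure, or it holds too little
    # weight (here: too few instances) to be split any further.
    if len(set(labels)) == 1 or len(rows) < 2 * min_num:
        return 1
    best = None
    xs = sorted({x for x, _ in rows})
    # Try every midpoint between two neighbouring distinct values.
    for lo, hi in zip(xs, xs[1:]):
        t = (lo + hi) / 2
        left = [r for r in rows if r[0] < t]
        right = [r for r in rows if r[0] >= t]
        score = (len(left) * gini([y for _, y in left])
                 + len(right) * gini([y for _, y in right])) / len(rows)
        if best is None or score < best[0]:
            best = (score, left, right)
    if best is None:  # all x values identical, nothing to split on
        return 1
    return 1 + tree_size(best[1], min_num) + tree_size(best[2], min_num)

# Made-up, deliberately noisy data: no single threshold separates a from b.
rows = [(1, "a"), (2, "a"), (3, "b"), (4, "a"),
        (5, "b"), (6, "b"), (7, "a"), (8, "b")]

for m in (1, 4, 5):
    print("min_num =", m, "-> size", tree_size(rows, m))
```

With min_num = 5 the root itself is too small to split, with min_num = 4 only the root is split, and with min_num = 1 the tree keeps growing until every leaf is pure, so the size can only shrink as the threshold rises.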

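As a side note, random trees rely on subsampling of attributes (K attributes are chosen at random as split candidates at each node), the idea that RandomForest combines with bagging; unlike REPTree, however, there is no pruning (just as in RandomForest), so you may end up with very noisy trees.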
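
The per-node attribute subsampling can be sketched as follows. This is an illustration in Python rather than Weka's implementation; default_k and candidate_attributes are hypothetical names, and the default formula int(log_2(#predictors) + 1) is what I understand the -K option description to state for K < 1, so treat it as an assumption.

```python
import math
import random

def default_k(num_predictors):
    # Assumed default for K < 1: int(log_2(#predictors) + 1).
    return int(math.log2(num_predictors) + 1)

def candidate_attributes(attributes, k, rng):
    """Pick k distinct attributes to evaluate as split candidates at one node."""
    return rng.sample(attributes, min(k, len(attributes)))

attributes = ["sepallength", "sepalwidth", "petallength", "petalwidth"]
rng = random.Random(1)  # fixed seed so repeated runs pick the same subsets
k = default_k(len(attributes))

# Each node draws its own subset, so different nodes may see different attributes.
for node in range(3):
    print("node", node, "considers", candidate_attributes(attributes, k, rng))
```

Because each node only looks at some of the attributes and nothing is pruned afterwards, a single random tree can split on a weak attribute and keep that split, which is one way the noisy trees mentioned above come about.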

2011-05-18 10:08:35 chl
