2015-02-06

I am using SVM-Rank, which has several parameters; varying them gives me different results. Is there a mechanism for tuning these parameters and obtaining the best values, based on the best result on a validation set? How do I tune the parameters of SVM-Rank?

Here are the different parameters:

Learning Options: 
    -c float -> C: trade-off between training error 
        and margin (default 0.01) 
    -p [1,2] -> L-norm to use for slack variables. Use 1 for L1-norm, 
        use 2 for squared slacks. (default 1) 
    -o [1,2] -> Rescaling method to use for loss. 
        1: slack rescaling 
        2: margin rescaling 
        (default 2) 
    -l [0..] -> Loss function to use. 
        0: zero/one loss 
        ?: see below in application specific options 
        (default 1) 
Optimization Options (see [2][5]): 
    -w [0,..,9] -> choice of structural learning algorithm (default 3): 
        0: n-slack algorithm described in [2] 
        1: n-slack algorithm with shrinking heuristic 
        2: 1-slack algorithm (primal) described in [5] 
        3: 1-slack algorithm (dual) described in [5] 
        4: 1-slack algorithm (dual) with constraint cache [5] 
        9: custom algorithm in svm_struct_learn_custom.c 
    -e float -> epsilon: allow that tolerance for termination 
        criterion (default 0.001000) 
    -k [1..] -> number of new constraints to accumulate before 
        recomputing the QP solution (default 100) 
        (-w 0 and 1 only) 
    -f [5..] -> number of constraints to cache for each example 
        (default 5) (used with -w 4) 
    -b [1..100] -> percentage of training set for which to refresh cache 
        when no epsilon violated constraint can be constructed 
        from current cache (default 100%) (used with -w 4) 
SVM-light Options for Solving QP Subproblems (see [3]): 
    -n [2..q] -> number of new variables entering the working set 
        in each svm-light iteration (default n = q). 
        Set n < q to prevent zig-zagging. 
    -m [5..] -> size of svm-light cache for kernel evaluations in MB 
        (default 40) (used only for -w 1 with kernels) 
    -h [5..] -> number of svm-light iterations a variable needs to be 
        optimal before considered for shrinking (default 100) 
    -# int  -> terminate svm-light QP subproblem optimization, if no 
        progress after this number of iterations. 
        (default 100000) 
Kernel Options: 
    -t int  -> type of kernel function: 
        0: linear (default) 
        1: polynomial (s a*b+c)^d 
        2: radial basis function exp(-gamma ||a-b||^2) 
        3: sigmoid tanh(s a*b + c) 
        4: user defined kernel from kernel.h 
    -d int  -> parameter d in polynomial kernel 
    -g float -> parameter gamma in rbf kernel 
    -s float -> parameter s in sigmoid/poly kernel 
    -r float -> parameter c in sigmoid/poly kernel 
    -u string -> parameter of user defined kernel 
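To try several of the options above systematically, one approach is to generate one training command per parameter combination. The sketch below builds command lines for `svm_rank_learn` (the invocation format `svm_rank_learn [options] train.dat model.dat` follows the SVM-rank documentation; the file names and the particular `-c`/`-e` grids here are placeholders you would replace with your own):

```python
# Sketch: enumerate svm_rank_learn command lines over a small parameter grid.
# The train.dat / model.dat file names and the grid values are placeholders.
from itertools import product

c_values = [0.01, 0.1, 1, 10]   # -c: trade-off between training error and margin
epsilons = [0.001, 0.01]        # -e: termination tolerance

def build_commands(train_file="train.dat", model_file="model.dat"):
    """Return one command line per (C, epsilon) combination."""
    cmds = []
    for c, e in product(c_values, epsilons):
        cmds.append(f"svm_rank_learn -c {c} -e {e} {train_file} {model_file}")
    return cmds

for cmd in build_commands():
    print(cmd)
```

Each command trains one model; you would then score every model on the same validation set (e.g. with `svm_rank_classify`) and keep the settings that perform best.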

Answer

This is called grid search. I don't know whether you are familiar with Python and scikit-learn, but either way, I think their description and examples are very good, and language-agnostic.

Basically, for each parameter you specify a set of values you are interested in (or an interval to sample from at random; see random search). Then, for each combination of settings, you use cross-validation (typically k-fold cross-validation) to measure how well the model performs with those settings. The best-performing combination is returned (scikit-learn can actually return a ranking of the combinations).
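The procedure above can be sketched without any libraries. This is a minimal, illustrative implementation: `toy_evaluate` stands in for "train SVM-rank with these settings on the training folds and score it on the held-out fold", which in practice would shell out to `svm_rank_learn`/`svm_rank_classify`:

```python
# Minimal grid search with k-fold cross-validation, stdlib only.
from itertools import product
from statistics import mean

def k_fold_indices(n, k):
    """Split indices 0..n-1 into k roughly equal contiguous folds."""
    fold_size, folds, start = n // k, [], 0
    for i in range(k):
        extra = 1 if i < n % k else 0
        folds.append(list(range(start, start + fold_size + extra)))
        start += fold_size + extra
    return folds

def grid_search(param_grid, data, evaluate, k=5):
    """Try every parameter combination; return (best_params, best_mean_score)."""
    names, value_lists = zip(*param_grid.items())
    folds = k_fold_indices(len(data), k)
    best = (None, float("-inf"))
    for values in product(*value_lists):
        params = dict(zip(names, values))
        scores = []
        for test_idx in folds:
            train_idx = [j for f in folds if f is not test_idx for j in f]
            scores.append(evaluate(params, train_idx, test_idx, data))
        avg = mean(scores)
        if avg > best[1]:
            best = (params, avg)
    return best

# Toy scoring function: pretends the model is best at c=1, e=0.001.
def toy_evaluate(params, train_idx, test_idx, data):
    return -abs(params["c"] - 1) - 10 * abs(params["e"] - 0.001)

best_params, best_score = grid_search(
    {"c": [0.01, 0.1, 1, 10], "e": [0.001, 0.01]},
    data=list(range(20)), evaluate=toy_evaluate, k=5)
```

With the toy scorer, `grid_search` correctly recovers `{"c": 1, "e": 0.001}`. The cost is the product of all grid sizes times k evaluations, which is why the advice below about fixing parameters in advance matters.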

Note that this can take a very long time. Depending on your problem, you should fix some of the parameters yourself. For example, for text classification you should pick the linear kernel, while for other problems you might need the RBF kernel, etc. Don't just throw everything into the grid search: fix as many parameters as you can using your own knowledge of the algorithm and of the problem at hand.


Thanks @IVlad. Could you clarify "for text classification you should pick the linear kernel"? – 2015-02-07 06:16:32


@BitManipulator - what I mean is that it is well known in the literature that for text classification, the instances are (almost) linearly separable in the high-dimensional space produced by the bag-of-words model. So the linear kernel performs best, and there is no point in trying the others. Not trying multiple kernels means one fewer parameter to tune, which saves you a lot of time. – IVlad 2015-02-07 10:00:45