2015-01-09 105 views

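How do I get ksvm to predict unscaled values after training with scaling?

When I run an SVM with ksvm from the kernlab package, all output from the predict command on my final model is scaled. I understand this is because I set scaled = T, but I also know that scaling your data is preferred in support vector machine modeling. Is there an easy way to tell ksvm to return unscaled predictions? If not, is there a way to transform the scaled predictions back to the original values? Thank you. The code is below: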

svm1 <- ksvm(Y ~ 1 
      + X1 
      + X2 
      , data = data_nn 
      , scaled=T 
      , type = "eps-svr" 
      , kernel="anovadot" 
      , epsilon = svm1_CV2$bestTune$epsilon 
      , C = svm1_CV2$bestTune$C 
      , kpar = list(sigma = svm1_CV2$bestTune$sigma 
          , degree= svm1_CV2$bestTune$degree) 
      ) 

#Analyze Results 
data_nn$svm_pred <- predict(svm1) 

Answer


From the documentation:

argument scaled: 
A logical vector indicating the variables to be scaled. If scaled is of length 1, 
the value is recycled as many times as needed and all non-binary variables are scaled. 
Per default, data are scaled internally (both x and y variables) to zero mean and 
unit variance. The center and scale values are returned and used for later predictions. 
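
To make "zero mean and unit variance" concrete, here is a minimal base-R sketch of that kind of scaling. It does not call kernlab and is not a description of what ksvm does internally; it only assumes that the scaling in question is ordinary standardization, as performed by base R's scale(). The seed and the variable names scaled_mean, scaled_sd, center_used and scale_used are illustrative choices.

```r
# Base-R illustration of zero-mean, unit-variance scaling (no kernlab needed).
set.seed(1)                       # arbitrary seed, for reproducibility only
y <- runif(100, 100, 1000)        # a response between 100 and 1000

y_scaled <- scale(y)              # returns a one-column matrix

# After scaling, the values have mean ~0 and standard deviation 1.
scaled_mean <- mean(y_scaled)
scaled_sd   <- sd(as.vector(y_scaled))

# scale() stores the values it used, so the transformation can be undone later.
center_used <- attr(y_scaled, "scaled:center")  # equals mean(y)
scale_used  <- attr(y_scaled, "scaled:scale")   # equals sd(y)
```

A model fitted on values like y_scaled naturally produces output in these standardized units, which is why predictions can look like small numbers centered on zero.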

SOLUTION NO.1

Let's look at the following example:

#make random data set 
y <- runif(100,100,1000) #the response variable takes values between 100 and 1000 
x1 <- runif(100,100,500) 
x2 <- runif(100,100,500) 
df <- data.frame(y,x1,x2) 

Running this:

svm1 <- ksvm(y~1+x2+x2,data=df,scaled=T,type='eps-svr',kernel='anovadot') 

> predict(svm1) 
       [,1] 
    [1,] 0.290848927 
    [2,] -0.206473246 
    [3,] -0.076651875 
    [4,] 0.088779924 
    [5,] 0.036257375 
    [6,] 0.206106048 
    [7,] -0.189082081 
    [8,] 0.245768175 
    [9,] 0.206742751 
[10,] -0.238471569 
[11,] 0.349902743 
[12,] -0.199938921 

produces scaled predictions.

But if you change it as follows, in line with the documentation quoted above:

svm1 <- ksvm(y~1+x2+x2,data=df,scaled=c(F,T,T,T),type='eps-svr',kernel='anovadot') 
#I am using a logical vector here so predictions will be raw data. 
#only the intercept x1 and x2 will be scaled using the above. 
#btw scaling the intercept (number 1 in the formula), actually eliminates the intercept. 

> predict(svm1) 
      [,1] 
    [1,] 601.2630 
    [2,] 599.7238 
    [3,] 599.7287 
    [4,] 599.9112 
    [5,] 601.6950 
    [6,] 599.8382 
    [7,] 599.8623 
    [8,] 599.7287 
    [9,] 601.8496 
[10,] 599.0759 
[11,] 601.7348 
[12,] 601.7249 

As you can see, these predictions are on the raw (unscaled) scale of y.

SOLUTION NO.2

If you want the y variable to be scaled in the model, you need to unscale the predictions yourself.

Before running the model, compute the mean and standard deviation of y:

y2 <- scale(y) 
y_mean <- attributes(y2)$'scaled:center' #the mean 
y_std <- attributes(y2)$'scaled:scale' #the standard deviation 

Then convert the predictions back to raw values:

svm1 <- ksvm(y~1+x2+x2,data=df,scaled=T,type='eps-svr',kernel='anovadot') 

> predict(svm1) * y_std + y_mean 
      [,1] 
    [1,] 654.3604 
    [2,] 522.3578 
    [3,] 556.8159 
    [4,] 600.7259 
    [5,] 586.7850 
    [6,] 631.8674 
    [7,] 526.9739 
    [8,] 642.3948 
    [9,] 632.0364 
[10,] 513.8646 
[11,] 670.0349 
[12,] 524.0922 
[13,] 673.7202 

And you have the raw predictions!
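
The back-transformation in Solution No.2 can be sanity-checked without fitting a model. The sketch below wraps it in a helper called unscale, which is a hypothetical name introduced here and not a kernlab function, and uses the scaled response itself as a stand-in for scaled model output. It assumes the predictions were standardized with the same mean and standard deviation that scale(y) reports.

```r
# Hypothetical helper (not part of kernlab): map standardized values back
# to the original units using the stored center and scale.
unscale <- function(z, center, scale) {
  as.vector(z) * scale + center
}

set.seed(42)                      # arbitrary seed, for reproducibility only
y <- runif(100, 100, 1000)
y2 <- scale(y)
y_mean <- attr(y2, "scaled:center")  # the mean
y_std  <- attr(y2, "scaled:scale")   # the standard deviation

# Stand-in for scaled model output: here simply the scaled response itself.
scaled_pred <- y2
raw_pred <- unscale(scaled_pred, y_mean, y_std)

# The round trip recovers the original values up to floating-point error.
max_abs_err <- max(abs(raw_pred - y))
```

With a fitted model, the same helper would be applied as unscale(predict(svm1), y_mean, y_std), which is equivalent to the predict(svm1) * y_std + y_mean expression shown above.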


Awesome! Thank you very much! A very clear explanation. – gtnbz2nite 2015-01-12 16:44:31


:) Glad to have helped! – LyzandeR 2015-01-12 16:45:45