2016-03-21 64 views

Linear regression returns different results than the synthetic parameters

I tried this code:

from sklearn import linear_model
import numpy as np

x1 = np.arange(0, 10, 0.1)
x2 = x1 * 10

# synthetic target built with known coefficients 2 and 3
y = 2 * x1 + 3 * x2
X = np.vstack((x1, x2)).transpose()

reg_model = linear_model.LinearRegression()
reg_model.fit(X, y)

print(reg_model.coef_)
# should be [2,3]

# predict expects a 2D array of shape (n_samples, n_features)
print(reg_model.predict([[5, 6]]))
# should be 2*5 + 3*6 = 28

print(reg_model.intercept_)
# perfectly at the expected value of 0

print(reg_model.score(X, y))
# seems to be rather confident to be right

The result is:

  • [0.31683168 3.16831683]
  • 20.5940594059
  • 0.0
  • 1.0

So this is not what I expected: the fitted coefficients differ from the parameters used to synthesize the data. Why is that?

Answer


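Your problem is the uniqueness of the solution. Both dimensions carry the same information (applying a linear transformation to one dimension does not produce new data in the eyes of this model), so there are infinitely many solutions that fit your data. Here x2 = 10*x1, so y = 2*x1 + 3*x2 = 32*x1, and any coefficients (a, b) with a + 10*b = 32 fit the data exactly. Apply a nonlinear transformation to the second dimension and you will see the desired output.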

from sklearn import linear_model
import numpy as np

x1 = np.arange(0, 10, 0.1)
# nonlinear transform, so x2 is no longer a multiple of x1
x2 = x1 ** 2
X = np.vstack((x1, x2)).transpose()
y = 2 * x1 + 3 * x2

reg_model = linear_model.LinearRegression()
reg_model.fit(X, y)
print(reg_model.coef_)
# should be [2,3]

print(reg_model.predict([[5, 6]]))
# should be 2*5 + 3*6 = 28

print(reg_model.intercept_)
# expected value is 0, up to floating-point error

print(reg_model.score(X, y))

The output is:

  • [ 2. 3.]
  • [ 28.]
  • -2.84217094304e-14
  • 1.0
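
To make the non-uniqueness in the question's data concrete, here is a minimal sketch using only NumPy. It checks the rank of the original design matrix, shows that several different coefficient vectors reproduce y exactly, and computes the minimum-norm least-squares solution with `np.linalg.lstsq`. The variable names (`rank`, `candidates`, `coef_min_norm`) are illustrative and not part of the original post. It omits the intercept, which is 0 for this data.

```python
import numpy as np

# Same synthetic data as in the question: x2 is an exact multiple of x1.
x1 = np.arange(0, 10, 0.1)
x2 = x1 * 10
X = np.column_stack((x1, x2))
y = 2 * x1 + 3 * x2

# The two columns are collinear, so the matrix has rank 1, not 2.
rank = np.linalg.matrix_rank(X)
print("rank of X:", rank)

# Any (a, b) with a + 10*b = 32 reproduces y, including the "true" [2, 3].
candidates = [np.array([2.0, 3.0]), np.array([32.0, 0.0]), np.array([0.0, 3.2])]
for coef in candidates:
    print(coef, "fits exactly:", np.allclose(X @ coef, y))

# Among all exact fits, lstsq returns the one with the smallest Euclidean norm.
coef_min_norm, _, _, _ = np.linalg.lstsq(X, y, rcond=None)
print("minimum-norm coefficients:", coef_min_norm)
```

The minimum-norm coefficients are 32/101 and 320/101, which agree with the [0.31683168 3.16831683] reported in the question. That suggests the fitted model simply picked one of the infinitely many exact solutions, which is also why the score is still 1.0.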