2014-02-21

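LinearRegression predict: ValueError: matrices are not aligned

I've been searching Google and can't figure out what I'm doing wrong. I'm very new to Python and am trying to use scikit-learn on stock data, but when I try to predict I get the error "ValueError: matrices are not aligned".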

import datetime 

import numpy as np 
import pylab as pl 
from matplotlib import finance 
from matplotlib.collections import LineCollection 

from sklearn import cluster, covariance, manifold, linear_model 

from sklearn import datasets, linear_model 

############################################################################### 
# Retrieve the data from Internet 

# Choose a time period reasonably calm (not too long ago so that we get 
# high-tech firms, and before the 2008 crash) 
d1 = datetime.datetime(2003, 01, 01) 
d2 = datetime.datetime(2008, 01, 01) 

# kraft symbol has now changed from KFT to MDLZ in yahoo 
symbol_dict = { 
    'AMZN': 'Amazon'} 

symbols, names = np.array(symbol_dict.items()).T 

quotes = [finance.quotes_historical_yahoo(symbol, d1, d2, asobject=True) 
      for symbol in symbols] 

open = np.array([q.open for q in quotes]).astype(np.float) 
close = np.array([q.close for q in quotes]).astype(np.float) 

# The daily variations of the quotes are what carry most information 
variation = close - open 

######### 

pl.plot(range(0, len(close[0])-20), close[0][:-20], color='black') 

model = linear_model.LinearRegression(normalize=True) 
model.fit([close[0][:-1]], [close[0][1:]]) 

print(close[0][-20:]) 
model.predict(close[0][-20:]) 


#pl.plot(range(0, 20), model.predict(close[0][-20:]), color='red') 

pl.show() 

The line that raises the error is:

model.predict(close[0][-20:]) 

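I've tried nesting it in a list and converting it to a NumPy array, along with everything else I could find on Google, but I don't really know what I'm doing here.

What does this error mean, and why does it happen?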

You could also add a constant feature to the data with something like: X.add_constant(len(X))
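As a rough illustration of what "adding a constant feature" means, here is a minimal NumPy-only sketch. The toy data and the names `x`, `y`, and `X` are made up for illustration, and `np.column_stack` is used rather than the `add_constant` call mentioned in the comment. Note that scikit-learn's `LinearRegression` fits an intercept by default (`fit_intercept=True`), so an explicit constant column is usually not needed there.

```python
import numpy as np

# Toy data (hypothetical): y is exactly 2*x + 1.
x = np.array([1.0, 2.0, 3.0, 4.0])
y = 2.0 * x + 1.0

# Prepend a column of ones so the model can learn an intercept term.
X = np.column_stack([np.ones(len(x)), x])
print(X.shape)  # (4, 2): four samples, two features (constant, x)

# Ordinary least squares; coef[0] is the intercept, coef[1] is the slope.
coef, residuals, rank, singular_values = np.linalg.lstsq(X, y, rcond=None)
print(coef)
```

With this toy data the fitted coefficients come out at approximately an intercept of 1 and a slope of 2.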

Answer

Trying to predict stock prices with a simple linear regression? :^| Anyway, here is what you need to change:

In [19]: M = model.fit(close[0][:-1].reshape(-1, 1), close[0][1:].reshape(-1, 1))

In [31]: M.predict(close[0][-20:].reshape(-1, 1))
Out[31]: 
array([[ 90.92224274], 
     [ 94.41875811], 
     [ 93.19997275], 
     [ 94.21895723], 
     [ 94.31885767], 
     [ 93.030142 ], 
     [ 90.76240203], 
     [ 91.29187436], 
     [ 92.41075928], 
     [ 89.0940647 ], 
     [ 85.10803717], 
     [ 86.90624508], 
     [ 89.39376602], 
     [ 90.59257129], 
     [ 91.27189427], 
     [ 91.02214318], 
     [ 92.86031126], 
     [ 94.25891741], 
     [ 94.45871828], 
     [ 92.65052033]]) 

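Remember that when you fit a model, the X passed to .fit should have the shape [n_samples, n_features]; y should have n_samples rows as well, either as [n_samples] or as [n_samples, n_targets]. The same applies to the X passed to .predict.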
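To make the shape requirement concrete, here is a minimal sketch that uses a synthetic price series in place of `close[0]`; the data and the names `prices`, `X_wrong`, `X`, and `X_new` are made up for illustration. Wrapping the series in a list, as the question's `fit` call does, produces a single sample with many features, so a later `predict` on 20 values cannot be lined up with the fitted coefficients. Reshaping with `reshape(-1, 1)` gives one sample per day with a single feature instead.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Synthetic stand-in for close[0]: a random-walk "price" series (not real data).
rng = np.random.RandomState(0)
prices = 50.0 + np.cumsum(rng.normal(0.05, 1.0, size=300))

# Wrapping the 1-D series in a list yields ONE sample with 299 features.
X_wrong = np.array([prices[:-1]])
print(X_wrong.shape)  # (1, 299)

# reshape(-1, 1) yields 299 samples with one feature each.
X = prices[:-1].reshape(-1, 1)   # today's close
y = prices[1:]                   # next day's close
print(X.shape, y.shape)          # (299, 1) (299,)

model = LinearRegression()
model.fit(X, y)

# The input to predict must use the same [n_samples, n_features] layout.
X_new = prices[-20:].reshape(-1, 1)
predicted = model.predict(X_new)
print(predicted.shape)  # (20,)
```

Because `y` is one-dimensional here, `predict` returns a one-dimensional array; reshaping `y` to a column, as in the session above, gives a two-dimensional result instead.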

I'm not sure what to use instead. What would you recommend? I don't even know what LinearRegression means; I'm just trying to follow the examples with my own data.