2016-12-15 142 views

Python equivalent of R's auto.arima(): I want to implement the equivalent of R's auto.arima() function in Python

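In R, the auto.arima function takes the time series values as input, computes the ARIMA order parameters (the p, d, q values), and fits the model, so the user does not need to supply p, d, q.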

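I want to use an equivalent of auto.arima in Python (without calling R's auto.arima) to forecast future values of a time series. For the time series below, I run the Python auto.arima equivalent on 40 points and forecast the next 6 values, then move the window by 1 point and repeat the same procedure.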
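The rolling procedure described above can be sketched independently of the model. The following is only an illustration: `rolling_forecasts` and `naive_forecaster` are hypothetical names, and the naive "repeat the last value" forecaster is a stand-in for whatever order selection and ARIMA fit is eventually plugged in.

```python
from typing import Callable, Iterator, List, Sequence, Tuple

# A forecaster takes the training window and a horizon and returns forecasts.
Forecaster = Callable[[Sequence[float], int], List[float]]


def naive_forecaster(train: Sequence[float], horizon: int) -> List[float]:
    """Placeholder model: repeat the last observed value `horizon` times."""
    return [train[-1]] * horizon


def rolling_forecasts(
    series: Sequence[float],
    window: int = 40,
    horizon: int = 6,
    step: int = 1,
    forecaster: Forecaster = naive_forecaster,
) -> Iterator[Tuple[int, List[float]]]:
    """Yield (start, forecasts) for every window series[start:start + window]."""
    last_start = len(series) - window
    for start in range(0, last_start + 1, step):
        train = series[start:start + window]
        yield start, forecaster(train, horizon)
```

Swapping in a real model only requires a function with the same `(train, horizon)` signature, so the windowing logic can be checked on its own before any ARIMA fitting is involved.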

Here is the sample data:

value 
0 
2.584751 
2.884758 
2.646735 
2.882105 
3.267503 
3.94552 
4.70788 
5.384803 
54.77972 
62.87139 
78.68957 
112.7166 
155.0074 
170.8084 
196.1941 
237.4928 
254.9718 
175.0717 
217.3807 
244.7357 
274.4517 
304.6838 
373.3202 
345.6252 
461.2653 
443.5982 
472.3653 
469.3326 
506.8819 
532.1639 
542.2837 
514.9269 
528.0194 
540.539 
542.7031 
556.8262 
569.7132 
576.2339 
577.7212 
577.0873 
569.6199 
573.2445 
573.7825 
589.3506 

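I tried to write functions that compute the order of differencing using the augmented Dickey-Fuller (ADF) test, and then pass the differenced series (the original series differenced until the adfuller test reports it as stationary) to an ARMA order selection function to compute the p and q order values.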

I then pass these values to the ARIMA function in statsmodels. But the function does not seem to work.
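Differencing a series d times means that each pass operates on the output of the previous pass, and each pass shortens the series by one point. A minimal pure-Python sketch of that step follows; the helper name `difference` is hypothetical, and for a one-dimensional NumPy array `np.diff(x, n=d)` gives the same result.

```python
from typing import List, Sequence


def difference(series: Sequence[float], d: int = 1) -> List[float]:
    """Apply first differencing `d` times, feeding each pass its own output."""
    values = [float(x) for x in series]
    for _ in range(d):
        # Each pass replaces values with consecutive differences (one element shorter).
        values = [b - a for a, b in zip(values, values[1:])]
    return values
```

For example, `difference([1, 4, 9, 16], d=2)` differences the squares twice, and `d=0` returns the series unchanged (converted to floats).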

import numpy as np 
import pandas as pd 
import statsmodels.api as sm 
from statsmodels.tsa.stattools import adfuller 
from statsmodels.tsa.stattools import acf, pacf
import matplotlib.pyplot as plt

def diff_terms(timeseries):
    # Count how many times the series is differenced before the ADF test
    # statistic falls below the 5% critical value.
    i = 1
    j = 0
    while i != 0:
        dftest = adfuller(timeseries, autolag='AIC')
        if dftest[0] <= dftest[4]["5%"]:
            i = 0
        else:
            timeseries = np.diff(timeseries)
            i = 1
            j = j + 1
    return j

def p_q_values_estimator(timeseries):
    # Estimate p and q by comparing ACF/PACF values to the 1.96/sqrt(n) bound.
    p = 0
    q = 0
    lag_acf = acf(timeseries, nlags=20)
    lag_pacf = pacf(timeseries, nlags=20, method='ols')
    y = 1.96 / np.sqrt(len(timeseries))

    if lag_acf[0] < y:
        for a in lag_acf:
            if a < y:
                q = q + 1
                break
    elif lag_acf[0] > y:
        for c in lag_acf:
            if c > y:
                q = q + 1
                break

    if lag_pacf[0] < y:
        for b in lag_pacf:
            if b < y:
                p = p + 1
                break
    elif lag_pacf[0] > y:
        for d in lag_pacf:
            if d > y:
                p = p + 1
                break

    p_q = [p, q]
    return p_q

def p_q_values_estimator2(timeseries): 
    res = sm.tsa.arma_order_select_ic(timeseries, ic=['aic'], max_ar=5, max_ma=4,trend='nc') 
    return res.aic_min_order 

data1=[] 
data=pd.read_csv('ABC.csv') 
d_value=diff_terms(data.value) 
data1[:]=data[:] 
data = data[0:40] 

i=0 
while i < d_value: 
    data_diff = np.diff(data) 
    i = i+1 

p_q_values=p_q_values_estimator(data) 
p_value=p_q_values[0] 
q_value=p_q_values[1] 

p_q_values2=p_q_values_estimator2(data_diff) 
p_value2=p_q_values2[0] 
q_value2=p_q_values2[1] 


exogx = np.array(range(0,40)) 
fit2 = sm.tsa.ARIMA(np.array(data), (p_value, d_value, q_value), exog = exogx).fit() 
print(fit2.fittedvalues) 
pred2 = fit2.predict(start = 40, end = 45, exog = np.array(range(40,46))) 
print(pred2) 
plt.plot(fit2.fittedvalues) 
plt.plot(np.array(data)) 
plt.plot(range(40,46), np.array(pred2))
plt.show() 
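The ACF-based estimator above compares autocorrelations against the bound 1.96/sqrt(n). As a point of reference, here is a minimal pure-Python sketch of the sample autocorrelation and of which lags fall outside that band. The function names are hypothetical, this is a visual-inspection heuristic rather than what auto.arima does, and note that lag 0 always equals 1, so it is skipped when looking for significant lags.

```python
import math
from typing import List, Sequence


def sample_acf(series: Sequence[float], nlags: int) -> List[float]:
    """Sample autocorrelation for lags 0..nlags (requires a non-constant series)."""
    n = len(series)
    mean = sum(series) / n
    dev = [x - mean for x in series]
    denom = sum(v * v for v in dev)
    if denom == 0:
        raise ValueError("autocorrelation is undefined for a constant series")
    return [
        sum(dev[t] * dev[t + k] for t in range(n - k)) / denom
        for k in range(nlags + 1)
    ]


def significant_lags(series: Sequence[float], nlags: int = 20) -> List[int]:
    """Lags >= 1 whose autocorrelation lies outside +/- 1.96 / sqrt(n)."""
    bound = 1.96 / math.sqrt(len(series))
    acf_values = sample_acf(series, nlags)
    # Use the absolute value so large negative autocorrelations also count.
    return [k for k in range(1, nlags + 1) if abs(acf_values[k]) > bound]
```

Comparing the absolute value against the bound matters because a strongly negative autocorrelation is just as significant as a positive one.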

Error - when using ARMA order selection

p_q_values2=p_q_values_estimator2(data_diff) 
line 56, in p_q_values_estimator2 
res = sm.tsa.arma_order_select_ic(timeseries, ic=['aic'], max_ar=5, max_ma=4,trend='nc') 
File "C:\Python27\lib\site-packages\statsmodels\tsa\stattools.py", line 1052, in arma_order_select_ic min_res.update({i + '_min_order' : (mins[0][0], mins[1][0])}) 
IndexError: index 0 is out of bounds for axis 0 with size 0 

Error - when using the p, q order computed from the ACF/PACF functions

fit2 = sm.tsa.ARIMA(np.array(data), (p_value, d_value, q_value), exog = exogx).fit() 
File "C:\Python27\lib\site-packages\statsmodels\tsa\arima_model.py", line 1104, in fit 
callback, **kwargs) 
File "C:\Python27\lib\site-packages\statsmodels\tsa\arima_model.py", line 942, in fit 
armafit.mle_retvals = mlefit.mle_retvals 
AttributeError: 'LikelihoodModelResults' object has no attribute 'mle_retvals' 

Have you seen this: [auto.arima() equivalent for python](http://stackoverflow.com/questions/22770352/auto-arima-equivalent-for-python) –

Yes, but even that approach leads to the same error. AttributeError: 'LikelihoodModelResults' object has no attribute 'mle_retvals'. – user245204

Answer

`vals` is my own thing, but you can create your own index with pd.date_range

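# Assumed setup (not shown in the original answer): load R's forecast package via rpy2
import pandas as pd
import rpy2.robjects as ro
from rpy2.robjects import numpy2ri
from rpy2.robjects.packages import importr
numpy2ri.activate()  # lets NumPy arrays be passed to R functions
forecast = importr('forecast')
ts = ro.r('ts')
# traindf and vals are the answerer's own objects and are not defined here
rdata = ts(traindf.requests_per_active.values, frequency=12)
# forecasts
fit = forecast.auto_arima(rdata)
forecast_output = forecast.forecast(fit, h=6, level=(95.0))
# convert forecasts to dataframe
forecast_results = pd.Series(forecast_output[3], index=vals.index)
lowerpi = pd.Series(forecast_output[4], index=vals.index)
upperpi = pd.Series(forecast_output[5], index=vals.index)
results = pd.DataFrame({'forecast': forecast_results, 'lowerpi': lowerpi, 'upperpi': upperpi})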

Is the forecast module also your own module? Or did you download it from a repository? – user6608138

My bad - use rpy2 and importr to import the "forecast" package from Python – thedon

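Thanks for the reply... but how can I use the output of auto_arima with the stats packages in Python... and did you observe any improvement in RMSE when using the same R APIs from Python... – user6608138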