2012-09-04 28 views
9

我也跟着在另一篇文章定义自相关函数的建议:有没有标准化输出的任何numpy autocorrellation函数?

def autocorr(x): 
    result = np.correlate(x, x, mode = 'full') 
    maxcorr = np.argmax(result) 
    #print 'maximum = ', result[maxcorr] 
    result = result/result[maxcorr]  # <=== normalization 

    return result[result.size/2:] 

,但最高值是不是“1.0”。因此我介绍了标记为“< ===归一化”的行

我试着用“时间序列分析”(Box-Jenkins)第2章的数据集来测试函数。 2.7在那本书中。但是我有以下几点:

enter image description here

谁有自相关的这个奇怪的不是预期的行为的解释?

加(2012年9月7日):

我成Python - 编程和做了以下内容:

from ClimateUtilities import * 
import phys 

# 
# the above imports are from R.T.Pierrehumbert's book "principles of planetary 
# climate" 
# and the homepage of that book at "cambridge University press" ... they mostly 
# define the 
# class "Curve()" used in the below section which is not necessary in order to solve 
# my 
# numpy-problem ... :) 
# 
import numpy as np; 
import scipy.spatial.distance; 

# functions to be defined ... : 
# 
# 
def autocorr(x): 
    result = np.correlate(x, x, mode = 'full') 
    maxcorr = np.argmax(result) 
    # print 'maximum = ', result[maxcorr] 
    result = result/result[maxcorr] 
    # 
    return result[result.size/2:] 

## 
# second try ... "Box and Jenkins" chapter 2.1 Autocorrelation Properties 
#            of stationary models 
## 
# from table 2.1 I get: 

s1 = np.array([47,64,23,71,38,64,55,41,59,48,71,35,57,40,58,44,\ 
       80,55,37,74,51,57,50,60,45,57,50,45,25,59,50,71,56,74,50,58,45,\ 
       54,36,54,48,55,45,57,50,62,44,64,43,52,38,59,\ 
       55,41,53,49,34,35,54,45,68,38,50,\ 
       60,39,59,40,57,54,23],dtype=float); 

# alternatively in order to test: 
s2 = np.array([47,64,23,71,38,64,55,41,59,48,71]) 

##################################################################################3 
# according to BJ, ch.2 
###################################################################################3 
print '*************************************************' 
global s1short, meanshort, stdShort, s1dev, s1shX, s1shXk 

s1short = s1 
#s1short = s2 # for testing take s2 

meanshort = s1short.mean() 
stdShort = s1short.std() 

s1dev = s1short - meanshort 
#print 's1short = \n', s1short, '\nmeanshort = ', meanshort, '\ns1deviation = \n',\ 
# s1dev, \ 
#  '\nstdShort = ', stdShort 

s1sh_len = s1short.size 
s1shX = np.arange(1,s1sh_len + 1) 
#print 'Len = ', s1sh_len, '\nx-value = ', s1shX 

########################################################## 
# c0 to be computed ... 
########################################################## 

sumY = 0 
kk = 1 
for ii in s1shX: 
    #print 'ii-1 = ',ii-1, 
    if ii > s1sh_len: 
     break 
    sumY += s1dev[ii-1]*s1dev[ii-1] 
    #print 'sumY = ',sumY, 's1dev**2 = ', s1dev[ii-1]*s1dev[ii-1] 

c0 = sumY/s1sh_len 
print 'c0 = ', c0 
########################################################## 
# now compute autocorrelation 
########################################################## 

auCorr = [] 
s1shXk = s1shX 
lenS1 = s1sh_len 
nn = 1 # factor by which lenS1 should be divided in order 
     # to reduce computation length ... 1, 2, 3, 4 
     # should not exceed 4 

#print 's1shX = ',s1shX 

for kk in s1shXk: 
    sumY = 0 
    for ii in s1shX: 
     #print 'ii-1 = ',ii-1, ' kk = ', kk, 'kk+ii-1 = ', kk+ii-1 
     if ii >= s1sh_len or ii + kk - 1>=s1sh_len/nn: 
      break 
     sumY += s1dev[ii-1]*s1dev[ii+kk-1] 
     #print sumY, s1dev[ii-1], '*', s1dev[ii+kk-1] 

    auCorrElement = sumY/s1sh_len 
    auCorrElement = auCorrElement/c0 
    #print 'sum = ', sumY, ' element = ', auCorrElement 
    auCorr.append(auCorrElement) 
    #print '', auCorr 
    # 
    #manipulate s1shX 
    # 
    s1shX = s1shXk[:lenS1-kk] 
    #print 's1shX = ',s1shX 

#print 'AutoCorr = \n', auCorr 
######################################################### 
# 
# first 15 of above Values are consistent with 
# Box-Jenkins "Time Series Analysis", p.34 Table 2.2 
# 
######################################################### 
s1sh_sdt = s1dev.std() # Standardabweichung short 
#print '\ns1sh_std = ', s1sh_sdt 
print '#########################################' 

# "Curve()" is a class from RTP ClimateUtilities.py 
c2 = Curve() 
s1shXfloat = np.ndarray(shape=(1,lenS1),dtype=float) 
s1shXfloat = s1shXk # to make floating point from integer 
        # might be not necessary 

#print 'test plotting ... ', s1shXk, s1shXfloat 
c2.addCurve(s1shXfloat) 
c2.addCurve(auCorr, '', 'Autocorr') 
c2.PlotTitle = 'Autokorrelation' 

w2 = plot(c2) 


########################################################## 
# 
# now try function "autocorr(arr)" and plot it 
# 
########################################################## 

auCorr = autocorr(s1short) 

c3 = Curve() 
c3.addCurve(s1shXfloat) 
c3.addCurve(auCorr, '', 'Autocorr') 
c3.PlotTitle = 'Autocorr with "autocorr"' 

w3 = plot(c3) 

# 
# well that should it be! 
# 
+2

找不到链接的图表:错误404 – halex

+0

该链接仍然无效。该图片位于不同的目录中,其中一个名称类似于“图片..有选择地”,但我不想编辑以包含链接,以防其他文件不用于公开发布。 – DSM

+0

感谢:链接,你可以找到图片是在www.ibk-consult.de/knowhow/ClimateChange/pictures有选择地公布/自相关* .png ...都似乎是错误的... 第二个(autocorrelation_1.png很奇怪... – kampmannpeine

回答

4

我不知道是什么问题。

矢量x的自相关在滞后0处必须为1,因为这只是L2范数的平方除以自身,即dot(x, x)/dot(x, x) == 1

一般情况下,对于任何滞后i, j in Z, where i != j单位缩放自相关dot(shift(x, i), shift(x, j))/dot(x, x)其中shift(y, n)是由n时间点Z转移载体y功能是一组,因为我们谈论的是执行整数(理论上这些滞后可以在实数集合中)。

我得到1.0作为用下面的代码(启动命令行作为$ ipython --pylab上)的最大值,如预期:

In[1]: n = 1000 
In[2]: x = randn(n) 
In[3]: xc = correlate(x, x, mode='full') 
In[4]: xc /= xc[xc.argmax()] 
In[5]: xchalf = xc[xc.size/2:] 
In[6]: xchalf_max = xchalf.max() 
In[7]: print xchalf_max 
Out[1]: 1.0 

当滞后0的自相关是不等于1的唯一时间是当x是零信号(全零)。

回答你的问题是没有,没有NumPy的功能,可自动为您执行标准化。

此外,即使这样做,您仍然需要根据您的预期输出进行检查,并且如果您能够说“是的,这会正确执行标准化”,那么我会假设您知道如何实施它你自己。

我会建议,它可能是你的算法错误地实现了他们的算法,虽然我不能确定,因为我不熟悉它。

5

所以你最初尝试的问题是你没有从你的信号中减去平均值。下面的代码应该工作:

timeseries = (your data here) 
mean = np.mean(timeseries) 
timeseries -= np.mean(timeseries) 
autocorr_f = np.correlate(timeseries, timeseries, mode='full') 
temp = autocorr_f[autocorr_f.size/2:]/autocorr_f[autocorr_f.size/2] 
iact.append(sum(autocorr_f[autocorr_f.size/2:]/autocorr_f[autocorr_f.size/2])) 

在我的例子temp是你感兴趣的变量;它是前向综合自相关函数。如果您想要对iact感兴趣的集成自相关时间。

+0

只是为了阐明这对未来做了什么读者:'autocorr_f [autocorr_f.size/2]'是方差,所以'temp'保存了自相关项的归一化值。 – amcnabb

+1

另外,这应该是'时间序列 - = np.mean(时间序列)'。当前版本'timeseries = np.array([x - 时间序列中的x的平均值])'效率低下且不太清晰。 – amcnabb

+0

你在哪里定义iact? –

相关问题