2012-07-18 150 views
7

我试图实现Chebyshev滤波器来平滑时间序列,但不幸的是,在数据序列中有NAs。R滤波器()处理NA

例如,

t <- seq(0, 1, len = 100)      
x <- c(sin(2*pi*t*2.3) + 0.25*rnorm(length(t)),NA, cos(2*pi*t*2.3) + 0.25*rnorm(length(t))) 

我用的切比雪夫滤波器:cf1 = cheby1(5, 3, 1/44, type = "low")

我试图过滤器系列排除NAS上的时间,但不会弄乱订单/位置。所以,我已经试过na.rm=T,但似乎没有这样的说法。 Then

z <- filter(cf1, x) # apply filter 

谢谢你们。

回答

1

您可以使用compelete.cases函数预先删除NAs。你也可以考虑输入缺失的数据。查看mtsdi或Amelia II软件包。

编辑:

下面是与RCPP的解决方案。这可能是有帮助的是速度是非常重要的:

require(inline) 
require(Rcpp) 
t <- seq(0, 1, len = 100) 
set.seed(7337) 
x <- c(sin(2*pi*t*2.3) + 0.25*rnorm(length(t)),NA, cos(2*pi*t*2.3) + 0.25*rnorm(length(t))) 
NAs <- x 
x2 <- x[!is.na(x)] 
#do something to x2 
src <- ' 
Rcpp::NumericVector vecX(vx); 
Rcpp::NumericVector vecNA(vNA); 
int j = 0; //counter for vx 
for (int i=0;i<vecNA.size();i++) { 
    if (!(R_IsNA(vecNA[i]))) { 
    //replace and update j 
    vecNA[i] = vecX[j]; 
    j++; 
    } 
} 
return Rcpp::wrap(vecNA); 
' 
fun <- cxxfunction(signature(vx="numeric", 
          vNA="numeric"), 
        src,plugin="Rcpp") 
if (identical(x,fun(x2,NAs))) 
    print("worked") 
# [1] "worked" 
+0

我只是想知道complete.case和na.omit是否一样。另外,由于我使用的是观测到的SST时间序列,我不确定输入缺失值是否是一个好主意。 – 2012-07-18 13:15:40

+0

希望这个更新解决了这个问题。 – chandler 2012-07-19 11:37:15

2

尝试使用x <- x[!is.na(x)]删除NA,然后运行过滤器。

+0

对不起,但如果我使用na.omit,它将大量增加订单,我只想在过滤器剩余NA之后的NA值,但所有其他非NA值都可以通过。 – 2012-07-18 12:52:07

+0

对不起,我不确定你在问什么。你想保持你的价值观在同一个位置,并且在你的数据中有空格吗? – 2012-07-18 12:54:50

+0

将tseries包中的na.remove()或na.remove.ts()做你想要的吗? – 2012-07-18 13:01:15

1

我不知道,如果ts对象可以有缺失值,但如果你只是想重新插入NA值,你可以使用?insertR.utils。可能有更好的方法来做到这一点。

install.packages(c('R.utils', 'signal')) 
require(R.utils) 
require(signal) 
t <- seq(0, 1, len = 100)      
set.seed(7337) 
x <- c(sin(2*pi*t*2.3) + 0.25*rnorm(length(t)), NA, NA, cos(2*pi*t*2.3) + 0.25*rnorm(length(t)), NA) 
cf1 = cheby1(5, 3, 1/44, type = "low") 
xex <- na.omit(x) 
z <- filter(cf1, xex) # apply 
z <- as.numeric(z) 
for (m in attributes(xex)$na.action) { 
    z <- insert(z, ats = m, values = NA) 
} 
all.equal(is.na(z), is.na(x)) 
?insert 
0

这里是一个函数,你可以用它来过滤一个信号,其中含有NAs。 NA被忽略而不是被零代替。

然后,您可以指定NA在滤波后信号的任意点可能采取的最大权重百分比。如果在特定点有太多的NAs(并且实际数据太少),则滤波后的信号本身将被设置为NA。

# This function applies a filter to a time series with potentially missing data 

filter_with_NA <- function(x, 
          window_length=12,       # will be applied centrally 
          myfilter=rep(1/window_length,window_length), # a boxcar filter by default 
          max_percentage_NA=25)      # which percentage of weight created by NA should not be exceeded 
{ 
    # make the signal longer at both sides 
    signal <- c(rep(NA,window_length),x,rep(NA,window_length)) 
    # see where data are present and not NA 
    present <- is.finite(signal) 

    # replace the NA values by zero 
    signal[!is.finite(signal)] <- 0 
    # apply the filter 
    filtered_signal <- as.numeric(filter(signal,myfilter, sides=2)) 

    # find out which percentage of the filtered signal was created by non-NA values 
    # this is easy because the filter is linear 
    original_weight <- as.numeric(filter(present,myfilter, sides=2)) 
    # where this is lower than one, the signal is now artificially smaller 
    # because we added zeros - compensate that 
    filtered_signal <- filtered_signal/original_weight 
    # but where there are too few values present, discard the signal 
    filtered_signal[100*(1-original_weight) > max_percentage_NA] <- NA 

    # cut away the padding to left and right which we previously inserted 
    filtered_signal <- filtered_signal[((window_length+1):(window_length+length(x)))] 
    return(filtered_signal) 
}