2016-12-13

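If statement error / if statement not being applied

I am trying to write a piece of code that works like rollapply in zoo/xts, but tailored to my needs. I built the code using some very simple sample data, and everything works. But now that I am trying to run it on the edhec data, I get an error. It is not clear to me why, but I think it has to do with the if statement. Can anyone diagnose why I am getting the error?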

#rm(list=ls()) #Clear environment 
cat("\014") #CTRL + L 

library(xts) 
library(lubridate) 

is.even <- function(x) x %% 2 == 0 

roundUp <- function(x,to=2) 
{ 
    to*(x%/%to + as.logical(x%%to)) 
} 

functionTest <- function(data, window, slide){ 

    nyears_t = nyears(data) 

    #IF statement for non-even numbers only 
    if(is.even(nyears_t == FALSE)) { 
    nyears_t <- roundUp(nyears_t) 
    data_extend <- data 

    start_extend <- .indexyear(data)[length(data)]+ 1900 + 1 
    end_extend <- start_extend + length(data) - 1 
    index(data_extend) <- update(index(data),year=start_extend:end_extend) 

    data <- rbind(data, data_extend) 

    warning("WARNING! The function has looped to the start of the timeseries. The final list(s) 
      will contain years that do not exist in the dataset. Please modify.") 
    } 

    nslides = nyears_t/slide 

    #Matrix 
    year_1 = (.indexyear(data)[1]+1900) 

    start <- seq(from = year_1, by = slide, length.out = nslides) 
    end <- start + window - 1 

    mat <- matrix(c(start, end), ncol = 2, dimnames = list(c(1:nslides), c("start", "end"))) 

    #For loop 
    subsetlist <- vector('list') 

    for(i in 1:nslides){ 
    subset <- data[paste0(mat[i,1], "/", mat[i,2])] 
    subsetlist[[i]] <- subset 
    } 
    print(subsetlist) 
} 
This is the sample code I ran through the function above:

a <- seq(from = as.POSIXct("2000", format = "%Y"), to = as.POSIXct("2008", format = "%Y"), by = "year") 
a <- as.xts(1:length(a), order.by = a) 
a 

functionTest(data = a, window = 3, slide = 2) 

The example I tested on, which produces an error:

> data(edhec, package = "PerformanceAnalytics") 
> edhec <- edhec[,1:3] 
> edhec <- edhec["/2007"] 
> head(edhec) 
           Convertible Arbitrage CTA Global Distressed Securities
1997-01-31                0.0119     0.0393                0.0178
1997-02-28                0.0123     0.0298                0.0122
1997-03-31                0.0078    -0.0021               -0.0012
1997-04-30                0.0086    -0.0170                0.0030
1997-05-31                0.0156    -0.0015                0.0233
1997-06-30                0.0212     0.0085                0.0217
> functionTest(data = edhec, window = 3, slide = 2) 
Error in start_extend:end_extend : NA/NaN argument 
> 

UPDATE:

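The code now runs after the following update to the if statement (thanks to Joshua Ulrich; see the code below). However, there is still a problem with the if statement: it runs regardless of whether the dataset contains an even or odd number of years. This does not affect the accuracy of the function, but it could become a problem with large datasets. If anyone has any thoughts on this, I would appreciate it. Otherwise, this has been super! Cheers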

if(is.even(nyears_t == FALSE)) { 
    nyears_t <- roundUp(nyears_t) 
    data_extend <- data 

    start_extend <- .indexyear(data)[nrow(data)] + 1900 + 1 
    end_extend <- start_extend + nyears(data) - 1 

    dates <- index(data) 
    tmp <- as.POSIXlt(dates) 
    tmp$year <- tmp$year + nyears(data) 
    dates2 <- as.POSIXct(tmp, tz = tz) 
    index(data_extend) <- dates2 

    data <- rbind(data, data_extend) 

    warning("WARNING! The function has looped to the start of the timeseries. The final list(s) 
      will contain years that do not exist in the dataset. Please modify.") 
    } 
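
The year shift in the update above can be tried on its own. This is a minimal base-R sketch, assuming two made-up GMT dates and a shift of 11 years (the value of `nyears` for the 1997 to 2007 subset); it is only meant to show that adding to the `year` field of a `POSIXlt` object moves every date forward by whole years.

```r
# Two illustrative dates (assumed values, not taken from edhec)
dates <- as.POSIXct(c("2007-01-31", "2007-12-31"), tz = "GMT")

# POSIXlt exposes the date components; $year counts years since 1900
tmp <- as.POSIXlt(dates, tz = "GMT")
tmp$year <- tmp$year + 11

# Convert back to POSIXct so the result can be used as an xts index
shifted <- as.POSIXct(tmp, tz = "GMT")
shifted_labels <- format(shifted, "%Y-%m-%d")
print(shifted_labels)
```

One thing to keep in mind with this approach: a 29 February shifted into a non-leap year no longer exists as a date, so it is worth checking the shifted index for such cases.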

Answers

2

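Calling length on a matrix (which is what the coredata of an xts/zoo object is) gives you the total number of elements (i.e. the length of the underlying vector). You should use nrow instead: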

start_extend <- .indexyear(data)[nrow(data)] + 1900 + 1 
end_extend <- start_extend + nrow(data) - 1 
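
To see how this leads to the `NA/NaN argument` error, here is a small base-R sketch. The vector `years` is a hypothetical stand-in for `.indexyear(data)` (one entry per row), and `n_elements` stands in for what `length()` reports on a 3-row, 3-column object.

```r
years <- c(97, 98, 99)   # stand-in for .indexyear(data): one value per row
n_elements <- 9          # length() of a 3 x 3 object: rows * columns

# Indexing past the end of a vector silently returns NA
start_extend <- years[n_elements] + 1900 + 1

# The colon operator then fails because its left operand is NA
result <- tryCatch(
    start_extend:(start_extend + 2),
    error = function(e) conditionMessage(e)
)
print(start_extend)
print(result)
```

With the sample data in the question, the object has a single column, so `length(data)` and `nrow(data)` happen to agree, which is why the problem only showed up on the three-column edhec data.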

If you do not know whether data will be a matrix or a vector, you should use NROW instead of nrow. Calling nrow on a vector returns NULL; NROW returns length(x) if x is a vector.
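
A quick base-R comparison of the three functions, using a small made-up matrix and vector:

```r
m <- matrix(1:6, nrow = 3, ncol = 2)   # 3 rows, 2 columns
v <- 1:3                               # plain vector, no dim attribute

len_m  <- length(m)   # total number of elements
nrow_m <- nrow(m)     # number of rows
nrow_v <- nrow(v)     # NULL, because a vector has no dimensions
NROW_v <- NROW(v)     # treats the vector as a one-column matrix

print(c(length = len_m, nrow = nrow_m, NROW = NROW_v))
print(is.null(nrow_v))
```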

+0

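Thanks @Joshua, I found that error while combing through the function. I also noticed that I needed some further modifications to get the code to run correctly (I have added them to the question). The code now seems to run as expected. However, there is still one small snag. It looks like the if statement now runs every time: I changed edhec to have odd and even numbers of years, and the if statement is always applied. This suggests there is still a fundamental error in the if statement. – Visser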

+0

@Visser: Regarding your update, I think you need `if(!is.even(nyears_t))`. –
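
A small base-R check of why the condition in the question always fires, assuming the `is.even` helper from the question. With the parenthesis misplaced, `nyears_t == FALSE` is evaluated first; for any non-zero number of years that is `FALSE`, which is coerced to 0 inside `is.even`, and 0 is even.

```r
is.even <- function(x) x %% 2 == 0

for (nyears_t in c(11, 12)) {
    misplaced <- is.even(nyears_t == FALSE)  # is.even(FALSE), i.e. 0 %% 2 == 0
    intended  <- !is.even(nyears_t)          # TRUE only for an odd number of years
    cat(nyears_t, "years: misplaced =", misplaced, ", intended =", intended, "\n")
}

# Keep the individual results for inspection
misplaced_odd  <- is.even(11 == FALSE)
misplaced_even <- is.even(12 == FALSE)
intended_odd   <- !is.even(11)
intended_even  <- !is.even(12)
```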

0

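I have worked out a complete answer that has the desired effect. Thanks to @Joshua for the help; I do not think I could have fixed the code without it. To run it on larger data, I had to make some additional changes.

For interest's sake, here is my full working code (minus my other custom functions):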

bootOffset <- function(data, window, slide, tz = "GMT"){

    nyears_t = nyears(data)

    #IF statement for non-even numbers only
    if(is.even(nyears_t) == FALSE) {
        nyears_t <- roundUp(nyears_t)
        data_extend <- data

        start_extend <- .indexyear(data)[nrow(data)] + 1900 + 1
        end_extend <- start_extend + nyears(data) - 1

        # Shift a copy of the index forward by the original number of years
        dates <- index(data)
        tmp <- as.POSIXlt(dates)
        tmp$year <- tmp$year + nyears(data)
        dates2 <- as.POSIXct(tmp, tz = tz)

        index(data_extend) <- dates2
        data <- rbind(data, data_extend)
    }

    nslides = nyears_t/slide

    year_1 = (.indexyear(data)[1] + 1900)

    #Matrix
    start <- seq(from = year_1, by = slide, length.out = nslides)
    end <- start + window - 1
    mat <- matrix(c(start, end), ncol = 2, dimnames = list(c(1:nslides), c("start", "end")))

    #For loop
    subsetlist <- vector('list')

    for(i in 1:nslides){
        subset <- window(data,
            start = as.POSIXct(paste0(mat[i,1], "-01-01")),
            end = as.POSIXct(paste0(mat[i,2], "-12-31")))

        subsetlist[[i]] <- subset
    }
    print(subsetlist)
}
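
The start/end matrix that drives the subsetting can be inspected without any xts data. This sketch copies the matrix step from the function and plugs in assumed values: a first year of 1997, 12 years in total, `window = 4` and `slide = 3`.

```r
year_1   <- 1997   # assumed first year of the series
nyears_t <- 12     # assumed (even) number of years
window   <- 4
slide    <- 3

nslides <- nyears_t / slide

# Each row is one subset: it starts `slide` years after the previous one
# and spans `window` calendar years
start <- seq(from = year_1, by = slide, length.out = nslides)
end   <- start + window - 1
mat   <- matrix(c(start, end), ncol = 2,
                dimnames = list(seq_len(nslides), c("start", "end")))
print(mat)
```

Because `window` is larger than `slide` here, consecutive subsets overlap by one year, and the last row reaches past the final year of the series.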

And to confirm, the results come out as desired:

data(edhec, package = "PerformanceAnalytics") 
edhec <- edhec[,1:3] 
edhec08 <- edhec["/2008"] 
edhec07 <- edhec["/2007"] 

boot08 <- bootOffset(data = edhec08, #EVEN 
        window = 4, 
        slide = 3) 

boot07 <- bootOffset(data = edhec07, #ODD 
        window = 4, 
        slide = 3) 
> bootOffset.Check <- function(boot){ 
+ dates <- lapply(boot, year) 
+ dates <- lapply(dates, unique) 
+ dates <- lapply(dates, `length<-`, max(lengths(dates))) 
+ as.data.frame(dates, 
+ col.names = paste0("boot_", 1:length(boot))) 
+ 
+ } 
> 
> nyears(edhec08) 
[1] 12 
> bootOffset.Check(boot08) #EVEN number of years 
  boot_1 boot_2 boot_3 boot_4
1   1997   2000   2003   2006
2   1998   2001   2004   2007
3   1999   2002   2005   2008
4   2000   2003   2006     NA
> 
> nyears(edhec07) 
[1] 11 
> bootOffset.Check(boot07) #ODD number of years 
  boot_1 boot_2 boot_3 boot_4
1   1997   2000   2003   2006
2   1998   2001   2004   2007
3   1999   2002   2005   2008
4   2000   2003   2006   2009
>