2014-03-02 138 views

Time periods and IDs grouped by calendar time intervals

I have a dataset of event episodes grouped by country. It looks like this:

Data <- data.frame(EpStart=c("2010-01-01 00:00:00", "2009-01-01 00:00:00", "2009-01-01 00:00:00", "2006-01-01 00:00:00"), EpEnd=c("2011-01-01 00:03:00", "2013-01-01 00:00:00", "2012-01-01 00:00:00", "2011-01-01 00:00:00"), countryID=c("US","US", "CAN","CAN"))

I want to split the data into calendar-year intervals, grouped by countryID. The result should be a data frame that looks like this:

CountryID Year Ongoing 
1   US 2009  1 
2   US 2010  2 
3   US 2011  1 
4   US 2012  1 
5  CAN 2006  1 
6  CAN 2007  1 
7  CAN 2008  1 
8  CAN 2009  2 
9  CAN 2010  2 
10  CAN 2011  1 

I tried working through the example provided by @ here, but I could not find a way to keep CountryID when splitting the data.

tmp <- do.call(c, apply(Data, 1, 
         function(x) head(seq(from = as.POSIXct(x[1]), 
              to = as.POSIXct(x[2]),by = "years"), 
             -1))) 

tmp <- sapply(split(tmp, format(tmp, format = "%Y")), length) 

Ongoing <- data.frame(Date=names(tmp), Ongoing = tmp, row.names=NULL) 

This returns the following, which counts the years correctly but does not split the data by CountryID:

> Ongoing 
    Date Ongoing 
1 2006  1 
2 2007  1 
3 2008  1 
4 2009  3 
5 2010  4 
6 2011  2 
7 2012  1 

Answer


I think something like this seems to work:

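# Extract the calendar year of each episode's start and end
Data$Start = as.numeric(format(as.Date(Data$EpStart, "%Y-%m-%d"), "%Y"))
Data$End = as.numeric(format(as.Date(Data$EpEnd, "%Y-%m-%d"), "%Y"))
# Per country: expand each episode to the years it covers (end year excluded) and tabulate them
res = do.call(rbind,
              lapply(split(Data, Data$countryID),
                     function(x)
                       as.data.frame(table(unlist(mapply(`:`, x$Start, x$End - 1))))))
# Row names of res look like "CAN.1"; the part before the dot is the country
data.frame(CountryID = unlist(lapply(strsplit(row.names(res), ".", fixed = TRUE), `[`, 1)),
           Year = res$Var1,
           Ongoing = res$Freq, stringsAsFactors = FALSE)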
# CountryID Year Ongoing 
#1  CAN 2006  1 
#2  CAN 2007  1 
#3  CAN 2008  1 
#4  CAN 2009  2 
#5  CAN 2010  2 
#6  CAN 2011  1 
#7   US 2009  1 
#8   US 2010  2 
#9   US 2011  1 
#10  US 2012  1
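
For comparison, here is a minimal base R sketch of the same idea that avoids parsing the row names of the combined result. It is only one possible variant, not the approach above: the names start_year, end_year, n_years, long and counts are made up for this illustration, and it assumes every timestamp begins with a four-digit year and that the end year is excluded, as in the desired output.

```r
# Same example data as in the question
Data <- data.frame(
  EpStart = c("2010-01-01 00:00:00", "2009-01-01 00:00:00",
              "2009-01-01 00:00:00", "2006-01-01 00:00:00"),
  EpEnd = c("2011-01-01 00:03:00", "2013-01-01 00:00:00",
            "2012-01-01 00:00:00", "2011-01-01 00:00:00"),
  countryID = c("US", "US", "CAN", "CAN"),
  stringsAsFactors = FALSE
)

# Calendar year of each episode boundary (first four characters of the timestamp)
start_year <- as.integer(substr(Data$EpStart, 1, 4))
end_year <- as.integer(substr(Data$EpEnd, 1, 4))

# Number of calendar years each episode covers, with the end year excluded
n_years <- pmax(end_year - start_year, 0L)

# One row per (country, year) covered by an episode
long <- data.frame(
  CountryID = rep(Data$countryID, times = n_years),
  Year = unlist(Map(function(s, n) s + seq_len(n) - 1L, start_year, n_years)),
  stringsAsFactors = FALSE
)

# Count the episodes ongoing in each country-year
counts <- aggregate(Ongoing ~ CountryID + Year,
                    data = cbind(long, Ongoing = 1L), FUN = sum)
counts <- counts[order(counts$CountryID, counts$Year), ]
rownames(counts) <- NULL
print(counts)
```

Because the country is carried along as an ordinary column, Year stays numeric rather than becoming a factor, and an episode that starts and ends in the same calendar year simply contributes no rows.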