2014-11-25 32 views

How do I find the last date on which the value in another column increased?

I have a data frame in R that looks like this:

person date   level 
Alex 2007-06-01 3 
Alex 2008-12-01 4 
Alex 2009-12-01 3 
Beth 2008-03-01 6 
Beth 2010-10-01 6 
Beth 2010-12-01 6 
Mary 2009-11-04 9 
Mary 2012-04-25 9 
Mary 2013-09-10 10 

It is sorted first by "person" and then by "date".

I am trying to find the last time each person's "level" increased. Ideally, the output would look like this:

person date 
Alex 2008-12-01 
Beth NA 
Mary 2013-09-10 
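
For anyone who wants to run the answers, the sample data can be recreated roughly as follows. This is a sketch: the column types are an assumption (dates stored as Date, levels as numeric), since the question only shows the printed table.

```r
# Recreate the sample data from the question.
# Assumption: `date` is a Date column and `level` is numeric.
dat <- data.frame(
  person = rep(c("Alex", "Beth", "Mary"), each = 3),
  date = as.Date(c("2007-06-01", "2008-12-01", "2009-12-01",
                   "2008-03-01", "2010-10-01", "2010-12-01",
                   "2009-11-04", "2012-04-25", "2013-09-10")),
  level = c(3, 4, 3, 6, 6, 6, 9, 9, 10),
  stringsAsFactors = FALSE
)

# Sort by person, then by date within each person.
dat <- dat[order(dat$person, dat$date), ]
```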

Answers


Using dplyr:

library(dplyr) 

dat %>% group_by(person) %>% 
    mutate(inc = c(FALSE, diff(level) > 0)) %>% 
    summarize(date = last(date[inc], default = NA)) 

Output:

Source: local data frame [3 x 2] 

    person  date 
1 Alex 2008-12-01 
2 Beth  <NA> 
3 Mary 2013-09-10 
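
To see why the diff() step works, here is a small base R sketch of the same logic applied to Alex's rows only. The variable names inc and last_inc are illustrative, not part of the answer.

```r
# Alex's rows from the question, already in date order.
level <- c(3, 4, 3)
date  <- as.Date(c("2007-06-01", "2008-12-01", "2009-12-01"))

# diff() compares each row with the one before it, so it is one element
# shorter than `level`; the leading FALSE stands for the first row, which
# has no predecessor and so cannot be an increase.
inc <- c(FALSE, diff(level) > 0)
inc        # FALSE TRUE FALSE

# Take the last flagged date, or NA when the level never increased.
last_inc <- if (any(inc)) tail(date[inc], 1) else as.Date(NA)
last_inc   # "2008-12-01"
```

For Beth, whose level stays at 6, every element of inc is FALSE, which is why an explicit NA fallback is needed.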

A data.table version:

library(data.table) 
setDT(dat)[order(person),diff:=c(NA,diff(level)),by=person][diff>0,tail(.SD,1),by=person][,-c(3,4),with=F] 
    person  date 
1: Alex 2008-12-01 
2: Mary 2013-09-10 

If the people with NA also need to be included:

dd=setDT(dat)[order(person),diff:=c(NA,diff(level)),by=person][diff>0,tail(.SD,1),by=person][,-c(3,4),with=F] 
dd2 = data.frame(unique(dat[!(person %in% dd$person)]$person), NA) 
names(dd2) = c('person','date') 
rbind(dd, dd2) 
    person  date 
1: Alex 2008-12-01 
2: Mary 2013-09-10 
3: Beth   NA 

A base R version, using a data frame df (with columns Person, Date and Level, and Person stored as a factor):

sapply(levels(df$Person), function(p) { 
    s <- df[df$Person==p,] 
    i <- 1+nrow(s)-match(TRUE,rev(diff(s$Level)>0)) 
    ifelse(is.na(i), NA, as.character(s$Date[i])) 
}) 

This produces a named vector:

 Alex   Beth   Mary 
"2008-12-01"   NA "2013-09-10" 

It is easy to wrap this to produce whatever output format is required:

last.level.up <- function(df) { 
    data.frame(Date=sapply(levels(df$Person), function(p) { 
     s <- df[df$Person==p,] 
     i <- 1+nrow(s)-match(TRUE,rev(diff(s$Level)>0)) 
     ifelse(is.na(i), NA, as.character(s$Date[i])) 
    })) 
} 

last.level.up(df) 

      Date 
Alex 2008-12-01 
Beth  <NA> 
Mary 2013-09-10 
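
If Person is a character column rather than a factor, levels() returns NULL and the sapply() call above has nothing to iterate over. A possible variant, sketched here with split() and the lowercase column names from the question, avoids that dependency. The function name last_increase is hypothetical, and the data are assumed to be sorted by date within each person.

```r
# Sample data from the question (column types are an assumption).
dat <- data.frame(
  person = rep(c("Alex", "Beth", "Mary"), each = 3),
  date = as.Date(c("2007-06-01", "2008-12-01", "2009-12-01",
                   "2008-03-01", "2010-10-01", "2010-12-01",
                   "2009-11-04", "2012-04-25", "2013-09-10")),
  level = c(3, 4, 3, 6, 6, 6, 9, 9, 10),
  stringsAsFactors = FALSE
)

# Hypothetical helper: last date on which `level` went up, per person.
last_increase <- function(d) {
  # One data frame per person; works for character or factor `person`.
  pieces <- split(d, d$person)
  dates <- lapply(pieces, function(s) {
    # TRUE where the level is higher than on the previous row.
    inc <- c(FALSE, diff(s$level) > 0)
    if (any(inc)) as.character(tail(s$date[inc], 1)) else NA_character_
  })
  data.frame(person = names(dates),
             date = as.Date(unlist(dates, use.names = FALSE)),
             row.names = NULL, stringsAsFactors = FALSE)
}

res <- last_increase(dat)
res
```

This returns a two-column data frame (person, date) in the shape the question asks for, with NA for anyone whose level never increased.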