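Calculating date differences by group in R
I am using logistic exposure to estimate hatching success for bird nests. My data set is fairly large: about 2,000 nests, each with a unique ID ("ClutchID"). I need to calculate the number of days each nest was exposed ("Exposure"), or more simply, the difference between the first and last visit dates. I used the following code: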
HS_Hatch$Exposure=NA
for(i in 2:nrow(HS_Hatch)){HS_Hatch$Exposure[i]=HS_Hatch$DateVisit[i]- HS_Hatch$DateVisit[i-1]}
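where HS_Hatch is my data set and DateVisit is the actual visit date. The only problem is that R also calculates an exposure value for the first date (which doesn't make sense).
What I really need is the difference between the first and last dates for a given clutch. I have also looked at the following: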
Exposure=ddply(HS_Hatch, "ClutchID", summarize,
orderfrequency = as.numeric(diff.Date(DateVisit)))
df %>%
mutate(Exposure = as.Date(HS_Hatch$DateVisit, "%Y-%m-%d")) %>%
group_by(ClutchID) %>%
arrange(Exposure) %>%
mutate(lag=lag(DateVisit), difference=DateVisit-lag)
I'm still learning R, so any help would be greatly appreciated.
Edit: Below is a sample of the data I'm using:
HS_Hatch <- structure(list(ClutchID = c(1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L,
2L, 3L, 3L, 3L, 4L, 4L, 4L, 4L, 4L, 4L, 5L, 5L, 5L, 5L, 5L, 5L
), DateVisit = c("3/15/2012", "3/18/2012", "3/20/2012", "4/1/2012",
"4/3/2012", "3/18/2012", "3/20/2012", "3/22/2012", "4/3/2012",
"4/4/2012", "3/22/2012", "4/3/2012", "4/4/2012", "3/18/2012",
"3/20/2012", "3/22/2012", "4/2/2012", "4/3/2012", "4/4/2012",
"3/20/2012", "3/22/2012", "3/25/2012", "3/27/2012", "4/4/2012",
"4/5/2012"), Year = c(2012L, 2012L, 2012L, 2012L, 2012L, 2012L,
2012L, 2012L, 2012L, 2012L, 2012L, 2012L, 2012L, 2012L, 2012L,
2012L, 2012L, 2012L, 2012L, 2012L, 2012L, 2012L, 2012L, 2012L,
2012L), Survive = c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L)), class = c("tbl_df",
"tbl", "data.frame"), row.names = c(NA, -25L), .Names = c("ClutchID",
"DateVisit", "Year", "Survive"), spec = structure(list(cols = structure(list(
ClutchID = structure(list(), class = c("collector_integer",
"collector")), DateVisit = structure(list(), class = c("collector_character",
"collector")), Year = structure(list(), class = c("collector_integer",
"collector")), Survive = structure(list(), class = c("collector_integer",
"collector"))), .Names = c("ClutchID", "DateVisit", "Year",
"Survive")), default = structure(list(), class = c("collector_guess",
"collector"))), .Names = c("cols", "default"), class = "col_spec"))
Welcome to Stack Overflow! Could you please include data that will provide us with a [reproducible example](http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example)? –
Maybe `summarize(exposure = diff(range(DateVisit)))`? –
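To make the `diff(range(...))` idea concrete, here is a minimal base R sketch for a single nest. It uses the five visit dates listed for clutch 1 in the sample data; the variable names `clutch1_text`, `clutch1_dates` and `exposure` are made up for illustration. It assumes the dates are stored as month/day/year text, as they are in the sample, so they have to be parsed before any date arithmetic.

```r
# Visit dates for clutch 1, copied from the sample data (month/day/year text).
clutch1_text <- c("3/15/2012", "3/18/2012", "3/20/2012", "4/1/2012", "4/3/2012")

# Parse the text into Date objects; the format string must match the layout.
clutch1_dates <- as.Date(clutch1_text, format = "%m/%d/%Y")

# range() returns the earliest and latest date; diff() subtracts them.
exposure <- diff(range(clutch1_dates))

print(exposure)              # a difftime measured in days
print(as.numeric(exposure))  # the same span as a plain number
```

For these dates the span runs from 15 March to 3 April 2012, i.e. 19 days. Because `range()` picks the minimum and maximum itself, the rows do not need to be sorted first.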
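What @BenBolker said; just adding that his `summarise` line should go after your `group_by` line. Depending on the class of `DateVisit`, you can either drop the first `mutate` line, or change the `summarise` line to reference `Exposure` rather than `DateVisit`. – rosscova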
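Putting the two comments together, one possible version of the pipeline is sketched below. It is an illustration under stated assumptions rather than a definitive answer: `visits` is a hypothetical, reduced copy of the sample data (clutches 2 and 3 only), `exposure_by_clutch` is a made-up result name, and the dates are assumed to be month/day/year text as in the posted sample.

```r
library(dplyr)

# Hypothetical reduced copy of the sample data (clutches 2 and 3 only).
visits <- data.frame(
  ClutchID  = c(2L, 2L, 2L, 2L, 2L, 3L, 3L, 3L),
  DateVisit = c("3/18/2012", "3/20/2012", "3/22/2012", "4/3/2012", "4/4/2012",
                "3/22/2012", "4/3/2012", "4/4/2012"),
  stringsAsFactors = FALSE
)

exposure_by_clutch <- visits %>%
  # Convert the text dates to Date objects before doing any arithmetic.
  mutate(DateVisit = as.Date(DateVisit, format = "%m/%d/%Y")) %>%
  # Everything after group_by() is computed separately for each nest.
  group_by(ClutchID) %>%
  # One row per nest: days between its first and last visit.
  summarise(Exposure = as.numeric(diff(range(DateVisit)), units = "days"))

print(exposure_by_clutch)
```

With this subset, clutch 2 spans 18 March to 4 April (17 days) and clutch 3 spans 22 March to 4 April (13 days). If plyr is attached after dplyr, as the `ddply` attempt suggests it might be, its own `summarise` can mask the dplyr one; writing `dplyr::summarise` explicitly is one way to avoid that.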