2012-05-15 73 views

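Calculate age in years and months and melt data

I am working with some time data, and I am having trouble converting a time difference into years and months.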

My data looks more or less like this:

dfn <- data.frame(
    Today = Sys.time(),
    DOB = seq(as.POSIXct('2007-03-27 00:00:01'), length.out = 26, by = "3 day"),
    Patient = factor(1:26, labels = LETTERS))

First, I subtract the date of birth (DOB) from today's date (Today).

dfn$ageToday <- dfn$Today - dfn$DOB 

This gives me the time difference in days:

dfn$ageToday 
Time differences in days 
 [1] 1875.866 1872.866 1869.866 1866.866 1863.866
 [6] 1860.866 1857.866 1854.866 1851.866 1848.866
[11] 1845.866 1842.866 1839.866 1836.866 1833.866
[16] 1830.866 1827.866 1824.866 1821.866 1818.866
[21] 1815.866 1812.866 1809.866 1806.866 1803.866
[26] 1800.866
attr(,"tzone") 
[1] "" 

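This is where the first part of my question comes in: how do I convert this difference into years and months, rounded to whole months (i.e. 4.7, 4.11, etc.)?

I read the ?difftime and ?format manual pages, but I could not figure it out.

Any help would be appreciated.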

Furthermore, my end goal is to melt the data. If I try to melt the data frame above with this command,

require(plyr) 
require(reshape) 
mdfn <- melt(dfn, id=c('Patient')) 

I get this strange error, which I have not seen before:

Error in as.POSIXct.default(value) : 
    do not know how to convert 'value' to class "POSIXct" 

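So my second question is: how do I create a time difference that I can melt together with my POSIXct variables? If I melt without dfn$ageToday, everything works like a charm.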

Thanks, Eric

Answer


The lubridate package makes working with dates and times, including finding time differences, very easy.

library("lubridate") 
library("reshape2") 

dfn <- data.frame(
    Today = Sys.time(), 
    DOB = seq(as.POSIXct('2007-03-27 00:00:01'), len= 26, by="3 day"), 
    Patient = factor(1:26, labels = LETTERS)) 

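# interval() is the current name for new_interval(), which newer lubridate releases no longer provide
dfn$diff <- interval(dfn$DOB, dfn$Today) / duration(num = 1, units = "years")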

mdfn <- melt(dfn, id=c('Patient')) 
class(mdfn$value) # all values are coerced into numeric 

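The new_interval() function (named interval() in current lubridate releases) computes the time difference between two dates. Note that there is also a today() function that you could use in place of Sys.time. Finally, note the duration() function, which creates a standard length of time; dividing the interval by it expresses the interval in that unit, which here is one year.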
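
As a rough cross-check in base R, dividing by a one-year duration amounts to dividing the elapsed days by an average year length. This is only a sketch: the fixed dates stand in for Sys.time() so the result is reproducible, the 365.25-day year is my assumption, and the names `age_days` and `age_years` are made up for illustration.

```r
# Fixed dates stand in for Sys.time() so the result is reproducible
today <- as.Date("2012-05-15")
dob   <- as.Date("2007-03-27")

# Elapsed time as a plain number of days (as.numeric drops the difftime class)
age_days <- as.numeric(difftime(today, dob, units = "days"))

# Assumption: an average year is 365.25 days long
age_years <- age_days / 365.25

print(age_days)
print(round(age_years, 2))
```

Because `age_years` is an ordinary numeric vector, a column built this way does not carry the difftime class that the original `ageToday` column had.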

If you want to keep the contents of Today and DOB, you may need to convert everything to character first and convert it back later...
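
The "convert back later" step could look roughly like the following base R sketch. It assumes the timestamps are stored with an explicit format string and a fixed time zone (UTC here, chosen only for illustration), so that the round trip is predictable; the variable names are hypothetical.

```r
# A single timestamp standing in for the DOB column
dob <- as.POSIXct("2007-03-27 00:00:01", tz = "UTC")

# To character: an explicit format keeps the time-of-day part
dob_chr <- format(dob, "%Y-%m-%d %H:%M:%S")

# Back to POSIXct, using the same time zone as before
dob_back <- as.POSIXct(dob_chr, format = "%Y-%m-%d %H:%M:%S", tz = "UTC")

print(dob_chr)
print(dob_back == dob)
```

After melting, the same as.POSIXct() call would be applied only to the rows of the value column that came from Today or DOB.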

library("lubridate") 
library("reshape2") 

dfn <- data.frame(
    Today = Sys.time(), 
    DOB = seq(as.POSIXct('2007-03-27 00:00:01'), len= 26, by="3 day"), 
    Patient = factor(1:26, labels = LETTERS)) 

# Create standard durations for a year and a month 
one.year <- duration(num = 1, units = "years") 
one.month <- duration(num = 1, units = "months") 

# Calculate the difference in years as float and integer 
# (interval() is the current name for new_interval(), which newer lubridate releases no longer provide) 
dfn$diff.years <- interval(dfn$DOB, dfn$Today)/one.year 
dfn$years <- floor(interval(dfn$DOB, dfn$Today)/one.year) 

# Round the difference to whole months, then take the remainder modulo 12 
dfn$diff.months <- round(interval(dfn$DOB, dfn$Today)/one.month) 
dfn$months <- dfn$diff.months %% 12 

# Paste the years and months together 
# I am not using the decimal point so as not to imply this is 
# a numeric representation of the difference 
dfn$y.m <- paste(dfn$years, dfn$months, sep = '|') 

# convert Today and DOB to character so as to preserve them in melting 
dfn$Today <- as.character(dfn$Today) 
dfn$DOB <- as.character(dfn$DOB) 

# melt using string representation of difference between the two dates 
dfn2 <- dfn[,c("Today", "DOB", "Patient", "y.m")] 
mdfn2 <- melt(dfn2, id=c('Patient')) 

# alternative melt using numeric representation of difference in years 
dfn3 <- dfn[,c("Today", "DOB", "Patient", "diff.years")] 
mdfn3 <- melt(dfn3, id=c('Patient')) 
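

Because the snippet above floors the years but rounds the months, the two parts can disagree close to a birthday. One possible alternative, sketched here in base R, counts completed calendar months and derives both parts from that single number. The helper name `age_years_months` is hypothetical, the dates are fixed for reproducibility, and the time of day is ignored.

```r
# Hypothetical helper: completed years and months between two dates
age_years_months <- function(dob, today) {
  dob   <- as.POSIXlt(dob)
  today <- as.POSIXlt(today)
  # Calendar months between the two dates, ignoring the day of month
  months <- (today$year - dob$year) * 12 + (today$mon - dob$mon)
  # Subtract one month if this month's "birthday" has not been reached yet
  months <- months - (today$mday < dob$mday)
  data.frame(years = months %/% 12, months = months %% 12)
}

dob <- seq(as.Date("2007-03-27"), by = "3 days", length.out = 3)
age <- age_years_months(dob, as.Date("2012-05-15"))

# Same "years|months" string representation as in the answer
age$label <- paste(age$years, age$months, sep = "|")
print(age)
```

With this approach the months part always falls between 0 and 11, and the years part only increases once a full twelve months have been completed.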

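Thank you for answering my question. It is almost there, although the months are not quite right. It shows the age as 2.96 years, where I would like this to be 3 years, and nothing after the decimal point should be larger than .11 (if that makes sense?)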


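@eric-d-brean - I have extended my second code snippet to give you some ways to approximate your goal... It should be easy to get from here to what you want. I have given you a few options. – gauden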