Convert a NetCDF time variable to an R date object

2017-09-01

I have a time-series netCDF file whose time variable has typical metadata like this:

double time(time) ; 
      time:standard_name = "time" ; 
      time:bounds = "time_bnds" ; 
      time:units = "days since 1979-1-1 00:00:00" ; 
      time:calendar = "standard" ; 
      time:axis = "T" ; 

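Inside R I want to convert the time into an R date object. At the moment I do this in a hard-wired way: I read the units attribute, split the string, and use the third entry as my origin (thus assuming that the interval is "days", that the time is 00:00, and so on):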

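library(ncdf4)
f1 <- nc_open("file.nc")
time <- ncvar_get(f1, "time")                       # numeric offsets from the origin
tunits <- ncatt_get(f1, "time", attname = "units")  # e.g. "days since 1979-1-1 00:00:00"
tustr <- strsplit(tunits$value, " ")
dates <- as.Date(time, origin = unlist(tustr)[3])   # third token is the origin date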

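This hard-wired solution works for my specific example, but I was hoping there might be an R package that properly handles the UNIDATA netCDF conventions for time units and converts them safely to an R date object?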

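Please note that the newly proposed and currently in-development awesome 'stars' package will handle dates automatically; see the first blog post for an example: http://r-spatial.org/r/2017/11/23/stars1.html – AF7

Ah, I forgot to add that the ''' package seems to handle dates gracefully. Worth a try. – AF7

See the edited example in my answer. – AF7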

Answers

Not that I know of. I have this handy function using lubridate, which is basically the same as yours.

getNcTime <- function(nc) { 
    require(ncdf4) 
    require(lubridate) 
    ncdims <- names(nc$dim) # get netCDF dimension names 
    timevar <- ncdims[ncdims %in% c("time", "Time", "datetime", "Datetime", "date", "Date")][1] # first matching time dimension 
    if (is.na(timevar)) stop("Could not identify the correct time variable") 
    times <- ncvar_get(nc, timevar) 
    timeatt <- ncatt_get(nc, timevar) # get all attributes of the time variable 
    timedef <- strsplit(timeatt$units, " ")[[1]] # e.g. "days" "since" "1979-1-1" "00:00:00" 
    timeunit <- timedef[1] 
    tz <- timedef[5] 
    # Convert to numbers so that the range checks are not string comparisons 
    hms <- suppressWarnings(as.numeric(strsplit(timedef[4], ":")[[1]])) 
    if (length(hms) != 3 || anyNA(hms) || hms[1] > 24 || hms[2] > 60 || hms[3] > 60 || any(hms < 0)) { 
     warning(paste(timedef[4], "is not a valid start time. Assuming 00:00:00")) 
     timedef[4] <- "00:00:00" 
    } 
    if (! tz %in% OlsonNames()) { 
     warning(paste(tz, "is not a valid timezone. Assuming UTC")) 
     tz <- "UTC" 
    } 
    timestart <- ymd_hms(paste(timedef[3], timedef[4]), tz=tz) 
    f <- switch(tolower(timeunit), # Find the correct lubridate time function based on the unit 
     seconds=seconds, second=seconds, sec=seconds, 
     minutes=minutes, minute=minutes, min=minutes, 
     hours=hours,  hour=hours,  h=hours, 
     days=days,  day=days,  d=days, 
     months=months, month=months, m=months, 
     years=years,  year=years,  yr=years, 
     NULL # no match 
    ) 
    if (is.null(f)) stop("Could not understand the time unit format") 
    timestart + f(times) 
} 
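
The function above needs an open netCDF file, but the units-parsing step can be tried on its own with a plain units string. Here is a minimal base R sketch of that step. `cf_units_to_posixct` is a hypothetical helper name, not a function from ncdf4 or lubridate. It assumes a standard (Gregorian) calendar, UTC, and fixed-length units only (no months or years).

```r
# Hypothetical helper (not part of ncdf4 or lubridate): converts numeric offsets
# plus a units string such as "days since 1979-1-1 00:00:00" to POSIXct.
# Assumes a standard calendar, UTC, and fixed-length units only.
cf_units_to_posixct <- function(values, units_string) {
    parts <- strsplit(trimws(units_string), "\\s+")[[1]]
    if (length(parts) < 3 || tolower(parts[2]) != "since") {
        stop("Expected units of the form '<unit> since <date> [<time>]'")
    }
    # Seconds per unit; months and years are left out because their length varies
    secs_per_unit <- c(seconds = 1, second = 1, minutes = 60, minute = 60,
                       hours = 3600, hour = 3600, days = 86400, day = 86400)
    unit <- tolower(parts[1])
    if (!unit %in% names(secs_per_unit)) stop("Unsupported time unit: ", parts[1])
    # Use the time of day if one is given, otherwise midnight
    clock <- if (length(parts) >= 4) parts[4] else "00:00:00"
    origin <- as.POSIXct(paste(parts[3], clock), tz = "UTC",
                         format = "%Y-%m-%d %H:%M:%S")
    if (is.na(origin)) stop("Could not parse the origin: ", parts[3], " ", clock)
    origin + values * secs_per_unit[[unit]]
}

cf_units_to_posixct(c(0, 1, 366), "days since 1979-1-1 00:00:00")
cf_units_to_posixct(c(0, 6, 12), "hours since 2000-01-01")
```

Working in seconds from a POSIXct origin keeps sub-daily offsets intact, which `as.Date` would drop.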

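Edit: you may also want to have a look at ncdf4.helpers::nc.get.time.series

EDIT2: note that the newly proposed and currently in-development awesome stars package will handle dates automatically; see the first blog post (http://r-spatial.org/r/2017/11/23/stars1.html) for an example.

EDIT3: another way is to use the units package directly, which is what stars uses. One could do something like this (it still does not handle calendars properly, and I am not sure units can):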

getNcTime <- function(nc) { ## NEW VERSION, with the units package 
    require(units) 
    require(ncdf4) 
    options(warn=1) # print warnings as they occur 
    if (is.character(nc)) nc <- nc_open(nc) # accept a file path as well as an open file 
    ncdims <- names(nc$dim) # get netCDF dimension names 
    timevar <- ncdims[which(ncdims %in% c("time", "Time", "datetime", "Datetime", "date", "Date"))] # find candidate time variables 
    if (length(timevar) > 1) { 
     warning(paste("Found more than one time var. Using the first:", timevar[1])) 
     timevar <- timevar[1] 
    } 
    if (length(timevar)!=1) stop("Could not identify the correct time variable") 
    times <- ncvar_get(nc, timevar) # get time data 
    timeatt <- ncatt_get(nc, timevar) # get attributes 
    timeunit <- timeatt$units 
    units(times) <- as_units(timeunit) # as_units() supersedes the older make_unit() 
    as.POSIXct(times) # convert the unit-tagged values to POSIXct 
} 

Note: neither AF7's function nor SnowFrog's function handles the 'calendar = 365_day' attribute correctly, whereas 'ncdf4.helpers::nc.get.time.series' does work with a 365-day calendar! – tbc
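
To illustrate why the calendar attribute matters, here is a minimal base R sketch that maps day offsets in a 365_day (no-leap) calendar to R dates and compares the result with the standard-calendar conversion. `noleap_days_to_date` is a hypothetical helper written for this illustration. It assumes units of days, ignores any time of day, and is not how ncdf4.helpers implements it.

```r
# Hypothetical helper: maps "days since <origin>" in a 365_day (no-leap) calendar
# to R Dates. Every model year has 365 days, so 29 February never occurs.
# Fractional days are floored, i.e. the time of day is discarded.
noleap_days_to_date <- function(days, origin = "1979-01-01") {
    month_len <- c(31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31)
    month_start <- cumsum(c(0, month_len[-12]))  # days elapsed before each month
    o <- as.POSIXlt(as.Date(origin))
    # 0-based day of year of the origin in the no-leap calendar
    origin_doy <- month_start[o$mon + 1] + o$mday - 1
    total <- floor(days) + origin_doy
    year <- (o$year + 1900) + total %/% 365
    doy <- total %% 365                          # 0-based day of the model year
    month <- findInterval(doy, month_start)
    day <- doy - month_start[month] + 1
    as.Date(sprintf("%04d-%02d-%02d",
                    as.integer(year), as.integer(month), as.integer(day)))
}

days <- c(0, 59, 424, 14610)
noleap_days_to_date(days)               # no-leap interpretation
as.Date(days, origin = "1979-01-01")    # standard-calendar interpretation
```

The two interpretations drift apart by one day for every leap year that has passed since the origin, so the error grows with the length of the series.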

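I couldn't get @AF7's function to work with my files, so I wrote my own. The function below creates a vector of POSIXct dates, reading the start date, time interval, unit and length from the nc file. It works with nc files of many (but probably not every...) shapes or forms.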

ncdate <- function(nc) { 
    ncdims <- names(nc$dim) # Extract dimension names 
    timevar <- ncdims[which(ncdims %in% c("time", "Time", "datetime", "Datetime", 
              "date", "Date"))[1]] # Pick the time dimension 
    ntstep <- nc$dim[[timevar]]$len # Number of time steps 
    t <- ncvar_get(nc, timevar) # Extract the time values 
    tunits <- ncatt_get(nc, timevar, "units") # Extract the units attribute 
    tspace <- t[2] - t[1] # Spacing between the first two time steps (assumed constant) 
    tstr <- strsplit(tunits$value, " ") # Extract string components of the time unit 
    a <- unlist(tstr[1]) # e.g. "days" "since" "1979-1-1" "00:00:00" 
    uname <- a[which(a %in% c("seconds","hours","days"))[1]] # Isolate the unit, i.e. seconds, hours or days 
    if (is.na(uname)) stop("Unsupported time unit in: ", tunits$value) 
    startd <- as.POSIXct(gsub(paste(uname,'since '),'',tunits$value),format="%Y-%m-%d %H:%M:%S") ## Extract the start/origin date 
    tmulti <- switch(uname, seconds = 1, hours = 3600, days = 86400) # Seconds per unit 
    ## Give the step in seconds: the string form of "by" only accepts whole 
    ## numbers of a unit, so fractional days such as 1/24 would fail. 
    datev <- seq(from = startd + t[1]*tmulti, by = tspace*tmulti, length.out = ntstep) 
    datev 
} 
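
Thanks a lot - I borrowed some code ideas from AF7 and merged them into my R script. I wonder whether a function like this could be contributed to the ncdf4 package itself? Having something like this built in as standard would be great. –

Note that this only works for regularly spaced times, which is not necessarily the case for all NetCDF files. Why didn't my function work for you? I'll try to make it more general. – AF7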
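
The point about regular spacing can be seen with a small base R sketch. The time values below are made up for illustration, and the units are assumed to be days since 1979-01-01 in a standard calendar. Building a sequence from the first step alone gives an evenly spaced grid, whereas adding each offset to the origin keeps the gaps.

```r
origin <- as.POSIXct("1979-01-01 00:00:00", tz = "UTC")
t <- c(0, 1, 2, 5, 9)   # hypothetical time values in days, unevenly spaced

# Regular grid extrapolated from the first two time steps only
regular <- seq(from = origin + t[1] * 86400,
               by = (t[2] - t[1]) * 86400,
               length.out = length(t))

# Each time value converted on its own, so irregular gaps are preserved
direct <- origin + t * 86400

format(regular, "%Y-%m-%d")
format(direct, "%Y-%m-%d")
```

The two vectors agree while the spacing stays constant and differ from the fourth element onwards.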

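
@AF7 OK, my time steps do repeat at regular intervals. I got an error message towards the end (for 'f', I think). I'll post the error message when I'm back at my computer. – SnowFrog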