2016-10-03 158 views
1

R: cutting datetimes

I have a data_frame of POSIXct datetimes. I now want to create a variable that assigns these datetimes to time bands: 1 - [00:00:00, 08:00:00), 2 - [08:00:00, 17:00:00), 3 - [17:00:00, 18:30:00], 4 - [18:30:00, 00:00:00].

Here is some sample data:

library(dplyr)  # provides data_frame() (tibble() in newer releases)

df_times = data_frame(
  datetime = seq.POSIXt(
    from = as.POSIXct("2016-01-01 00:00:00", format = "%Y-%m-%d %H:%M:%S"),
    by = "min",
    length.out = 100000
  ),
  value = rnorm(100000)
)

Here is the expected output:

> df_times
# A tibble: 100,000 × 3
              datetime      value  band
                <dttm>      <dbl> <dbl>
1  2016-01-01 00:00:00  0.5855288     1
2  2016-01-01 00:01:00  0.7094660     1
3  2016-01-01 00:02:00 -0.1093033     1
4  2016-01-01 00:03:00 -0.4534972     1
5  2016-01-01 00:04:00  0.6058875     1
6  2016-01-01 00:05:00 -1.8179560     1
7  2016-01-01 00:06:00  0.6300986     1
8  2016-01-01 00:07:00 -0.2761841     1
9  2016-01-01 00:08:00 -0.2841597     1
10 2016-01-01 00:09:00 -0.9193220     1
# ... with 99,990 more rows

I have tried cut.POSIXt, but it insists on keeping track of the date. An ideal solution would use dplyr::recode or forcats::

Answers

3

Here is a solution that I think translates the intent of the question directly into code:

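library(dplyr)  # data_frame(), mutate(), %>%
library(chron)  # times()

set.seed(12345)

# create a dataset and assign each row to a time band
df_times = data_frame(
  datetime = seq.POSIXt(
    from = as.POSIXct("2016-01-01 00:00:00", format = "%Y-%m-%d %H:%M:%S"),
    by = "min",
    length.out = 100000
  ),
  value = rnorm(100000)
) %>%
  mutate(
    # time of day only, with the date dropped
    time = times(format(datetime, "%H:%M:%S")),
    # left-closed bands; include.lowest = TRUE also closes the last band on the right
    band = cut(
      time,
      breaks = times(c(
        "00:00:00",
        "08:00:00",
        "17:00:00",
        "18:30:00",
        "23:59:59"
      )),
      labels = c("1", "2", "3", "4"),
      include.lowest = TRUE,
      right = FALSE
    )
  )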
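For comparison, here is a minimal sketch of the same banding without the chron dependency. It is not part of the original answer: the names df_small and clock are made up for illustration, and it assumes that all four bands can be treated as left-closed and that the zero-padded "HH:MM:SS" strings produced by format() compare in clock order.

```r
library(dplyr)

# Hypothetical four-row sample, one timestamp per band, spread over two days.
df_small <- tibble(
  datetime = as.POSIXct(
    c("2016-01-01 07:59:00", "2016-01-01 08:00:00",
      "2016-01-02 17:00:00", "2016-01-02 18:30:00"),
    tz = "UTC"
  )
)

df_small <- df_small %>%
  mutate(
    # Time of day as text; the date part is dropped entirely.
    clock = format(datetime, "%H:%M:%S"),
    # Conditions are checked in order, so each row gets the first band it fits.
    band = case_when(
      clock < "08:00:00" ~ 1,
      clock < "17:00:00" ~ 2,
      clock < "18:30:00" ~ 3,
      TRUE               ~ 4
    )
  )

print(df_small)
```

Because format() honours the time zone attribute of the column, the bands here follow the clock time in that zone, and the result is a plain numeric column like the one in the expected output.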
+1

Where does the 'times' function come from? – mlevy

2

You could create an hour column and then cut on that:

# hours since midnight as a decimal number (computed in UTC, not the local time zone)
df_times$hour = as.numeric(df_times$datetime) %% (24*60*60)/3600
df_times$band = cut(df_times$hour, breaks=c(0,8,17,18.5,24), include.lowest=TRUE,
                    right=FALSE)
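If a numeric band is wanted rather than a factor with interval labels, base R's findInterval() may be a better fit than cut(). The following is a sketch only, using a small hypothetical vector of timestamps and a UTC time zone so that the modulo arithmetic gives the clock time directly; it treats every band as left-closed.

```r
# Hypothetical sample: one timestamp on each side of every band boundary.
datetime <- as.POSIXct(
  c("2016-01-01 00:00:00", "2016-01-01 07:59:00", "2016-01-01 08:00:00",
    "2016-01-01 16:59:00", "2016-01-01 17:00:00", "2016-01-01 18:29:00",
    "2016-01-01 18:30:00", "2016-01-01 23:59:00"),
  tz = "UTC"
)

# Hours since midnight as a decimal number (valid as clock time because tz is UTC).
hour <- as.numeric(datetime) %% (24 * 60 * 60) / 3600

# findInterval() returns the index of the left-closed interval containing each value;
# values at or above the last break (18.5) fall into band 4.
band <- findInterval(hour, c(0, 8, 17, 18.5))
print(band)
# [1] 1 1 2 2 3 3 4 4
```

For data stored in another time zone, the hour would need to be derived from the local clock time instead (for example via format()), since as.numeric() on a POSIXct value counts seconds in UTC.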
+0

Hi eipi10, thanks for this, but ideally I am looking for a more elegant/readable solution, as I already have a similar one that is cumbersome to work with. – tchakravarty

+0

I have posted an answer that I think is more compact and expressive. Comments welcome. – tchakravarty