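Creating quarterly subsets in R

I have a data frame with several years of data in chronological order. It also holds other data, including names, quantities, and dates. I want to split the data frame by quarter so I can measure certain things for each quarter. For example, I want to look only at revenue for January, February, and March.

I have confirmed that the date column is a time series: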

class(data_frame$launch_date) 
>"Date"

I tried this code to get the data for the first quarter, i.e. from the first month up to and including March:

subset(data_frame, format.Date(launch_date, "%m") <= "03")

But it doesn't give me a new data frame, just this response:

<0 rows> (or 0-length row.names)

I have also tried

data_frame_q1 <- data.frame(data_frame, data_frame$launched < as.Date("2013-03-31"))

But I don't get a subset of the data frame.

Any suggestions?

Source

2014-05-21 Chef1075

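Use 'lubridate::quarter'. – Gregor

Also, just to tighten up the terminology: you have confirmed that the date column is of class "Date", which is good, but a time series is a class of its own, and not what you have. – Gregor

You're close, but you need to learn how to subset data properly.

A few comments: don't use subset. It works, but you should get used to the more "R" way of doing things. Use [ to subset a data frame. Second, if the argument to the function is a Date, you don't need to call format.Date specifically; you can call format, and R will pick the right method for you.

~~So, the reason your function doesn't work is that you are comparing character types with <=, which is not allowed. Convert them to numeric and it will work:~~. I don't know why your original doesn't work. It works for me.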

# Generate some data 
set.seed(1) 
n<-100 
data_frame<-data.frame(launch_date=as.Date(Sys.time())+runif(n,1,365)) 

subset(data_frame,as.numeric(format(launch_date, "%m"))<=3)

However, rather than using subset, try using just the [ operator:

data_frame[as.numeric(format(data_frame$launch_date, "%m"))<=3,]

This just means: return all rows where as.numeric(format(data_frame$launch_date, "%m"))<=3 is TRUE.
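One difference between the two forms that may be worth knowing, sketched here with a small made-up data frame (the missing date is an assumption for illustration): when the condition is NA, [ returns a row of NAs, while subset drops that row. Wrapping the condition in which() gives [ the same behaviour as subset.

```r
# Made-up data with one missing date
df <- data.frame(
  id = 1:3,
  launch_date = as.Date(c("2014-01-01", NA, "2014-05-01"))
)

# Logical mask: TRUE, NA, FALSE
mask <- as.numeric(format(df$launch_date, "%m")) <= 3

with_bracket <- df[mask, ]          # keeps row 1 plus an all-NA row
with_which   <- df[which(mask), ]   # keeps row 1 only
with_subset  <- subset(df, mask)    # keeps row 1 only
```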

If you want to split your data into quarters, you can make a small mapping table:
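The table definition itself is not shown here, so the following is only a guess at one way it might look: a twelve-row lookup from month number to quarter, plus a month column on the data to serve as the merge key. The names quarters.map, month, and quarter are taken from the merge call and the output in this answer; the sample dates are made up.

```r
# Made-up sample data: 100 dates spread over one year
set.seed(1)
data_frame <- data.frame(
  launch_date = as.Date("2014-05-21") + sample(1:365, 100, replace = TRUE)
)

# Month number as a column, so merge() has a shared key
data_frame$month <- as.numeric(format(data_frame$launch_date, "%m"))

# Lookup table: months 1-3 -> quarter 1, 4-6 -> quarter 2, and so on
quarters.map <- data.frame(month = 1:12, quarter = rep(1:4, each = 3))

# merge() joins on the common column "month"
merged <- merge(data_frame, quarters.map)
```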

Then just merge onto it:

head(merge(data_frame,quarters.map)) 
# month launch_date quarter 
# 1  1 2015-01-14  1 
# 2  1 2015-01-17  1 
# 3  1 2015-01-29  1 
# 4  1 2015-01-20  1 
# 5  1 2015-01-10  1 
# 6  1 2015-01-17  1

Source

2014-05-21 23:40:58 nograpes

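"_you are comparing character types with <=, which is not allowed_" - see '"02" <= "03"' and '"04" <= "03"' – thelatemail

@thelatemail OK, it looks like I was wrong. – nograpes

Though it could lead to mix-ups, e.g. '"05" … – thelatemail

Just turning my comment into an answer...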

library(lubridate) 
subset(data_frame, quarter(launch_date) == 1) 

## Using @thelatemail's data 

> subset(data_frame, quarter(launch_date) == 1) 
    id launch_date 
1 1 2014-01-01 
2 2 2014-02-01 
3 3 2014-03-01

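Though I can't figure out what is wrong with your approach either. Maybe you didn't get the column name right? At the start you use launch_date, but in your data_frame_q1 you use launched.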
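Apart from the column name, the second attempt in the question probably would not subset anyway. A small sketch, assuming the column really is launch_date and using an illustrative cutoff date: data.frame(df, condition) appends the logical vector as an extra column and keeps every row, whereas indexing with the condition keeps only the matching rows.

```r
# Made-up data: one row per month, January to May 2014
df <- data.frame(
  id = 1:5,
  launch_date = seq(as.Date("2014-01-01"), by = "month", length.out = 5)
)

cond <- df$launch_date <= as.Date("2014-03-31")

# Adds a logical column named "cond"; all 5 rows are kept
with_flag <- data.frame(df, cond)

# Keeps only the rows where cond is TRUE
q1 <- df[cond, ]
```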

Source

2014-05-21 23:33:23 Gregor

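I think the question was after '<= 3' rather than '== 3', by the way. – thelatemail

@thelatemail Actually '== 1', since these are quarters. Thanks! – Gregor

It seems to work for me, so I'm not sure what you did differently: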

data_frame <- data.frame(
id=1:5, 
launch_date=seq.Date(as.Date("2014-01-01"),as.Date("2014-05-01"),by="1 month") 
) 

# id launch_date 
#1 1 2014-01-01 
#2 2 2014-02-01 
#3 3 2014-03-01 
#4 4 2014-04-01 
#5 5 2014-05-01 

class(data_frame$launch_date) 
#[1] "Date" 

subset(data_frame, format.Date(launch_date, "%m") <= "03") 

# id launch_date 
#1 1 2014-01-01 
#2 2 2014-02-01 
#3 3 2014-03-01

Though it is probably safer to work with actual numbers:

subset(data_frame, as.numeric(format(launch_date, "%m")) <= 3) 

# id launch_date 
#1 1 2014-01-01 
#2 2 2014-02-01 
#3 3 2014-03-01
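A quick sketch of why numbers are the safer choice. Zero-padded month strings happen to sort like the numbers they represent, but the comparison is done character by character, so it goes wrong as soon as one side is not padded:

```r
# Both sides zero-padded: behaves like a numeric comparison
padded_ok <- c("02" <= "03", "04" <= "03")   # TRUE, FALSE

# Right-hand side not padded: "0" sorts before "3", so this is TRUE
unpadded <- "05" <= "3"

# Converting to numeric first gives the intended answer
numeric_cmp <- as.numeric("05") <= 3         # FALSE
```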

Source

2014-05-21 23:34:05 thelatemail

+1! For taking the time to make an example. – agstudy

I would create a new variable for the quarters.

data_frame$quarter <- quarters(data_frame$launch_date)

Then you can subset your data like this:

subset(data_frame,quarter=='Q1')

Using @thelatemail's data:

data_frame 
    id launch_date quarter 
1 1 2014-01-01  Q1 
2 2 2014-02-01  Q1 
3 3 2014-03-01  Q1 
4 4 2014-04-01  Q2 
5 5 2014-05-01  Q2 

subset(data_frame,quarter=='Q1') 
    id launch_date quarter 
1 1 2014-01-01  Q1 
2 2 2014-02-01  Q1 
3 3 2014-03-01  Q1
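Since the question mentions several years of data and revenue per quarter, here is a rough sketch of how the quarter column could be used for that. The revenue column and its values are hypothetical, as is the extra year column; split and aggregate are base R.

```r
# Made-up data spanning two years; revenue is a hypothetical column
df <- data.frame(
  launch_date = as.Date(c("2013-01-15", "2013-02-20", "2013-05-03",
                          "2014-01-10", "2014-08-30")),
  revenue = c(100, 250, 80, 40, 300)
)

df$quarter <- quarters(df$launch_date)      # "Q1", "Q2", ...
df$year    <- format(df$launch_date, "%Y")  # keeps the years apart

# One data frame per quarter, with all years pooled
by_quarter <- split(df, df$quarter)

# Total revenue for each year and quarter combination
totals <- aggregate(revenue ~ year + quarter, data = df, FUN = sum)
```

by_quarter$Q1 then holds every first-quarter row regardless of year, while totals has one row per year and quarter that actually occurs in the data.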

Source

2014-05-21 23:34:30 agstudy

Or all in one step: 'subset(data_frame, quarters(launch_date) == "Q1")' – thelatemail
