2012-10-18
1

Combining aggregated variables into one variable in R

I use this code to group hundreds of dates into their respective months:

cpkmonthly <- aggregate(mydf$AVG, na.rm=TRUE, list(month=months(as.Date(mydf$DATETIME))), mean) 

This is the output in R:

> cpkmonthly 
    month   x 
1 April 0.4583167 
2 August 0.4416660 
3 July 0.4436665 
4 June 0.4435551 
5 March 0.4654443 
6 May 0.4523338 

I am looking for a way to combine the months into quarters.

Jan-March = q1 
April-June = q2 
July-Sep = q3 
Oct-Dec = q4 

Is there a way to do this?

The output should look like this:

> cpkquarterly 
    quarter   x 
1  q1 0.4583167 
2  q2 0.4416660 
3  q3 0.4436665 
4  q4 0.4435551 

Answers

6

The zoo package has a function that does this:

library(zoo) 
as.yearqtr("2012-06", "%Y-%m") 

# [1] "2012 Q2" 
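
To connect this back to the aggregation in the question, here is a minimal sketch that labels each row with its quarter and then averages per quarter. It swaps in base R's quarters() instead of zoo so that it runs without extra packages; the sample values in mydf are made up for illustration and only the column names come from the question.

```r
# Hypothetical sample data standing in for mydf from the question.
mydf <- data.frame(
  DATETIME = c("2012-03-05", "2012-04-10", "2012-05-20", "2012-07-01", "2012-08-15"),
  AVG      = c(0.40, 0.50, 0.60, 0.30, 0.50)
)

# quarters() is base R and returns "Q1" ... "Q4" for a Date.
mydf$quarter <- quarters(as.Date(mydf$DATETIME))

# Average the raw values per quarter, ignoring missing values.
cpkquarterly <- aggregate(AVG ~ quarter, data = mydf, FUN = mean, na.rm = TRUE)
cpkquarterly
```

Note that quarters() drops the year, so data spanning several years would be pooled into the same four groups; as.yearqtr() keeps the year in the label.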

Thanks! That is a very useful package. Is there also a way to declare where q1 starts? For example, I sometimes want q1 to start in July. – Jonny

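Not directly. You can change the output by mapping Q1 -> Q3 and so on. Or look at the function's source code and write your own version with shifted output. – Justin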
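
One way to get shifted quarters without touching zoo is plain modular arithmetic on the month number. The sketch below is only an illustration: fiscal_quarter and its start_month argument are hypothetical names, not part of zoo or base R.

```r
# Hypothetical helper: quarter number when the year starts in `start_month`
# (1 = January, 7 = July). Not part of zoo or base R.
fiscal_quarter <- function(dates, start_month = 1) {
  m <- as.integer(format(as.Date(dates), "%m"))  # calendar month, 1..12
  ((m - start_month) %% 12) %/% 3 + 1            # months since start, in blocks of 3
}

d <- as.Date(c("2012-07-15", "2012-10-01", "2013-01-31", "2013-06-30"))
fiscal_quarter(d, start_month = 7)  # quarters of a year that starts in July
fiscal_quarter(d)                   # ordinary calendar quarters
```

The helper returns only the quarter number; attaching a fiscal-year label would need a similar shift applied to the year.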

1

It is not entirely clear what you want:

> require(data.table) 
> cpkmonthly <- data.table(month=c("April", "August", "July","June","March","May"), 
+ x=c(0.4583167,0.4416660,0.4436665,0.4435551,0.4654443,0.4523338) 
+) 
> 
> cpkmonthly 
    month   x 
1: April 0.4583167 
2: August 0.4416660 
3: July 0.4436665 
4: June 0.4435551 
5: March 0.4654443 
6: May 0.4523338 
> 
> quart <- data.table(month=month.name,quarter=rep(1:4, each=3),key="month") 
> 
> ###if you just want each row assigned to a quarter: 
> quart[cpkmonthly] 
    month quarter   x 
1: April  2 0.4583167 
2: August  3 0.4416660 
3: July  3 0.4436665 
4: June  2 0.4435551 
5: March  1 0.4654443 
6: May  2 0.4523338 
> 
> ###if you want to aggregate in various ways: 
> 
> quart[cpkmonthly][,list(x.avg=mean(x),x.max=max(x),x.1=x[1]),by=quarter][order(quarter)] 
    quarter  x.avg  x.max  x.1 
1:  1 0.4654443 0.4654443 0.4654443 
2:  2 0.4514019 0.4583167 0.4583167 
3:  3 0.4426663 0.4436665 0.4416660 
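
For readers without data.table, roughly the same lookup can be sketched in base R. It relies on the month names being in English so that they match the built-in month.name constant, which is an assumption because months() is locale-dependent.

```r
# The monthly means from the question, as a plain data.frame.
cpkmonthly <- data.frame(
  month = c("April", "August", "July", "June", "March", "May"),
  x     = c(0.4583167, 0.4416660, 0.4436665, 0.4435551, 0.4654443, 0.4523338)
)

# month.name is a base R constant ("January" ... "December");
# match() gives the month number, and ceiling(n / 3) the quarter.
cpkmonthly$quarter <- paste0("q", ceiling(match(cpkmonthly$month, month.name) / 3))

# Average the monthly values within each quarter.
cpkquarterly <- aggregate(x ~ quarter, data = cpkmonthly, FUN = mean)
cpkquarterly
```

This averages the monthly means, so months with many observations count the same as months with few. If that matters, assign the quarter to the raw rows and aggregate those instead.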
0

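I had a similar problem, but my company uses a calendar in which quarters begin and end on irregular dates. Here is how I solved it for my own data. Note that my dataset contains more than 5 million rows, so I am using data.table instead of data.frame.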

# My data is contained in the myDT data.table. 
# Dates are contained in the date column. 

require("data.table") 

Q1FY14 <- myDT[ which(date >= "2013-02-02" & date <= "2013-05-03"), ] 
Q2FY14 <- myDT[ which(date >= "2013-05-04" & date <= "2013-08-02"), ] 
Q3FY14 <- myDT[ which(date >= "2013-08-03" & date <= "2013-11-01"), ] 
Q4FY14 <- myDT[ which(date >= "2013-11-02" & date <= "2014-01-31"), ] 
Q1FY15 <- myDT[ which(date >= "2014-02-01" & date <= "2014-05-02"), ] 

# Create new vectors. 
Q1.14 <- rep("Q1 FY14", nrow(Q1FY14)) 
Q2.14 <- rep("Q2 FY14", nrow(Q2FY14)) 
Q3.14 <- rep("Q3 FY14", nrow(Q3FY14)) 
Q4.14 <- rep("Q4 FY14", nrow(Q4FY14)) 
Q1.15 <- rep("Q1 FY15", nrow(Q1FY15)) 

# Add each new vector to its associated data.table. 
Q1FY14$quarter <- Q1.14 
Q2FY14$quarter <- Q2.14 
Q3FY14$quarter <- Q3.14 
Q4FY14$quarter <- Q4.14 
Q1FY15$quarter <- Q1.15 

# Bring it all together. 
newDT <- rbind(Q1FY14, Q2FY14) 
newDT <- rbind(newDT, Q3FY14) 
newDT <- rbind(newDT, Q4FY14) 
newDT <- rbind(newDT, Q1FY15) 

# Clean up data. 
rm(Q1FY14, Q2FY14, Q3FY14, Q4FY14, Q1FY15, Q1.14, Q2.14, Q3.14, Q4.14, Q1.15) 

This adds the correct quarter to each row. I needed a few other small adjustments to make it usable.

# Change the column order so that quarter appears next to date. 
setcolorder(newDT, c("date", "quarter", ...)) 

# Change the quarter column to factors. 
newDT$quarter <- factor(newDT$quarter)
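
Because the quarter ranges above are contiguous, the subset-and-rbind steps could probably be collapsed into a single lookup with base R's findInterval(). The following is a sketch under that assumption, not the author's method: the boundary dates are taken from the ranges above, while the dates vector is made-up sample data standing in for myDT$date.

```r
# Start date of each fiscal quarter, taken from the ranges above,
# plus the day after the last quarter ends as a closing boundary.
starts <- as.Date(c("2013-02-02", "2013-05-04", "2013-08-03",
                    "2013-11-02", "2014-02-01", "2014-05-03"))
labels <- c("Q1 FY14", "Q2 FY14", "Q3 FY14", "Q4 FY14", "Q1 FY15")

# Hypothetical sample dates standing in for myDT$date.
dates <- as.Date(c("2013-02-02", "2013-05-03", "2013-05-04",
                   "2014-01-31", "2014-05-02", "2014-05-03", "2013-01-01"))

# findInterval() returns the index of the interval each date falls in:
# 0 before the first boundary, length(starts) at or after the last one.
idx <- findInterval(as.numeric(dates), as.numeric(starts))
idx[idx < 1 | idx > length(labels)] <- NA  # outside the known quarters

quarter <- factor(labels[idx], levels = labels)
data.frame(date = dates, quarter = quarter)
```

Unlike the subsetting approach, rows outside the listed quarters are kept and get NA rather than being dropped, so they would need to be filtered afterwards if that is the desired behaviour.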