2015-10-14 146 views

R: aggregating a data frame

If our marriage data frame looks like this:

Month Year marriage_counts 
1 Jan 2011    50 
2 Jan 2011    30 
3 Jan 2011    20 
4 Feb 2011    80 
5 Feb 2011    10 

and our business data looks like this:

Month Year 
1 Jan 2011 
2 Jan 2011 
3 Jan 2011 
4 Feb 2011 
5 Feb 2011 

then the result should be a data frame that looks like this:

Month Year marriage_count 
1 Jan 2011   100 
2 Feb 2011    90 

But I'm stuck here. Can anyone help me?

Check the `merge()` function – HubertL

Also `dplyr::group_by()` – HubertL

I'm sorry, but could you clarify? – james

Answers

In base R:

agg <- aggregate(marriage_counts ~ Month + Year, marriage, sum) 
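For anyone who wants to run the base R line end to end, here is a minimal sketch. It assumes the question's sample rows are rebuilt by hand as `marriage`; the rename and the reordering via `month.abb` are my additions, not part of the answer above. `aggregate()` keeps the original column name and returns the groups sorted by the grouping values, so both steps are only needed if you want the output to match the question exactly.

```r
# Sample data rebuilt from the question (assumed layout)
marriage <- data.frame(
  Month = c("Jan", "Jan", "Jan", "Feb", "Feb"),
  Year = 2011,
  marriage_counts = c(50, 30, 20, 80, 10)
)

# Sum marriage_counts within each Month/Year combination
agg <- aggregate(marriage_counts ~ Month + Year, data = marriage, FUN = sum)

# aggregate() keeps the input column name; rename it to match the desired output
names(agg)[names(agg) == "marriage_counts"] <- "marriage_count"

# Restore calendar order using the built-in month.abb vector
agg <- agg[order(agg$Year, match(agg$Month, month.abb)), ]
rownames(agg) <- NULL

print(agg)
```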

With dplyr:

library(dplyr) 
df_marriage %>% group_by(Month, Year) %>% 
    summarise(marriage_count = sum(marriage_counts)) 
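A self-contained version of the dplyr pipeline, as a sketch under the assumption that `df_marriage` holds the question's sample rows. The `.groups = "drop"` argument is my addition (it needs dplyr 1.0.0 or later) and simply returns an ungrouped result so later steps are not affected by leftover grouping.

```r
library(dplyr)

# Sample data rebuilt from the question (assumed layout)
df_marriage <- data.frame(
  Month = c("Jan", "Jan", "Jan", "Feb", "Feb"),
  Year = 2011,
  marriage_counts = c(50, 30, 20, 80, 10)
)

# One row per Month/Year with the summed counts
result <- df_marriage %>%
  group_by(Month, Year) %>%
  summarise(marriage_count = sum(marriage_counts), .groups = "drop")

print(result)
```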

With data.table:

data.table::setDT(marriage)[, .(marriage_count = sum(marriage_counts)) , by = .(Month, Year)] 
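One thing worth knowing about the data.table one-liner: `setDT()` converts `marriage` to a data.table in place. If you would rather leave the original data frame untouched, a sketch like the following works on a copy instead (the name `dt` and the sample data are assumptions for illustration). Note that `by` keeps groups in order of first appearance, so Jan comes before Feb here.

```r
library(data.table)

# Sample data rebuilt from the question (assumed layout)
marriage <- data.frame(
  Month = c("Jan", "Jan", "Jan", "Feb", "Feb"),
  Year = 2011,
  marriage_counts = c(50, 30, 20, 80, 10)
)

# as.data.table() makes a copy, so `marriage` stays a plain data.frame
dt <- as.data.table(marriage)

# Sum the counts per Month/Year; groups appear in first-seen order
res <- dt[, .(marriage_count = sum(marriage_counts)), by = .(Month, Year)]

print(res)
```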

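Another option, using {purrr}. slice_rows() is the equivalent of dplyr's group_by(). (In newer purrr releases these functions were moved to the separate purrrlyr package, so you may need to load that instead.)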

library(purrr) 
df_marriage <- data.frame(Month = c("Jan", "Jan", "Jan", "Feb", "Feb"), 
                          Year = 2011, 
                          marriage_counts = c(50, 30, 20, 80, 10)) 

df_marriage %>% slice_rows(c("Month", "Year")) %>% by_slice(map, sum)