R: Rolling/sliding window and distinct count in R over a sliding number of days
I have the dataset below:
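# Draw 24 distinct transaction dates between 2016-01-01 and 2016-02-01
set.seed(1)
transaction_date <- sample(seq(as.Date('2016/01/01'), as.Date('2016/02/01'), by="day"), 24)
# Reset the seed so the product shuffle is reproducible as well
set.seed(1)
df <- data.frame("categ" = paste0("Categ", rep(1:2, 12)),
                 "prod" = sample(paste0("Prod", rep(1:3, 8))),
                 customer_id = paste0("customer ", 1:24),
                 transaction_date = transaction_date)
# Sort by category, product, date and customer (full column name categ, not the partial match cate)
df_ordered <- df[order(df$categ, df$prod, df$transaction_date, df$customer_id),]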
df_ordered
categ prod customer_id transaction_date
1 Categ1 Prod1 customer 1 2016-01-09
3 Categ1 Prod1 customer 3 2016-01-18
19 Categ1 Prod1 customer 19 2016-01-28
7 Categ1 Prod1 customer 7 2016-01-29
5 Categ1 Prod2 customer 5 2016-01-06
23 Categ1 Prod2 customer 23 2016-01-07
13 Categ1 Prod2 customer 13 2016-01-14
9 Categ1 Prod2 customer 9 2016-01-16
15 Categ1 Prod2 customer 15 2016-01-20
21 Categ1 Prod2 customer 21 2016-01-24
11 Categ1 Prod3 customer 11 2016-01-05
17 Categ1 Prod3 customer 17 2016-01-31
10 Categ2 Prod1 customer 10 2016-01-02
20 Categ2 Prod1 customer 20 2016-01-11
24 Categ2 Prod1 customer 24 2016-01-23
16 Categ2 Prod1 customer 16 2016-02-01
12 Categ2 Prod2 customer 12 2016-01-04
4 Categ2 Prod2 customer 4 2016-01-27
22 Categ2 Prod3 customer 22 2016-01-03
14 Categ2 Prod3 customer 14 2016-01-08
2 Categ2 Prod3 customer 2 2016-01-12
18 Categ2 Prod3 customer 18 2016-01-15
8 Categ2 Prod3 customer 8 2016-01-17
6 Categ2 Prod3 customer 6 2016-01-25
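I want to count unique customers over a sliding window of 12 days, starting from the first (minimum) transaction_date observed within each categ, prod group.
The window covers the 12 days before the current transaction date, and all transactions that fall into that bucket are counted. Below is the output I am trying to create. I would like to avoid a loop for this task.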
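To make the intended calculation concrete, here is a minimal base R sketch on a few rows copied from the df_ordered printout above. It rests on assumptions: the window is taken as the closed interval from 12 days before the current transaction date up to and including that date, and `df_small`, `count_window` and `n_customers` are hypothetical names introduced only for illustration. It avoids an explicit `for` loop, although `vapply` still iterates internally, so it may not be fast enough for large data.

```r
# A handful of rows copied from the df_ordered printout (Categ1 / Prod1 and Prod3)
df_small <- data.frame(
  categ = "Categ1",
  prod = c("Prod1", "Prod1", "Prod1", "Prod1", "Prod3", "Prod3"),
  customer_id = paste("customer", c(1, 3, 19, 7, 11, 17)),
  transaction_date = as.Date(c("2016-01-09", "2016-01-18", "2016-01-28",
                               "2016-01-29", "2016-01-05", "2016-01-31")),
  stringsAsFactors = FALSE
)

# Hypothetical helper: for each date, count the distinct ids whose date lies in
# [date - width, date] (both ends inclusive; this boundary choice is an assumption)
count_window <- function(dates, ids, width = 12) {
  vapply(seq_along(dates), function(i) {
    in_window <- dates >= dates[i] - width & dates <= dates[i]
    length(unique(ids[in_window]))
  }, integer(1))
}

# Apply the helper within each categ/prod group and put the results back in row order
grp <- interaction(df_small$categ, df_small$prod, drop = TRUE)
df_small$n_customers <- unsplit(
  lapply(split(df_small, grp),
         function(g) count_window(g$transaction_date, g$customer_id)),
  grp
)
df_small
```

For the Prod1 rows, the window ending on 2016-01-29 starts on 2016-01-17 and therefore contains the transactions of 2016-01-18, 2016-01-28 and 2016-01-29, giving a count of 3. If the current date should be excluded from the window, the second comparison would become `dates < dates[i]`.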
Possible duplicate of [Relative windowed running sum through data.table non-equi join](http://stackoverflow.com/questions/41007099/relative-windowed-running-sum-through-data-table-non-equi-join) – ExperimenteR