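Cumulative count of unique values per ID in R
2013-12-16

I have a df that contains names and some eligibility status dates. I want to create an indicator of how many unique elig_end_dates a person has had over time. Here is my df: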

names date_of_claim elig_end_date 
1 tom 2010-01-01 2010-07-01 
2 tom 2010-05-04 2010-07-01 
3 tom 2010-06-01 2014-01-01 
4 tom 2010-10-10 2014-01-01 
5 mary 2010-03-01 2014-06-14 
6 mary 2010-05-01 2014-06-14 
7 mary 2010-08-01 2014-06-14 
8 mary 2010-11-01 2014-06-14 
9 mary 2011-01-01 2014-06-14 
10 john 2010-03-27 2011-03-01 
11 john 2010-07-01 2011-03-01 
12 john 2010-11-01 2011-03-01 
13 john 2011-02-01 2011-03-01 
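
For anyone who wants to run the code in this thread, the data frame above can be rebuilt roughly as follows. This is only a sketch: the post shows printed output, so storing the dates as character strings (rather than Date or factor) is an assumption.

```r
# Rebuild the example data frame shown above.
# Assumption: dates are kept as character strings; the original classes are unknown.
df <- data.frame(
  names = rep(c("tom", "mary", "john"), times = c(4, 5, 4)),
  date_of_claim = c("2010-01-01", "2010-05-04", "2010-06-01", "2010-10-10",
                    "2010-03-01", "2010-05-01", "2010-08-01", "2010-11-01",
                    "2011-01-01",
                    "2010-03-27", "2010-07-01", "2010-11-01", "2011-02-01"),
  elig_end_date = rep(c("2010-07-01", "2014-01-01", "2014-06-14", "2011-03-01"),
                      times = c(2, 2, 5, 4)),
  stringsAsFactors = FALSE
)

str(df)
```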

This is the output I want:

names date_of_claim elig_end_date obs 
1 tom 2010-01-01 2010-07-01 1 
2 tom 2010-05-04 2010-07-01 1 
3 tom 2010-06-01 2014-01-01 2 
4 tom 2010-10-10 2014-01-01 2 
5 mary 2010-03-01 2014-06-14 1 
6 mary 2010-05-01 2014-06-14 1 
7 mary 2010-08-01 2014-06-14 1 
8 mary 2010-11-01 2014-06-14 1 
9 mary 2011-01-01 2014-06-14 1 
10 john 2010-03-27 2011-03-01 1 
11 john 2010-07-01 2011-03-01 1 
12 john 2010-11-01 2011-03-01 1 
13 john 2011-02-01 2011-03-01 1 

I found the post "R: Count unique values by category" useful, but the answer there produces a separate table rather than a column included in the df.

I also tried this:

df$ob = ave(df$elig_end_date, df$elig_end_date, FUN=seq_along) 

But this creates a running count, and I really just want an indicator.
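
A small sketch of why that attempt produces a running count: grouping by elig_end_date itself and applying seq_along numbers the rows inside each date group, and it never looks at names. The vector below copies the dates from the example (assumed to be character strings), and row indices are passed to ave so that the result is plainly integer.

```r
# Eligibility end dates from the example, in row order (assumed character).
elig_end_date <- rep(c("2010-07-01", "2014-01-01", "2014-06-14", "2011-03-01"),
                     times = c(2, 2, 5, 4))

# Grouping by the date and applying seq_along restarts a row counter
# for every distinct date, which is not a per-person indicator.
running <- ave(seq_along(elig_end_date), elig_end_date, FUN = seq_along)
print(running)
```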

Thanks in advance.

Stephen's code (this is not the correct code, just posted as a learning point) produces:

names date_of_claim elig_end_date ob 
1 tom 2010-01-01 2010-07-01 2 
2 tom 2010-05-04 2010-07-01 2 
3 tom 2010-06-01 2014-01-01 2 
4 tom 2010-10-10 2014-01-01 2 
5 mary 2010-03-01 2014-06-14 5 
6 mary 2010-05-01 2014-06-14 5 
7 mary 2010-08-01 2014-06-14 5 
8 mary 2010-11-01 2014-06-14 5 
9 mary 2011-01-01 2014-06-14 5 
10 john 2010-03-27 2011-03-01 4 
11 john 2010-07-01 2011-03-01 4 
12 john 2010-11-01 2011-03-01 4 
13 john 2011-02-01 2011-03-01 4 

Hi, I posted a quick answer, but I'm confused by your example: the number of unique values of elig_end_date looks wrong. Am I misunderstanding?


I'll post the output of the code above so you can see it. Thanks again for your input! ;) – user2363642


Then why does Tom have 1, 1, 2, 2 in your desired output example?

Answer

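Another possibility, using ave:

# Within each name, the count goes up by one each time a not-yet-seen elig_end_date appears
df$obs <- with(df, ave(elig_end_date, names,
                       FUN = function(x) cumsum(!duplicated(x))))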

# names date_of_claim elig_end_date obs 
# 1 tom 2010-01-01 2010-07-01 1 
# 2 tom 2010-05-04 2010-07-01 1 
# 3 tom 2010-06-01 2014-01-01 2 
# 4 tom 2010-10-10 2014-01-01 2 
# 5 mary 2010-03-01 2014-06-14 1 
# 6 mary 2010-05-01 2014-06-14 1 
# 7 mary 2010-08-01 2014-06-14 1 
# 8 mary 2010-11-01 2014-06-14 1 
# 9 mary 2011-01-01 2014-06-14 1 
# 10 john 2010-03-27 2011-03-01 1 
# 11 john 2010-07-01 2011-03-01 1 
# 12 john 2010-11-01 2011-03-01 1 
# 13 john 2011-02-01 2011-03-01 1 
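
To unpack the idea behind this answer: within each name, duplicated() flags the values that have already appeared, and cumsum() over the negation counts how many distinct values have been seen so far. The sketch below walks through tom's rows and then shows a variant that passes row indices to ave. The variant is an addition, not part of the answer. It relies on ave returning a vector of the same type as its first argument, so with a character elig_end_date the original call would give the counts as strings, whereas the index form returns integers. The character dates are again an assumption.

```r
# Example columns in row order (dates assumed to be character strings).
df <- data.frame(
  names = rep(c("tom", "mary", "john"), times = c(4, 5, 4)),
  elig_end_date = rep(c("2010-07-01", "2014-01-01", "2014-06-14", "2011-03-01"),
                      times = c(2, 2, 5, 4)),
  stringsAsFactors = FALSE
)

# Step by step for tom's four rows.
x <- df$elig_end_date[df$names == "tom"]
print(duplicated(x))           # FALSE TRUE FALSE TRUE
print(cumsum(!duplicated(x)))  # 1 1 2 2

# Variant: feed row indices to ave() so the result is integer
# no matter how elig_end_date is stored.
df$obs <- ave(seq_along(df$elig_end_date), df$names,
              FUN = function(i) cumsum(!duplicated(df$elig_end_date[i])))
print(df)
```

Note that this is a cumulative count: if an earlier date reappeared later for the same person, the indicator would stay at its current value rather than drop back.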

Thank you very much, this works perfectly. – user2363642