2017-10-11

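Filling gaps, based on events

I want to calculate customer churn based on the activities customers could have taken part in, rather than churn by date, which is the usual approach. We have events that are tied to a specific host. In my example all events are hosted by Alice, but there could be different hosts.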

Every person who follows a specific event should be placed in one category (new, active, churned or resurrected).

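New: the first time a person follows an event from a specific host.
Active: follows again (and also followed the previous event from that host).
Churned: the follower had the chance to follow, but did not.
Resurrected: a follower who had churned starts following a previously followed host again.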
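
The four definitions above can be read as a rule over two flags per follower: whether the person followed the current event, and whether they followed the host's previous event. The following is only a sketch of that reading. The table `@grid` and its columns `eventNr` and `followed` are hypothetical and filled in by hand for Bob; they are not part of the original schema. The sketch also assumes that "churned" is counted only at the first missed event, which is what the expected result for e_4 further down suggests.

```sql
-- Hypothetical helper table: one row per (follower, event) starting at the
-- follower's first followed event; followed = 1 if the event was followed.
declare @grid table (follower varchar(50), eventNr int, followed int)
insert into @grid values
    ('Bob', 1, 1), ('Bob', 2, 1), ('Bob', 3, 0), ('Bob', 4, 0), ('Bob', 5, 1)

select follower, eventNr,
       case
           when followed = 1 and prev is null then 'New'         -- no earlier event for this follower
           when followed = 1 and prev = 1     then 'Active'      -- followed this one and the previous one
           when followed = 1 and prev = 0     then 'Resurrected' -- back after a gap
           when followed = 0 and prev = 1     then 'Churned'     -- first missed event
       end as status                                            -- NULL: already churned earlier
from (
    select follower, eventNr, followed,
           LAG(followed) over (partition by follower order by eventNr) as prev
    from @grid
) g
order by follower, eventNr
```

For Bob this should yield New, Active, Churned, NULL and Resurrected for events 1 to 5, in line with the comments in the sample data.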

declare @events table (event varchar(50), host varchar(50), date date) 
declare @eventFollows table (event varchar(50), follower varchar(50)) 

insert into @events values ('e_1', 'Alice', GETDATE()) 
insert into @events values ('e_2', 'Alice', GETDATE()) 
insert into @events values ('e_3', 'Alice', GETDATE()) 
insert into @events values ('e_4', 'Alice', GETDATE()) 
insert into @events values ('e_5', 'Alice', GETDATE()) 

insert into @eventFollows values ('e_1', 'Bob') --new 
insert into @eventFollows values ('e_2', 'Bob') --active 
--Bob churned 
insert into @eventFollows values ('e_4', 'Megan') --new 
insert into @eventFollows values ('e_5', 'Bob') --resurrected 
insert into @eventFollows values ('e_5', 'Megan') --active 

select * from @events 
select * from @eventFollows 

The expected result should look like this:

select 'e_1', 1 as New, 0 as resurrected, 0 as active, 0 as churned --First time Bob follows Alice event 
union all 
select 'e_2', 0 as New, 0 as resurrected, 1 as active, 0 as churned --Bob follows the next event that Alice host (considered as Active) 
union all 
select 'e_3', 0 as New, 0 as resurrected, 0 as active, 1 as churned --Bob churns since he does not follow the next event 
union all 
select 'e_4', 1 as New, 0 as resurrected, 0 as active, 0 as churned --First time Megan follows Alice event 
union all 
select 'e_5', 0 as New, 1 as resurrected, 1 as active, 0 as churned --Second time (active) for Megan and Bob is resurrected 

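I started with a query like the one below, but the problem is that I do not get the events that a follower did not follow (but could have followed).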

select a.event, follower, date, 
    LAG (a.event,1) over (partition by a.host, ma.follower order by date) as lag, 
    LEAD (a.event,1) over (partition by a.host, ma.follower order by date) as lead, 
    LAG (a.event,1) over (partition by a.host order by date) as lagP, 
    LEAD (a.event,1) over (partition by a.host order by date) as leadP 
from @events a left join @eventFollows ma on ma.event = a.event order by host, follower, date 
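
One possible way to obtain the missing rows (events a follower could have followed but did not) is to build every (event, follower) combination per host first and then left join the actual follows. This is a sketch under assumptions, not a tested solution to the full problem: the events get distinct, made-up dates (the sample above uses GETDATE() for all of them, which makes ordering by date ambiguous), and a follower is only considered from their first followed event onward. The CTE name `firstSeen` and the column `firstDate` are introduced here for illustration.

```sql
declare @events table (event varchar(50), host varchar(50), date date)
declare @eventFollows table (event varchar(50), follower varchar(50))

-- Assumption: distinct dates so that the event order is well defined
insert into @events values
    ('e_1', 'Alice', '2017-10-01'), ('e_2', 'Alice', '2017-10-02'),
    ('e_3', 'Alice', '2017-10-03'), ('e_4', 'Alice', '2017-10-04'),
    ('e_5', 'Alice', '2017-10-05')
insert into @eventFollows values
    ('e_1', 'Bob'), ('e_2', 'Bob'), ('e_4', 'Megan'), ('e_5', 'Bob'), ('e_5', 'Megan')

;with firstSeen as
(
    -- first event date per follower and host
    select e.host, f.follower, min(e.date) as firstDate
    from @eventFollows f
    join @events e on e.event = f.event
    group by e.host, f.follower
)
select e.event, e.host, fs.follower,
       case when f.follower is null then 0 else 1 end as followed
from firstSeen fs
join @events e on e.host = fs.host and e.date >= fs.firstDate   -- every event the follower could have followed
left join @eventFollows f on f.event = e.event and f.follower = fs.follower
order by fs.follower, e.date
```

With the sample data this should return five rows for Bob (followed = 1, 1, 0, 0, 1) and two rows for Megan (1, 1), so the unfollowed events e_3 and e_4 show up for Bob.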

Any ideas?


What happens after "churned"? Do they churn once, or do they stay churned? – gbn


Is the flag per person, or a COUNT of persons? – gbn


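After you have churned you can be resurrected, and then you can churn again. In my example, Bob churned (he was absent for event 3 and event 4) but was resurrected at event 5. – corpat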

Answers

1

This may look like a somewhat indirect approach, but it can be solved by detecting islands, checking for gaps in the row numbers:

;with nrsE as 
(
    select *, ROW_NUMBER() over (order by event) rnrE from @events 
), nrs as 
(
    select f.*,host, rnrE, ROW_NUMBER() over (partition by f.follower, e.host order by f.event) rnrF 
    from nrsE e 
    join @eventFollows f on f.event = e.event 
), f as 
(
    select host, follower, min(rnrE) FirstE, max(rnrE) LastE, ROW_NUMBER() over (partition by follower, host order by rnrE - rnrF) SeqNr 
    from nrs 
    group by host, follower, rnrE - rnrF --difference between rnr-Event and rnr-Follower to detect gaps 
), stat as --from the result above on there are several options. this example uses getting a 'status' and pivoting on it 
(
    select e.event, e.host,
        case
            when f.FirstE is null then 'No participants'
            when f.LastE = e.rnrE - 1 then 'Churned'
            when rnrE = f.FirstE then case when SeqNr = 1 then 'New' else 'Resurrected' end
            else 'Active'
        end Status
    from nrsE e 
    left join f on e.rnrE between f.FirstE and f.LastE + 1 and e.host = f.host 
) 
select p.* from stat pivot(count(Status) for Status in ([New], [Resurrected], [Active], [Churned])) p 

The last 2 steps could be simplified, but obtaining a 'status' this way may be reusable in other scenarios.
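
The key step in the query above is the difference rnrE - rnrF: within a run of consecutively followed events both numbers grow by one, so the difference stays constant, and it changes whenever an event is skipped. The effect can be seen in isolation with a small sketch. The table `@follows` and its column `eventNr` are hypothetical stand-ins for the numbered events Bob followed (1, 2 and 5).

```sql
-- Hypothetical input: the event numbers (rnrE) that Bob followed
declare @follows table (follower varchar(50), eventNr int)
insert into @follows values ('Bob', 1), ('Bob', 2), ('Bob', 5)

;with numbered as
(
    select follower, eventNr,
           eventNr - ROW_NUMBER() over (partition by follower order by eventNr) as island
    from @follows
)
select follower, island, min(eventNr) as FirstE, max(eventNr) as LastE
from numbered
group by follower, island   -- one row per uninterrupted run of followed events
order by follower, FirstE
```

For Bob the difference is 0 for events 1 and 2 and 2 for event 5, so this should return two islands: events 1 to 2, and event 5 on its own. The first event after LastE of an island is where the follower churns.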

0

This gives your expected result:

SELECT 
    X.event, X.host, X.date, 
    IsNew = SUM(CASE WHEN X.FirstFollowerEvent = X.event THEN 1 ELSE 0 END), 
    IsActive = SUM(CASE WHEN X.lagFollowerEvent = X.lagEvent THEN 1 ELSE 0 END), 
    IsChurned = SUM(CASE WHEN X.follower IS NULL THEN 1 ELSE 0 END), 
    IsResurrected = SUM(CASE WHEN X.lagFollowerEvent <> X.lagEvent AND X.FirstFollowerEvent IS NOT NULL THEN 1 ELSE 0 END) 
FROM 
    (
    select 
     a.event, a.host, ma.follower, a.date, 
     FIRST_VALUE(a.event) over (partition by a.host, ma.follower order by a.date, a.event) as FirstFollowerEvent, 
     LAG (a.event,1) over (partition by a.host, ma.follower order by a.date, a.event) as lagFollowerEvent, 
     LAG (a.event,1) over (partition by a.host order by a.date, a.event) as lagEvent 
    FROM 
     @events a 
     LEFT join 
     @eventFollows ma on a.event = ma.event 
    ) X 
GROUP BY 
    X.event, X.host, X.date 
ORDER by 
    X.event, X.host, X.date 

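Thanks! However, there are a few issues. First, I needed to change IsNew to 'IsNew = SUM(CASE WHEN X.FirstFollowerEvent = X.event AND X.follower IS NOT NULL THEN 1 ELSE 0 END),'. Also, if I add that Cooper follows e_2, I expect him to churn in e_3, but the churned count for e_3 is only 1: 'insert into @eventFollows values('e_2','Cooper')'. A similar problem: if I add someone to e_1 whom I expect to churn in e_2, the churned column for e_2 is still zero, e.g. 'insert into @eventFollows values('e_1','Donald')' – corpat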


@corpat: I am about to update the answer – gbn