2015-05-27 19 views

MySQL: daily count of new users vs. returning users (cohort analysis)

The table structure is: user_id, date (I am working with timestamps).

For example:

user id | Date (TS) 
A  | '2014-08-10 14:02:53' 
A  | '2014-08-12 14:03:25' 
A  | '2014-08-13 14:04:47' 
B  | '2014-08-13 04:04:47' 
... 

and for the following week I have:

user id | Date (TS) 
A  | '2014-08-17 09:02:53'  
B  | '2014-08-17 10:04:47' 
B  | '2014-08-18 10:04:47' 
A  | '2014-08-19 10:04:22' 
C  | '2014-08-19 11:04:47' 
... 

and for today I have:

user id | Date (TS) 
A  | '2015-05-27 09:02:53'  
B  | '2015-05-27 10:04:47' 
C  | '2015-05-27 10:04:22' 
D  | '2015-05-27 17:04:47' 

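I need to know how to write a single query that finds, for each day, the number of users who are "returning" users, counted from the start of their activity.

Expected result: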

date       | New users | Returning users
2014-08-10 | 1         | 0
2014-08-11 | 0         | 0
2014-08-12 | 0         | 1 (A was active on 08/10)
2014-08-13 | 1         | 1 (A was active on 08/10 & 08/12)
...
2014-08-17 | 0         | 2 (A & B were already active)
2014-08-18 | 0         | 1
2014-08-19 | 1         | 1
...
2015-05-27 | 1         | 3 (D is a new user)
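
The definition implied by this table can be spelled out in a few lines of Python. This is only a reference sketch, not a query: it assumes a user is "new" on the first day they ever appear and "returning" on any later day they are active, and it works on dates rather than full timestamps. The function name `daily_new_vs_returning` is made up for illustration.

```python
from collections import defaultdict
from datetime import date, timedelta

# Sample rows (user_id, activity date) from the first week shown above.
rows = [
    ("A", date(2014, 8, 10)),
    ("A", date(2014, 8, 12)),
    ("A", date(2014, 8, 13)),
    ("B", date(2014, 8, 13)),
]

def daily_new_vs_returning(rows):
    """Return {day: (new_users, returning_users)} for every day in the range."""
    first_seen = {}             # user -> first day with any activity
    active = defaultdict(set)   # day  -> set of users active on that day
    for user, day in rows:
        active[day].add(user)
        if user not in first_seen or day < first_seen[user]:
            first_seen[user] = day

    result = {}
    day, last = min(active), max(active)
    while day <= last:          # walk every calendar day, including empty ones
        users = active.get(day, set())
        new = sum(1 for u in users if first_seen[u] == day)
        result[day] = (new, len(users) - new)
        day += timedelta(days=1)
    return result

counts = daily_new_vs_returning(rows)
for day, (new, returning) in counts.items():
    print(day, new, returning)
```

Days with no activity show up with both counts at 0, which is what the 2014-08-11 row asks for.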

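After a long search, I found some material provided by spencer7593 (https://meta.stackoverflow.com/users/107744/spencer7593) here: "Weekly Active Users for each day from log", but I did not manage to adapt his query so that it outputs my expected result.

Thanks for your help.

Answers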


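Assuming you have a date table somewhere (and using T-SQL syntax, because I know it better...), the key is to compute each user's MinDate separately, compute the total number of distinct users active on each day, and then simply declare the returning users to be those who are not new users: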

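-- "Table" and "DateTable" are placeholders for the activity table and a calendar table. 
-- COALESCE turns the NULLs produced by the LEFT JOINs (days with no matching rows) into 0. 
SELECT DateTable.Date, 
     COALESCE(NewUsers, 0) AS NewUsers, 
     COALESCE(NumUsers, 0) - COALESCE(NewUsers, 0) AS ReturningUsers 
FROM 
DateTable 
    LEFT JOIN 
     (
     -- B: number of users whose first-ever activity falls on each day 
     SELECT MinDate, COUNT(user_id) AS NewUsers 
     FROM (
       SELECT user_id, min(CAST(date AS Date)) as MinDate 
       FROM Table 
       GROUP BY user_id 
      ) A 
     GROUP BY MinDate 
     ) B ON DateTable.Date = B.MinDate 
    LEFT JOIN 
     (
     -- C: number of distinct users active on each day 
     SELECT CAST(date AS Date) AS Date, COUNT(DISTINCT user_id) AS NumUsers 
     FROM Table 
     GROUP BY CAST(date AS Date) 
     ) C ON DateTable.Date = C.Date 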
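
To check the idea against the first week of the question's sample data, here is a minimal sketch that runs the same two-subquery join in an in-memory SQLite database through Python's `sqlite3` module. Several things are assumptions made for illustration: SQLite stands in for MySQL/T-SQL, the activity table is called `Stats` with columns `user_id` and `Created`, and the calendar table `DateTable` has a single `Day` column filled from Python.

```python
import sqlite3
from datetime import date, timedelta

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Stats (user_id TEXT, Created TEXT);
INSERT INTO Stats VALUES
  ('A', '2014-08-10 14:02:53'),
  ('A', '2014-08-12 14:03:25'),
  ('A', '2014-08-13 14:04:47'),
  ('B', '2014-08-13 04:04:47');
CREATE TABLE DateTable (Day TEXT);
""")

# Fill the (hypothetical) calendar table with one row per day, gaps included.
start, end = date(2014, 8, 10), date(2014, 8, 13)
days = [(start + timedelta(days=i)).isoformat()
        for i in range((end - start).days + 1)]
conn.executemany("INSERT INTO DateTable (Day) VALUES (?)", [(d,) for d in days])

query = """
SELECT DateTable.Day,
       COALESCE(B.NewUsers, 0) AS NewUsers,
       COALESCE(C.NumUsers, 0) - COALESCE(B.NewUsers, 0) AS ReturningUsers
FROM DateTable
LEFT JOIN (
    -- users grouped by the day of their first-ever activity
    SELECT MinDate, COUNT(user_id) AS NewUsers
    FROM (SELECT user_id, MIN(DATE(Created)) AS MinDate
          FROM Stats GROUP BY user_id) A
    GROUP BY MinDate
) B ON DateTable.Day = B.MinDate
LEFT JOIN (
    -- distinct users active on each day
    SELECT DATE(Created) AS Day, COUNT(DISTINCT user_id) AS NumUsers
    FROM Stats GROUP BY DATE(Created)
) C ON DateTable.Day = C.Day
ORDER BY DateTable.Day
"""
daily = conn.execute(query).fetchall()
for row in daily:
    print(row)
```

Because the calendar table drives the join, 2014-08-11 appears with 0 new and 0 returning users even though nobody was active that day.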

Thanks Stephen. I made a short fix to his query and it works fine, even though it takes a lot of time on a large database:

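SELECT 
    DATE(Stats.Created) AS Date, 
    -- MAX() keeps the query valid under ONLY_FULL_GROUP_BY; 
    -- COALESCE turns "no new users that day" (NULL from the LEFT JOIN) into 0 
    COALESCE(MAX(NewUsers), 0) AS NewUsers, 
    MAX(NumUsers) - COALESCE(MAX(NewUsers), 0) AS ReturningUsers 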
FROM 
    Stats 
LEFT JOIN 
    (
     SELECT 
      MinDate, 
      COUNT(user_id) AS NewUsers 
     FROM (
      SELECT 
       user_id, 
       MIN(DATE(Created)) as MinDate 
      FROM Stats 
      GROUP BY user_id 
     ) A 
     GROUP BY MinDate 
    ) B 
ON DATE(Stats.Created) = B.MinDate 
LEFT JOIN 
    (
     SELECT 
      DATE(Created) AS Date, 
      COUNT(DISTINCT user_id) AS NumUsers 
     FROM Stats 
     GROUP BY DATE(Created) 
    ) C 
ON DATE(Stats.Created) = C.Date 
GROUP BY DATE(Stats.Created)
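
One possible way to reduce the work is to avoid joining every raw row of Stats against the two aggregates. The sketch below is not the query from either answer: it joins one row per (user, day) to that user's first day and counts both columns in a single GROUP BY. It is run here on SQLite via Python's `sqlite3` only so that it can be checked against the question's sample data. The aliases `d`, `f` and `Day` are made up, and whether this is actually faster on a large MySQL table would need to be measured (an index on `user_id, Created` is a likely prerequisite).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Stats (user_id TEXT, Created TEXT);
INSERT INTO Stats VALUES
  ('A', '2014-08-10 14:02:53'), ('A', '2014-08-12 14:03:25'),
  ('A', '2014-08-13 14:04:47'), ('B', '2014-08-13 04:04:47'),
  ('A', '2014-08-17 09:02:53'), ('B', '2014-08-17 10:04:47'),
  ('B', '2014-08-18 10:04:47'), ('A', '2014-08-19 10:04:22'),
  ('C', '2014-08-19 11:04:47'),
  ('A', '2015-05-27 09:02:53'), ('B', '2015-05-27 10:04:47'),
  ('C', '2015-05-27 10:04:22'), ('D', '2015-05-27 17:04:47');
""")

query = """
SELECT d.Day,
       SUM(d.Day = f.MinDate) AS NewUsers,        -- first-ever day for that user
       SUM(d.Day > f.MinDate) AS ReturningUsers   -- user was seen on an earlier day
FROM (SELECT DISTINCT user_id, DATE(Created) AS Day FROM Stats) d
JOIN (SELECT user_id, MIN(DATE(Created)) AS MinDate
      FROM Stats GROUP BY user_id) f
  ON f.user_id = d.user_id
GROUP BY d.Day
ORDER BY d.Day
"""
per_day = conn.execute(query).fetchall()
for row in per_day:
    print(row)
```

Like the query above, this only lists days that have at least one row in Stats; to get the all-zero rows for empty days it would still have to be LEFT JOINed from a calendar table.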