2017-10-17 37 views
1

Cumulative count over dates, discarding some values

I have an Answers table with the following schema:

CREATE TABLE Answers 
    ([id] int, [analyst_id] int, [date] date); 

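I want a "cumulative count" of how many answers each analyst has per month, discarding any answer given less than 3 months after the last (counted) answer. Given the following: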

INSERT INTO Answers 
    ([id], [analyst_id], [date]) 
VALUES 
    (1, 1, '2017/01/01'), 
    (2, 1, '2017/02/01'), -- should be discarded 
    (3, 1, '2017/03/01'), -- should be discarded 
    (4, 1, '2017/05/01'), 
    (5, 1, '2017/06/01'), -- should be discarded 
    (6, 1, '2017/07/01'), -- should be discarded 
    (7, 1, '2017/08/01'), 
    (8, 2, '2017/01/01'), 
    (9, 2, '2017/04/01'), 
    (10, 1, '2018/02/01'), 
    (11, 2, '2018/03/01'); 

The expected result is:

analyst_id | month-year | count 
------------------------------- 
1   | 01/2017 | 1 
1   | 02/2017 | 1 
1   | 03/2017 | 1 
1   | 04/2017 | 1 
1   | 05/2017 | 2 
1   | 06/2017 | 2 
1   | 07/2017 | 2 
1   | 08/2017 | 3 
1   | 09/2017 | 3 
1   | 10/2017 | 3 
1   | 11/2017 | 3 
1   | 12/2017 | 3 
2   | 01/2017 | 1 
2   | 02/2017 | 1 
2   | 03/2017 | 1 
2   | 04/2017 | 2 
2   | 05/2017 | 2 
2   | 06/2017 | 2 
2   | 07/2017 | 2 
2   | 08/2017 | 2 
2   | 09/2017 | 2 
2   | 10/2017 | 2 
2   | 11/2017 | 2 
2   | 12/2017 | 2 
1   | 01/2018 | 0 
1   | 02/2018 | 1 
1   | 03/2018 | 1 
2   | 01/2018 | 0 
2   | 02/2018 | 0 
2   | 03/2018 | 1 

The DBMS is SQL Server 2012.

Edit

I wrote this fiddle with my current partial solution: http://sqlfiddle.com/#!6/c2e82e/5

The count needs to reset every year.
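
As a cross-check on the rules stated above, here is a minimal sketch in Python (not T-SQL, and not part of the original post). It assumes the 3-month gap is measured from the last counted answer, in the same way as `DATEADD(MONTH, 3, ...)`, and that the gap carries across the year boundary while the count itself resets. The helper names `add_months`, `counted_answers` and `monthly_counts` are made up for illustration.

```python
import calendar
from datetime import date

def add_months(d, n):
    """Shift d by n months, clamping the day to the target month's length."""
    year, month0 = divmod(d.year * 12 + (d.month - 1) + n, 12)
    month = month0 + 1
    day = min(d.day, calendar.monthrange(year, month)[1])
    return date(year, month, day)

def counted_answers(dates, gap_months=3):
    """Keep an answer only if it is at least gap_months after the last kept one."""
    counted = []
    for d in sorted(dates):
        if not counted or d >= add_months(counted[-1], gap_months):
            counted.append(d)
    return counted

def monthly_counts(dates, first_month, last_month):
    """Cumulative count of kept answers per month, restarting every calendar year.

    first_month and last_month are expected to be first-of-month dates.
    """
    counted = counted_answers(dates)
    rows = []
    month = first_month
    while month <= last_month:
        n = sum(1 for d in counted
                if d.year == month.year and d.month <= month.month)
        rows.append((month.strftime("%m/%Y"), n))
        month = add_months(month, 1)
    return rows

# Sample data for analyst 1 from the question.
analyst_1 = [date(2017, m, 1) for m in (1, 2, 3, 5, 6, 7, 8)] + [date(2018, 2, 1)]
rows = monthly_counts(analyst_1, date(2017, 1, 1), date(2018, 3, 1))
print([n for _, n in rows])
# [1, 1, 1, 1, 2, 2, 2, 3, 3, 3, 3, 3, 0, 1, 1]
```

The printed counts match the analyst 1 rows of the expected result, from 01/2017 through 03/2018.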

+2

The last answer is in 2017-08. I don't understand why it would be discarded under your rules. –

+0

My bad. Poor explanation. Before the answer in August, the last accepted one was in July. The next acceptable answer would be in October. –

+1

What problem are you running into with your query? – Dinesh

Answers

2

Edit

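OK, for the updated question, what you actually need is a "dates" table (here a CTE called "D") that contains every month between the minimum and maximum dates in your Answers table. You can then essentially join the counted answers to it and use a window function to determine the count.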

DECLARE @Answers TABLE (ID INT, Analyst_ID INT, [Date] DATE); 
INSERT @Answers (ID, Analyst_ID, [Date]) 
VALUES 
    (1, 1, '2017/01/01'), 
    (2, 1, '2017/02/01'), 
    (3, 1, '2017/03/01'), 
    (4, 1, '2017/05/01'), 
    (5, 1, '2017/06/01'), 
    (6, 1, '2017/07/01'), 
    (7, 1, '2017/08/01'), 
    (8, 2, '2017/01/01'), 
    (9, 2, '2017/04/01'), 
    (10, 1, '2018/02/01'), 
    (11, 2, '2018/03/01'); 

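WITH CTE AS -- Counted answers per analyst: the first answer, then each next answer (by ID) at least 3 months after the previous counted one.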
(
    SELECT A.Analyst_ID, [Date] = MIN(A.[Date]) 
    FROM @Answers AS A 
    GROUP BY A.Analyst_ID 
    UNION ALL 
    SELECT A.Analyst_ID, A.[Date] 
    FROM 
    (
     SELECT A.Analyst_ID, A.[Date], RN = ROW_NUMBER() OVER (PARTITION BY A.Analyst_ID ORDER BY A.ID) 
     FROM @Answers AS A 
     JOIN CTE 
      ON CTE.Analyst_ID = A.Analyst_ID 
      AND DATEADD(MONTH, 3, CTE.[Date]) <= A.[Date] 
    ) AS A 
    WHERE A.RN = 1 
), 

D AS -- List of dates between minimum and maximum date in table for each analyst ID. 
(
    SELECT [Date] = DATEADD(MONTH, RN, (SELECT MIN([Date]) FROM @Answers)), 
      A.Analyst_ID 
    FROM (SELECT RN = ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) - 1 FROM sys.objects) AS O 
    CROSS JOIN (SELECT DISTINCT Analyst_ID FROM @Answers) AS A 
    WHERE RN <= (SELECT DATEDIFF(MONTH, MIN([Date]), MAX([Date])) FROM @Answers) 
) 

SELECT D.Analyst_ID, 
     [Month-Year] = FORMAT(D.[Date], 'MM/yyyy'), 
     [Count] = CASE WHEN A.[Date] IS NULL THEN 0 ELSE DENSE_RANK() OVER (PARTITION BY D.Analyst_ID, DATEPART(YEAR, A.[Date]) ORDER BY A.[Date]) END 
FROM D 
OUTER APPLY (SELECT TOP 1 * FROM CTE WHERE CTE.[Date] <= D.[Date] AND DATEDIFF(YEAR, CTE.[Date], D.[Date]) = 0 AND CTE.Analyst_ID = D.Analyst_ID ORDER BY CTE.[Date] DESC) AS A 
ORDER BY D.Analyst_ID, D.[Date]; 
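
To make the role of the "D" CTE easier to see, here is a rough Python sketch (not T-SQL, and not from the original answer) of the same idea: build one row per month between the earliest and latest answer, then cross join it with the analyst IDs. The names `month_diff` and `month_spine` are hypothetical, and the day is forced to 1 as a simplification, whereas the query keeps the day of the minimum date.

```python
from datetime import date
from itertools import product

def month_diff(start, end):
    """Month boundaries crossed, in the spirit of DATEDIFF(MONTH, start, end)."""
    return (end.year - start.year) * 12 + (end.month - start.month)

def month_spine(start, end):
    """One first-of-month date for every month from start's month to end's month."""
    base = start.year * 12 + (start.month - 1)
    return [date((base + i) // 12, (base + i) % 12 + 1, 1)
            for i in range(month_diff(start, end) + 1)]

# (analyst_id, date) pairs from the sample data.
answers = [
    (1, date(2017, 1, 1)), (1, date(2017, 2, 1)), (1, date(2017, 3, 1)),
    (1, date(2017, 5, 1)), (1, date(2017, 6, 1)), (1, date(2017, 7, 1)),
    (1, date(2017, 8, 1)), (2, date(2017, 1, 1)), (2, date(2017, 4, 1)),
    (1, date(2018, 2, 1)), (2, date(2018, 3, 1)),
]

analysts = sorted({analyst for analyst, _ in answers})
all_dates = [d for _, d in answers]

# Cross join: every analyst gets every month, even months without answers.
spine = list(product(analysts, month_spine(min(all_dates), max(all_dates))))
print(len(spine))  # 30
```

The 30 rows (2 analysts × 15 months) line up with the 30 rows of the expected result. In the query itself, the number of months that can be generated this way is bounded by the number of rows in sys.objects, since the row numbers are drawn from that view.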
+0

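I'm sorry... again, my bad. I think this is the right answer, but months can be skipped, and in that case the next answer (any date 3 months later or more) only adds 1. So if I only have answers in January and October, January would be 1, every month after that would also be 1, and then October would be 2. –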

+0

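@JonathasCosta I've edited in an updated answer based on what I think you're trying to achieve. Essentially, whatever you do, you need to fill the gaps with dates, which means creating some kind of date table (for each analyst ID). After that, the process is pretty much the same as before. Here I use DENSE_RANK() and partition by year to do the count. – ZLK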
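
A small Python illustration (not T-SQL, and not from the original thread) of why DENSE_RANK() suits the January/October case described in the comments: after gap filling, many months point at the same counted answer, and tied values must share one rank without leaving gaps. The `dense_rank` helper is a hypothetical stand-in for the window function, and only a single year is shown, so the partition by year is not modelled.

```python
from datetime import date

def dense_rank(values):
    """Rank like DENSE_RANK() OVER (ORDER BY value): ties share a rank, no gaps."""
    order = {v: i + 1 for i, v in enumerate(sorted(set(values)))}
    return [order[v] for v in values]

# Hypothetical analyst whose only counted answers are in January and October 2017.
counted = [date(2017, 1, 1), date(2017, 10, 1)]
months = [date(2017, m, 1) for m in range(1, 13)]

# For each month, take the latest counted answer on or before it
# (the role played by OUTER APPLY ... TOP 1 ... ORDER BY [Date] DESC).
latest = [max(c for c in counted if c <= m) for m in months]

print(dense_rank(latest))
# [1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2]
```

January through September stay at 1 and October through December show 2, which is the behaviour asked for. A plain row number over the same twelve rows would instead run from 1 to 12.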