2016-08-05 141 views
7

I have a denormalized table like this:

RepID|Role|Status|StartDate |EndDate   |
-----|----|------|----------|----------|
10001|R1  |Active|01/01/2015|01/31/2015|
10001|R1  |Leave |02/01/2015|02/12/2015|
10001|R1  |Active|02/13/2015|02/28/2015|
10001|R2  |Active|03/01/2015|03/18/2015|
10001|R2  |Leave |03/19/2015|04/10/2015|
10001|R2  |Active|04/11/2015|05/10/2015|
10001|R1  |Active|05/11/2015|06/13/2015|
10001|R1  |Leave |06/14/2015|12/31/9998|

I am looking for normalized output like this from the data in my table:

RepID|Role|StartDate |EndDate   |
-----|----|----------|----------|
10001|R1  |01/01/2015|02/28/2015|
10001|R2  |03/01/2015|05/10/2015|
10001|R1  |05/11/2015|12/31/9998|

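I need to capture the start and end dates only when the role changes, ignoring changes in status. I have tried different approaches but could not get this output.

Any help is appreciated.

Below is the SQL I tried, but it does not help: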

SELECT T1.RepID, T1.Role, Min(T1.StartDate)  AS StartDate,  Max(T1.EndDate) AS EndDate 
FROM 
(SELECT rD1.RepID, rD1.Role, rD1.StartDate, rD1.EndDate 
FROM repDetails rD1 
INNER JOIN repDetails rD2 
    ON rD2.RepID = rD1.RepID AND rD2.StartDate = DateAdd (Day, 1, rD1.EndDate)  AND (rD2.Role = rD1.Role OR (rD2.Role IS NULL AND rD1.Role IS NULL)   OR (rD2.Role = '' AND rD1.Role = '')) 

UNION 

SELECT rD2.RepID, rD2.Role, rD2.StartDate, rD2.EndDate 
FROM repDetails rD1 
INNER JOIN repDetails rD2 
    ON rD2.RepID = rD1.RepID AND rD2.StartDate = DateAdd (Day, 1, rD1.EndDate)  AND (rD2.Role = rD1.Role OR (rD2.Role IS NULL AND rD1.Role IS NULL)   OR (rD2.Role = '' AND rD1.Role = '')) 
    ) T1 
GROUP BY T1.RepID, T1.Role 

UNION 

SELECT EP.RepID, EP.Role AS DataValue, EP.StartDate, EP.EndDate 
FROM repDetails EP 
LEFT OUTER JOIN 
(SELECT rD1.RepID, rD1.Role, rD1.StartDate, rD1.EndDate 
FROM repDetails rD1 
INNER JOIN repDetails rD2 
    ON rD2.RepID = rD1.RepID AND rD2.StartDate = DateAdd (Day, 1, rD1.EndDate)  AND (rD2.Role = rD1.Role OR (rD2.Role IS NULL AND rD1.Role IS NULL)   OR (rD2.Role = '' AND rD1.Role = '')) 

UNION 

SELECT rD2.RepID, rD2.Role , rD2.StartDate, rD2.EndDate 
FROM repDetails rD1 
INNER JOIN repDetails rD2 
    ON rD2.RepID = rD1.RepID AND rD2.StartDate = DateAdd (Day, 1, rD1.EndDate)  AND (rD2.Role = rD1.Role OR (rD2.Role IS NULL AND rD1.Role IS NULL)   OR (rD2.Role = '' AND rD1.Role = '')) 
    ) T1 
ON EP.RepID = T1.RepID AND EP.StartDate = T1.StartDate 
WHERE T1.RepID IS NULL 
+2

What approach have you tried, and what was the output? – dbmitch

+0

This would be tricky for a basic query. Maybe a SQL guru here can do it, but it would be very simple with a stored procedure. – dbmitch

+0

I can't use any stored procedures in the application. I tried the MAX and MIN functions, as in the SQL shown in the question. – Naveen

Answers

2

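The key here is to find runs of consecutive rows up to the point where the role changes. This can be done by using the lead function to compare each row's role with the next row's role, plus some running-sum logic that puts all rows of a run into the same group.

Once the rows are grouped, you just need min and max to get the start and end dates.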

with groups as (
select x.* 
,case when grp = 1 then 0 else 1 end + sum(grp) over(partition by repid order by startdate) grps 
from (select t.* 
     ,case when lead(role) over(partition by repid order by startdate) = role then 0 else 1 end grp 
     from t) x 
) 
select distinct repid,role 
,min(startdate) over(partition by repid,grps) startdt 
,max(enddate) over(partition by repid,grps) enddt 
from groups 
order by 1,3 
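
To see why the grouping expression works, it can help to print the intermediate grp and grps values for the sample rows. The sketch below is not part of the original answer. It assumes Python's built-in sqlite3 module (with SQLite 3.25 or later for window functions) instead of the asker's database, a table named t as in the query above, and dates rewritten in ISO format so that they sort correctly as text.

```python
import sqlite3

# Sample rows from the question. Dates are rewritten as ISO strings
# (an assumption) so that text comparison orders them chronologically.
ROWS = [
    (10001, "R1", "Active", "2015-01-01", "2015-01-31"),
    (10001, "R1", "Leave",  "2015-02-01", "2015-02-12"),
    (10001, "R1", "Active", "2015-02-13", "2015-02-28"),
    (10001, "R2", "Active", "2015-03-01", "2015-03-18"),
    (10001, "R2", "Leave",  "2015-03-19", "2015-04-10"),
    (10001, "R2", "Active", "2015-04-11", "2015-05-10"),
    (10001, "R1", "Active", "2015-05-11", "2015-06-13"),
    (10001, "R1", "Leave",  "2015-06-14", "9998-12-31"),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "create table t (repid int, role text, status text, startdate text, enddate text)"
)
conn.executemany("insert into t values (?, ?, ?, ?, ?)", ROWS)

# grp  = 1 on the last row of each run of identical roles, otherwise 0.
# grps = group number: the running count of run-end rows, shifted by one
#        for non-end rows so a run-end row stays in the run it closes.
flags = conn.execute("""
    select startdate, role, grp,
           case when grp = 1 then 0 else 1 end
             + sum(grp) over (partition by repid order by startdate) as grps
    from (select t.*,
                 case when lead(role) over (partition by repid order by startdate) = role
                      then 0 else 1 end as grp
          from t) x
    order by startdate
""").fetchall()

for row in flags:
    print(row)
```

On the sample data, grp comes out as 0, 0, 1, 0, 0, 1, 0, 1 and grps as 1, 1, 1, 2, 2, 2, 3, 3. Rows sharing a grps value form one stint in a role, so taking min(startdate) and max(enddate) per grps gives the three rows the question asks for.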


+0

Thank you VKP, this works great! I wasn't aware of the OVER functions! – Naveen

+0

@VKP, why did you use select distinct with over in the second statement, rather than grouping by your grps value? – Beth

+0

@Beth ..because each id and role combination can have different start and end dates.. `min() over()` or `max() over()` without `group by` returns every row in the data. To avoid that, I used `distinct`. –
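
The difference discussed in these two comments can be checked on a small example. This is only a sketch under stated assumptions: it uses Python's built-in sqlite3 module (SQLite 3.25 or later), a hypothetical table g in which the grps group number has already been computed, and ISO-formatted dates.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table g: the question's rows with the group number grps
# already attached (1, 2, 3 for the three stints).
conn.execute("create table g (repid int, role text, grps int, startdate text, enddate text)")
conn.executemany("insert into g values (?, ?, ?, ?, ?)", [
    (10001, "R1", 1, "2015-01-01", "2015-01-31"),
    (10001, "R1", 1, "2015-02-01", "2015-02-12"),
    (10001, "R1", 1, "2015-02-13", "2015-02-28"),
    (10001, "R2", 2, "2015-03-01", "2015-03-18"),
    (10001, "R2", 2, "2015-03-19", "2015-04-10"),
    (10001, "R2", 2, "2015-04-11", "2015-05-10"),
    (10001, "R1", 3, "2015-05-11", "2015-06-13"),
    (10001, "R1", 3, "2015-06-14", "9998-12-31"),
])

# Window aggregates keep one output row per input row.
all_rows = conn.execute("""
    select repid, role,
           min(startdate) over (partition by repid, grps) as startdt,
           max(enddate)   over (partition by repid, grps) as enddt
    from g
""").fetchall()

# DISTINCT collapses the repeated rows to one per group.
distinct_rows = conn.execute("""
    select distinct repid, role,
           min(startdate) over (partition by repid, grps) as startdt,
           max(enddate)   over (partition by repid, grps) as enddt
    from g
    order by 1, 3
""").fetchall()

# GROUP BY on the group number produces the same rows directly.
grouped_rows = conn.execute("""
    select repid, role, min(startdate) as startdt, max(enddate) as enddt
    from g
    group by repid, role, grps
    order by 1, 3
""").fetchall()

print(len(all_rows), len(distinct_rows), len(grouped_rows))
print(distinct_rows == grouped_rows)
```

On this data the windowed query without distinct returns 8 rows, while the distinct and group by versions both return the same 3 rows, so either form gives the requested output.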

0

Do you just want the min(start)/max(end) dates for each RepID and role? If so, try:

Select 
    repID, role, 
    min(startDate), 
    max(endDate) 
from 
    tbl 
group by 
    repID, role 

- A more verbose solution, equivalent to VKP's:

SELECT 
    repid, ROLE, grpID, 
    MIN(startdate) AS min_startDateOverRole, 
    MAX(endDate) AS max_endDateOverRole 
FROM 
    (SELECT 
     *, CASE WHEN isGrpEnd = 1 THEN 0 ELSE 1 END + 
     -- when on group end row, don't increment grpID. 
     -- Wait until start of next group 
     SUM(isGrpEnd) OVER(PARTITION BY repid ORDER BY startdate) grpID 
     -- sum(all group end rows up to this one, per rep) 
    FROM 
      (SELECT 
       *, 
       CASE WHEN LEAD(ROLE) OVER(PARTITION BY repid ORDER BY startdate) = ROLE 
         THEN 0 ELSE 1 END isGrpEnd 
      FROM t) x ) y 
GROUP BY 
    repid, ROLE, grpID 
ORDER BY 
    1,3 
+0

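Thanks for the reply, Beth!! I don't just want the min or max based on rep and role. Whenever the role changes for a rep, I need the start and end dates in one record to see how long they were in that role. – Naveen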
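
Naveen's objection to a plain GROUP BY on rep and role can be reproduced on the sample data. The following is a rough sketch, assuming Python's built-in sqlite3 module, a table named t, and ISO-formatted dates; none of these come from the original thread.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table t (repid int, role text, startdate text, enddate text)")
# The question's rows, without the status column, dates in ISO format.
conn.executemany("insert into t values (?, ?, ?, ?)", [
    (10001, "R1", "2015-01-01", "2015-01-31"),
    (10001, "R1", "2015-02-01", "2015-02-12"),
    (10001, "R1", "2015-02-13", "2015-02-28"),
    (10001, "R2", "2015-03-01", "2015-03-18"),
    (10001, "R2", "2015-03-19", "2015-04-10"),
    (10001, "R2", "2015-04-11", "2015-05-10"),
    (10001, "R1", "2015-05-11", "2015-06-13"),
    (10001, "R1", "2015-06-14", "9998-12-31"),
])

# Grouping only by rep and role merges both R1 stints into one row.
merged = conn.execute("""
    select repid, role, min(startdate), max(enddate)
    from t
    group by repid, role
    order by role
""").fetchall()

for row in merged:
    print(row)
```

This yields only two rows, with R1 spanning 2015-01-01 to 9998-12-31 and so overlapping the R2 period. That is why the answers number each consecutive run before aggregating.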