2016-07-20 35 views
0

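T-SQL: How to get the last modified row from a group

I'm working with a poorly designed database that does not prevent duplicate rows as long as they have different unique identifiers.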

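In one of the tables, a given user can have attributes and a value for each attribute. Normally a user would have each attribute only once, but because of the poor design I have ended up with a lot of duplicates in the table, and now I need to clean up the mess. This happened because the CRM software does not always check whether the row already exists when we modify an employee profile, and instead creates a bunch of new rows with duplicate values.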

The following query returns the duplicated rows:

SELECT ua.ID AS LineID 
    ,ua.Modified AS LineLastModifiedDate 
    ,u.FullName AS EmployeeName 
    ,a.Name AS AttributeName 
    ,ua.value AS AttributeValue 

FROM UserAttributes AS ua 
    INNER JOIN Users AS u ON ua.userid = u.id 
    INNER JOIN Attributes AS a ON ua.AttributeID = a.ID 

WHERE EXISTS (
    SELECT NULL 
    FROM UserAttributes as ua2 
    WHERE ua2.UserID = ua.UserID 
     AND ua2.AttributeID = ua.AttributeID 
     AND ua2.ID != ua.ID 
    ) 

The result it produces is:

LineID LineLastModifiedDate EmployeeName AttributeName AttributeValue 
------ ----------------------- ------------- --------------- --------------- 
15  2016-01-01    Employee1  EmployeeNumber 15    
19  2016-07-20    Employee1  EmployeeNumber 15    
35  2016-01-01    Employee2  EmployeeSex  M    
96  2016-07-20    Employee2  EmployeeSex  M    
21  2016-03-03    Employee1  SickDays  3    
99  2016-07-10    Employee1  SickDays  5    

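What I need to accomplish, starting from this query, is: for each group of rows with the same EmployeeName and AttributeName, give me the last modified row. The expected result is: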

LineID LineLastModifiedDate EmployeeName AttributeName AttributeValue 
------ ----------------------- ------------- --------------- --------------- 
19  2016-07-20    Employee1  EmployeeNumber 15    
96  2016-07-20    Employee2  EmployeeSex  M 
99  2016-07-10    Employee1  SickDays  5       

How can I modify my query to accomplish this?

Thanks

-M

Answers

2
;WITH CTE 
AS 
(
SELECT ua.ID AS LineID 
    ,ua.Modified AS LineLastModifiedDate 
    ,u.FullName AS EmployeeName 
    ,a.Name AS AttributeName 
    ,ua.value AS AttributeValue 
    ,ROW_NUMBER() OVER (PARTITION BY ua.UserID, ua.AttributeID ORDER BY ua.Modified DESC) AS RN 
FROM UserAttributes AS ua 
    INNER JOIN Users AS u ON ua.userid = u.id 
    INNER JOIN Attributes AS a ON ua.AttributeID = a.ID 

WHERE EXISTS (
    SELECT NULL 
    FROM UserAttributes as ua2 
    WHERE ua2.UserID = ua.UserID 
     AND ua2.AttributeID = ua.AttributeID 
     AND ua2.ID != ua.ID 
    ) 
) 
SELECT * FROM cte where rn=1 
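
Since the stated goal is to clean up the duplicates, the same ROW_NUMBER() ranking could also drive a delete. The following is only a minimal sketch, assuming the table and column names from the question (UserAttributes, ID, UserID, AttributeID, Modified) and that ID is unique. The `ua.ID DESC` tie-breaker and the transaction wrapper are additions that are not part of the answer above.

```sql
BEGIN TRANSACTION;

WITH Ranked AS
(
    SELECT ua.ID
        -- RN = 1 is the most recently modified row of each user/attribute pair;
        -- ua.ID DESC is an assumed tie-breaker for rows sharing the same Modified value
        ,ROW_NUMBER() OVER (
            PARTITION BY ua.UserID, ua.AttributeID
            ORDER BY ua.Modified DESC, ua.ID DESC
        ) AS RN
    FROM UserAttributes AS ua
)
-- Deleting through the CTE removes the underlying UserAttributes rows
DELETE FROM Ranked
WHERE RN > 1;

-- Number of rows removed by the DELETE above
SELECT @@ROWCOUNT AS DeletedRows;

-- Kept as a rollback so nothing is lost by accident;
-- switch to COMMIT TRANSACTION once the count looks right
ROLLBACK TRANSACTION;
```

Rows that have no duplicates get RN = 1 and are left untouched, so the EXISTS filter from the question is not needed here.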

This actually works really well. Thanks!

0

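You can use row numbering, or an approach like the one below, where you pull out the maximum value per group and then join back to it. Presumably you can't have ties on the date.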

select ... 
from 
    UserAttributes as ua 
    inner join 
    (
    -- latest Modified value per user/attribute pair
    select 
     UserID, AttributeID, 
     max(Modified) as MaxLineLastModifiedDate 
    from UserAttributes 
    group by UserID, AttributeID 
    ) as max_ua 
     on  max_ua.UserID = ua.UserID 
      and max_ua.AttributeID = ua.AttributeID 
      and max_ua.MaxLineLastModifiedDate = ua.Modified 
    ... 
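
The remark about ties matters in practice: if two rows in the same group share the latest date, the max-and-join approach returns both of them. The sketch below illustrates this with a table variable holding two hypothetical rows (the IDs and values are invented sample data, and the column types are assumptions), and compares it with ROW_NUMBER() using ID as an assumed tie-breaker.

```sql
-- Hypothetical sample data: two rows for the same user/attribute with the same date
DECLARE @UserAttributes TABLE (
    ID INT PRIMARY KEY,
    UserID INT,
    AttributeID INT,
    Modified DATE,
    Value VARCHAR(50)
);

INSERT INTO @UserAttributes (ID, UserID, AttributeID, Modified, Value)
VALUES (21, 1, 3, '2016-07-10', '3'),
       (99, 1, 3, '2016-07-10', '5');

-- MAX + join: both rows match the maximum date, so both are returned
SELECT ua.ID, ua.Modified, ua.Value
FROM @UserAttributes AS ua
    INNER JOIN (
        SELECT UserID, AttributeID, MAX(Modified) AS MaxModified
        FROM @UserAttributes
        GROUP BY UserID, AttributeID
    ) AS max_ua
        ON max_ua.UserID = ua.UserID
        AND max_ua.AttributeID = ua.AttributeID
        AND max_ua.MaxModified = ua.Modified;

-- ROW_NUMBER with ID as tie-breaker: exactly one row per group
SELECT ranked.ID, ranked.Modified, ranked.Value
FROM (
    SELECT ua.ID, ua.Modified, ua.Value
        ,ROW_NUMBER() OVER (
            PARTITION BY ua.UserID, ua.AttributeID
            ORDER BY ua.Modified DESC, ua.ID DESC
        ) AS RN
    FROM @UserAttributes AS ua
) AS ranked
WHERE ranked.RN = 1;
```

If Modified also stores a time component, ties are less likely but still possible, so a deterministic tie-breaker is the safer choice whenever exactly one row per group is required.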