2015-10-12 53 views
0

SQL - Filter redundant duplicates

I have a "conflicts" table containing two process IDs (ProcessIDA int, ProcessIDB int).

A unique conflict is defined when 2 processes are entered into this "conflicts" table in either order (A/B or B/A).

The conflicts table contains duplicates like the following:

[row..1] ProcessIDA = ,ProcessIDB =

[row..2] ProcessIDB = ,ProcessIDA =

What I need to do is filter out the duplicate conflicts so that I am left with only:

[row..1] ProcessIDA = ,ProcessIDB =

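Note: the number of rows in this table can vary between 5 and 50 million records. Once I can successfully filter out the duplicates, the row count will be exactly half of what it is now.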

+0

So, you want a delete query? –

+0

Just a where clause filter would be fine (if possible). I can then use that where clause to delete or to run other queries. – TheLegendaryCopyCoder

Answers

3

If you want to delete the duplicates, then:

Query

;with cte as 
(
    select *, 
    case when ProcessIDA < ProcessIDB 
    then ProcessIDA else ProcessIDB end as column1, 
    case when ProcessIDA < ProcessIDB 
    then ProcessIDB else ProcessIDA end as column2 
    from conflicts 
), 
cte2 as 
(
    select rn = row_number() over 
    (
     partition by cte.column1,cte.column2 
     order by cte.column1 
    ),* 
    from cte 
) 
delete from cte2 
where rn > 1; 

SQL Fiddle
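
To preview the result before deleting anything, the same normalize-then-number idea can be run as a plain SELECT. The following is only a sketch: the inline sample rows and the CTE names normalized and numbered are assumptions made for illustration, not part of the original answer. It orders each pair by ProcessIDA so that the surviving row is predictable; the delete query above orders by column1, which is identical for both rows of a pair, so which of the two rows it removes is not guaranteed.

```sql
-- Hypothetical sample data standing in for the real conflicts table.
;with conflicts as
(
    select *
    from (values (6, 5), (5, 6), (1, 2), (1, 3)) v (ProcessIDA, ProcessIDB)
),
normalized as
(
    -- column1 is the smaller ID and column2 the larger one,
    -- so A/B and B/A map to the same (column1, column2) pair.
    select ProcessIDA, ProcessIDB,
    case when ProcessIDA < ProcessIDB
    then ProcessIDA else ProcessIDB end as column1,
    case when ProcessIDA < ProcessIDB
    then ProcessIDB else ProcessIDA end as column2
    from conflicts
),
numbered as
(
    -- Number the rows inside each normalized pair; rn = 1 is the row to keep.
    select rn = row_number() over
    (
     partition by column1, column2
     order by ProcessIDA
    ), ProcessIDA, ProcessIDB
    from normalized
)
select ProcessIDA, ProcessIDB
from numbered
where rn = 1;
```

With this sample data the mirrored pair (6, 5) / (5, 6) collapses to the single row (5, 6), while (1, 2) and (1, 3) are kept as they are.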

1

You can do a simple self join

;WITH Conflicts AS 
(
    SELECT  * 
    FROM ( VALUES 
       (6, 5), 
       (5, 6), 
       (1, 2), 
       (1, 3) 
      ) Sample (ProcessIDA, ProcessIDB) 
) 
SELECT A.* 
FROM Conflicts A 
JOIN Conflicts B 
    ON A.ProcessIDA = B.ProcessIDB AND 
     A.ProcessIDB = B.ProcessIDA
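
Note that the join as shown returns both rows of a mirrored pair, (6, 5) and (5, 6), and drops rows that have no mirror. If the goal is a where clause that keeps exactly one row per conflict, one possible variation is sketched below. It assumes that a process never conflicts with itself (ProcessIDA <> ProcessIDB) and that the same ordered pair is not stored twice; neither assumption is confirmed by the question.

```sql
;WITH Conflicts AS
(
    -- Same sample rows as in the query above.
    SELECT  *
    FROM ( VALUES
       (6, 5),
       (5, 6),
       (1, 2),
       (1, 3)
      ) Sample (ProcessIDA, ProcessIDB)
)
SELECT A.*
FROM Conflicts A
WHERE A.ProcessIDA < A.ProcessIDB      -- row is already in smaller/larger order
   OR NOT EXISTS                       -- or it has no mirrored B/A row at all
   (
       SELECT 1
       FROM Conflicts B
       WHERE B.ProcessIDA = A.ProcessIDB AND
             B.ProcessIDB = A.ProcessIDA
   );
```

For the sample rows this keeps (5, 6), (1, 2) and (1, 3) and filters out (6, 5). If every conflict really is stored in both orders, as the question suggests, the first condition alone (ProcessIDA < ProcessIDB) should be enough, and negating the whole predicate gives the set of rows to delete.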