2010-07-23 199 views
70

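Delete duplicate records in SQL Server?

Consider a table named Employee with a column named EmployeeName. The goal is to delete duplicate records based on the EmployeeName column.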

EmployeeName 
------------ 
Anand 
Anand 
Anil 
Dipak 
Anil 
Dipak 
Dipak 
Anil 

I want to delete the duplicate records using a single query.

How can this be done with T-SQL in SQL Server?

+0

You mean delete the duplicate records, right? – Sarfraz 2010-07-23 10:51:03

+0

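Could you select the distinct values together with their associated IDs, and then delete the records whose IDs are not in that selected list? – DaeMoohn 2010-07-23 10:53:45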

+1

Do you have a unique ID column? – 2010-07-23 10:54:12

Answers

158

You can do this with window functions. The query orders the duplicates by empId and deletes all but the first one.

delete x from (
    select *, rn=row_number() over (partition by EmployeeName order by empId) 
    from Employee 
) x 
where rn > 1; 

Run it as a select first to see what would be deleted:

select * 
from (
    select *, rn=row_number() over (partition by EmployeeName order by empId) 
    from Employee 
) x 
where rn > 1; 
+0

Very clever.... – 2015-10-02 21:43:15

+2

If you don't have a primary key, you can use 'ORDER BY (SELECT NULL)' http://stackoverflow.com/a/4812038 – Arithmomaniac 2016-07-01 16:57:14
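
A minimal sketch of the suggestion in the comment above, assuming the Employee table from the question has no key column at all. ORDER BY (SELECT NULL) satisfies the syntax of ROW_NUMBER() without naming a column, so which row of each group gets number 1 (and is therefore kept) is arbitrary. That is harmless only when the duplicate rows are identical in every column. The CTE name Numbered is made up for illustration.

```sql
-- Sketch: assumes Employee(EmployeeName) with no unique column.
WITH Numbered AS (
    SELECT EmployeeName,
           ROW_NUMBER() OVER (PARTITION BY EmployeeName
                              ORDER BY (SELECT NULL)) AS rn
    FROM Employee
)
-- Deleting through the CTE removes the underlying Employee rows.
DELETE FROM Numbered
WHERE rn > 1;
```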

7

You could try something like the following:

delete T1 
from MyTable T1, MyTable T2 
where T1.dupField = T2.dupField 
and T1.uniqueField > T2.uniqueField 

(this assumes that you have an integer-based unique field)

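Personally, though, I'd say you are better off preventing duplicate entries from being added to the database in the first place, rather than cleaning them up afterwards.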
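
One way to act on that advice, sketched under the assumptions that EmployeeName is meant to be unique in Employee, that its data type can be indexed, and that the existing duplicates have already been removed (otherwise the statement fails). The constraint name UQ_Employee_EmployeeName is hypothetical.

```sql
-- Sketch: run only after the existing duplicates are gone.
ALTER TABLE Employee
ADD CONSTRAINT UQ_Employee_EmployeeName UNIQUE (EmployeeName);
```

With the constraint in place, an INSERT that would create a second row with the same EmployeeName is rejected with a constraint violation error.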

+0

I don't have a unique field (ID) in my table. How can I perform the operation then? – usr021986 2010-07-24 04:20:40

27

Assuming that your Employee table also has a unique column (ID in the example below), the following will work:

delete from Employee 
where ID not in 
(
    select min(ID) 
    from Employee 
    group by EmployeeName 
); 

This keeps the row with the lowest ID for each EmployeeName.

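Edit
Re McGyver's comment - as of SQL 2012

MIN can be used with numeric, char, varchar, uniqueidentifier, or datetime columns, but not with bit columns

For 2008 R2 and earlier,

MIN can be used with numeric, char, varchar, or datetime columns, but not with bit columns (and it also doesn't work with GUIDs)

For 2008 R2 you'll need to cast the GUID to a type supported by MIN, e.g.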

delete from GuidEmployees 
where CAST(ID AS binary(16)) not in 
(
    select min(CAST(ID AS binary(16))) 
    from GuidEmployees 
    group by EmployeeName 
); 

SqlFiddle for various types in Sql 2008

SqlFiddle for various types in Sql 2012

+0

Also, in Oracle you could use "rowid" if there is no other unique identifier column. – 2010-07-23 11:13:03

+0

+1 Even if there is no ID column, one could be added as an identity field. – 2010-07-23 15:31:15
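
The comment above could be sketched like this, assuming the Employee table from the question and a hypothetical new column named ID. SQL Server assigns identity values to the existing rows when the column is added, which gives the MIN(ID) delete something to work with. GO is a batch separator understood by SSMS and sqlcmd, not a T-SQL statement. It is there so the new column exists before the DELETE is compiled.

```sql
-- Assumption: Employee has no identity column yet; "ID" is an illustrative name.
ALTER TABLE Employee ADD ID int IDENTITY(1,1) NOT NULL;
GO

-- Keep the lowest ID per name and delete the rest.
DELETE FROM Employee
WHERE ID NOT IN (
    SELECT MIN(ID)
    FROM Employee
    GROUP BY EmployeeName
);
```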

2
WITH CTE AS 
(
    SELECT EmployeeName, 
      ROW_NUMBER() OVER(PARTITION BY EmployeeName ORDER BY EmployeeName) AS R 
    FROM employee_table 
) 
DELETE CTE WHERE R > 1; 

The magic of common table expressions.

+0

SubPortal/a_horse_with_no_name - shouldn't this be selecting from an actual table? Also, ROW_NUMBER should be ROW_NUMBER() because it's a function, correct? – MacGyver 2014-02-17 07:23:39

2
DELETE 
FROM MyTable 
WHERE ID NOT IN (
    SELECT MAX(ID) 
    FROM MyTable 
    GROUP BY DuplicateColumn1, DuplicateColumn2, DuplicateColumn3) 

WITH TempUsers (FirstName, LastName, duplicateRecordCount) 
AS 
(
    SELECT FirstName, LastName, 
    ROW_NUMBER() OVER (PARTITION BY FirstName, LastName ORDER BY FirstName) AS duplicateRecordCount
    FROM dbo.Users 
) 
DELETE 
FROM TempUsers 
WHERE duplicateRecordCount > 1 
1

Try

DELETE 
FROM employee 
WHERE rowid NOT IN (SELECT MAX(rowid) FROM employee 
GROUP BY EmployeeName); 
1

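If you are looking for a way to remove duplicates but a foreign key points to the table that contains them, you can take the following approach, which uses a slow yet effective cursor.

It repoints the duplicate keys in the foreign key table.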

create table #properOlvChangeCodes(
    id int not null, 
    name nvarchar(max) not null 
) 

DECLARE @name VARCHAR(MAX); 
DECLARE @id INT; 
DECLARE @newid INT; 
DECLARE @oldid INT; 

DECLARE OLVTRCCursor CURSOR FOR SELECT id, name FROM Sales_OrderLineVersionChangeReasonCode; 
OPEN OLVTRCCursor; 
FETCH NEXT FROM OLVTRCCursor INTO @id, @name; 
WHILE @@FETCH_STATUS = 0 
BEGIN 
     -- determine if it should be replaced (is already in temptable with name) 
     if(exists(select * from #properOlvChangeCodes where name = @name)) begin
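      -- if it is, find the id of the row being kept (recorded in the temp table);
      -- looking it up in the base table with TOP 1 could return the duplicate itself
      Select top 1 @newid = id
      from #properOlvChangeCodes
      where name = @name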

      -- replace terminationreasoncodeid in olv for the new terminationreasoncodeid 
      update Sales_OrderLineVersion set ChangeReasonCodeId = @newid where ChangeReasonCodeId = @id 

      -- delete the record from the terminationreasoncode 
      delete from Sales_OrderLineVersionChangeReasonCode where Id = @id 
     end else begin 
      -- insert into temp table if new 
      insert into #properOlvChangeCodes(Id, name) 
      values(@id, @name) 
     end 

     FETCH NEXT FROM OLVTRCCursor INTO @id, @name; 
END; 
CLOSE OLVTRCCursor; 
DEALLOCATE OLVTRCCursor; 

drop table #properOlvChangeCodes 
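
For comparison, a set-based sketch of the same clean-up. This is not the cursor method above but an alternative, and it assumes the table and column names used in that code, that id is not nullable, and that name is a type that can be grouped (for example nvarchar). It keeps the lowest id for each name, whereas the cursor keeps whichever row it fetches first. The CTE name Keepers and the column alias keepId are made up for illustration.

```sql
BEGIN TRANSACTION;

-- Repoint child rows from each duplicate code to the lowest id with the same name.
WITH Keepers AS (
    SELECT id, MIN(id) OVER (PARTITION BY name) AS keepId
    FROM Sales_OrderLineVersionChangeReasonCode
)
UPDATE olv
SET ChangeReasonCodeId = k.keepId
FROM Sales_OrderLineVersion AS olv
JOIN Keepers AS k ON k.id = olv.ChangeReasonCodeId
WHERE k.id <> k.keepId;

-- Remove the duplicate codes that are no longer referenced.
DELETE FROM Sales_OrderLineVersionChangeReasonCode
WHERE id NOT IN (
    SELECT MIN(id)
    FROM Sales_OrderLineVersionChangeReasonCode
    GROUP BY name
);

COMMIT TRANSACTION;
```

Running both statements in one transaction means a failure in the delete leaves the foreign keys as they were.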
-1

Please see the deletion approach below.

Declare @Employee table (EmployeeName varchar(10)) 

Insert into @Employee values 
('Anand'),('Anand'),('Anil'),('Dipak'), 
('Anil'),('Dipak'),('Dipak'),('Anil') 

Select * from @Employee 

This creates a sample table variable named @Employee and loads it with the given data.

Delete aliasName from (
Select *, 
     ROW_NUMBER() over (Partition by EmployeeName order by EmployeeName) as rowNumber 
From @Employee) aliasName 
Where rowNumber > 1 

Select * from @Employee 

Result (the original screenshot is not preserved): one row remains for each of Anand, Anil, and Dipak.

I know this was asked six years ago; I'm posting just in case it is helpful to anyone.