2012-11-16 112 views
0

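Delete duplicate records in a table and keep the latest

I want to delete any duplicate records in a table and keep only the latest record (by date). In the example below, the first record would be deleted (hdate = 2012-07-01, id = 16).

Using SQL Server 2008.

Thanks.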

hdate       id   secId  pricesource  price
----------  ---  -----  -----------  -----
2012-07-01  16   126    DFLT         NULL
2012-07-02  16   126    DFLT         NULL
2012-07-01  CAD  20     DFLT         1
2012-07-01  TWD  99     DFLT         1
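
To experiment with the answers, the sample rows can be loaded into a scratch table first. This is only a sketch: the table variable name `@Prices` and every column type are assumptions, since the question does not show the table definition. The final query lists the groups that contain more than one row.

```sql
-- Hypothetical schema: the types are guesses based on the sample output.
DECLARE @Prices TABLE (
    hdate       date          NOT NULL,
    id          varchar(12)   NOT NULL,
    secId       int           NOT NULL,
    pricesource varchar(20)   NOT NULL,
    price       decimal(18,4) NULL
);

INSERT INTO @Prices (hdate, id, secId, pricesource, price)
VALUES ('2012-07-01', '16',  126, 'DFLT', NULL),
       ('2012-07-02', '16',  126, 'DFLT', NULL),
       ('2012-07-01', 'CAD', 20,  'DFLT', 1),
       ('2012-07-01', 'TWD', 99,  'DFLT', 1);

-- GROUP BY places NULL prices in a single group,
-- so the two id = 16 rows are reported as duplicates.
SELECT id, secId, pricesource, price,
       COUNT(*)   AS copies,
       MAX(hdate) AS latest_hdate
FROM @Prices
GROUP BY id, secId, pricesource, price
HAVING COUNT(*) > 1;
```

With the four sample rows, only the id = 16 group should come back, with two copies and 2012-07-02 as the latest date.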

Answers

0

In case your RDBMS doesn't support CTEs, or deleting through them (you haven't listed what you're using), here's a version for everything else:

DELETE FROM TableName as a 
WHERE EXISTS (SELECT '1' 
       FROM TableName b 
       WHERE b.id = a.id -- Plus all other 'duplicate' columns 
        AND b.hdate > a.hdate); 

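(And a modified version of Tim's Fiddle demo, although this form does not run on SQL Server, which does not accept a table alias placed directly after DELETE FROM.)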
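
On SQL Server the same EXISTS idea can be written by naming the alias right after DELETE and declaring it in the FROM clause. The following is a sketch under assumptions: the table variable `@TableName` and its column types are made up for the demo, and the "duplicate" columns are taken to be id, secId and pricesource. The price column is left out of the comparison on purpose, because `b.price = a.price` is never true when both sides are NULL.

```sql
-- Hypothetical demo table; column types are assumptions.
DECLARE @TableName TABLE (
    hdate date, id varchar(12), secId int,
    pricesource varchar(20), price decimal(18,4) NULL
);

INSERT INTO @TableName (hdate, id, secId, pricesource, price)
VALUES ('2012-07-01', '16',  126, 'DFLT', NULL),
       ('2012-07-02', '16',  126, 'DFLT', NULL),
       ('2012-07-01', 'CAD', 20,  'DFLT', 1),
       ('2012-07-01', 'TWD', 99,  'DFLT', 1);

-- T-SQL form: DELETE <alias> FROM <table> AS <alias>.
DELETE a
FROM @TableName AS a
WHERE EXISTS (SELECT 1
              FROM @TableName AS b
              WHERE b.id = a.id
                AND b.secId = a.secId
                AND b.pricesource = a.pricesource
                AND b.hdate > a.hdate);   -- a newer row exists for the same key

-- Remaining rows: the 2012-07-02 row for id 16, plus CAD and TWD.
SELECT hdate, id, secId, pricesource, price
FROM @TableName
ORDER BY id, hdate;
```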

2

With SQL Server 2005 or later, you can use a CTE with ROW_NUMBER and an appropriate OVER clause:

WITH CTE AS 
(
    SELECT hdate, id, secId, pricesource, price, 
    ROW_NUMBER() OVER (PARTITION BY id, secId, pricesource, price ORDER BY hdate DESC) AS RN 
    FROM dbo.TableName t 
) 
DELETE FROM CTE WHERE RN > 1 

Here's a Sql-Fiddle demo
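
Before running the delete, it can be worth checking which rows the CTE would remove. A minimal sketch, assuming a throwaway table variable `@TableName` with guessed column types: the CTE is the same as above, but the final statement is a SELECT instead of a DELETE.

```sql
-- Hypothetical demo table; column types are assumptions.
DECLARE @TableName TABLE (
    hdate date, id varchar(12), secId int,
    pricesource varchar(20), price decimal(18,4) NULL
);

INSERT INTO @TableName (hdate, id, secId, pricesource, price)
VALUES ('2012-07-01', '16',  126, 'DFLT', NULL),
       ('2012-07-02', '16',  126, 'DFLT', NULL),
       ('2012-07-01', 'CAD', 20,  'DFLT', 1),
       ('2012-07-01', 'TWD', 99,  'DFLT', 1);

WITH CTE AS
(
    SELECT hdate, id, secId, pricesource, price,
           ROW_NUMBER() OVER (PARTITION BY id, secId, pricesource, price
                              ORDER BY hdate DESC) AS RN
    FROM @TableName
)
-- Preview only: rows with RN > 1 are the older duplicates.
SELECT hdate, id, secId, pricesource, price, RN
FROM CTE
WHERE RN > 1;
```

For the sample data this should list just the 2012-07-01 row for id 16. Swapping the final SELECT for `DELETE FROM CTE WHERE RN > 1` then removes exactly those rows.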

+0

Nice way to handle the possible NULLs in some columns, which, going by the poster's example, should be treated as "equal".
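
The point about NULLs can be checked in isolation. In this small sketch (the derived table `v` and its two rows are invented for the demo), PARTITION BY puts both NULL prices into the same partition, while an ordinary `=` comparison on the same column does not evaluate to true.

```sql
SELECT v.hdate,
       v.price,
       -- Both rows share one partition, so RN counts 1, 2.
       ROW_NUMBER() OVER (PARTITION BY v.price ORDER BY v.hdate DESC) AS RN,
       -- NULL = NULL is UNKNOWN, so the ELSE branch is taken.
       CASE WHEN v.price = v.price THEN 'equal' ELSE 'not equal' END AS self_compare
FROM (VALUES (CAST('2012-07-01' AS date), CAST(NULL AS decimal(18,4))),
             (CAST('2012-07-02' AS date), CAST(NULL AS decimal(18,4)))
     ) AS v (hdate, price);
```

That difference is why the ROW_NUMBER approach needs no extra NULL handling, whereas a join or EXISTS on `=` does.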

0

This is not as elegant as Tim's solution, but it does not need a CTE. It also treats NULLs in the compared columns as equal.

-- On SQL Server the alias must be named after DELETE and declared in FROM.
DELETE m1
FROM MyTable m1
WHERE EXISTS (
    SELECT 1
    FROM MyTable m2
    WHERE (m2.id = m1.id OR (m2.id IS NULL AND m1.id IS NULL))
      AND (m2.secId = m1.secId OR (m2.secId IS NULL AND m1.secId IS NULL))
      AND (m2.pricesource = m1.pricesource OR (m2.pricesource IS NULL AND m1.pricesource IS NULL))
      AND (m2.price = m1.price OR (m2.price IS NULL AND m1.price IS NULL))
      AND m2.hdate > m1.hdate   -- a newer row exists with the same values
);
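
The repeated `OR (... IS NULL AND ... IS NULL)` pairs can also be expressed with INTERSECT, which treats two NULLs as matching. This is a different technique from the one in the answer, shown only as a sketch: the table variable `@MyTable` and its column types are assumptions made for the demo.

```sql
-- Hypothetical demo table; column types are assumptions.
DECLARE @MyTable TABLE (
    hdate date, id varchar(12), secId int,
    pricesource varchar(20), price decimal(18,4) NULL
);

INSERT INTO @MyTable (hdate, id, secId, pricesource, price)
VALUES ('2012-07-01', '16',  126, 'DFLT', NULL),
       ('2012-07-02', '16',  126, 'DFLT', NULL),
       ('2012-07-01', 'CAD', 20,  'DFLT', 1),
       ('2012-07-01', 'TWD', 99,  'DFLT', 1);

DELETE m1
FROM @MyTable AS m1
WHERE EXISTS (
    SELECT 1
    FROM @MyTable AS m2
    WHERE m2.hdate > m1.hdate
      -- INTERSECT returns a row only if all four values match,
      -- counting NULL and NULL as a match.
      AND EXISTS (SELECT m1.id, m1.secId, m1.pricesource, m1.price
                  INTERSECT
                  SELECT m2.id, m2.secId, m2.pricesource, m2.price)
);

-- Remaining rows: the 2012-07-02 row for id 16, plus CAD and TWD.
SELECT hdate, id, secId, pricesource, price
FROM @MyTable
ORDER BY id, hdate;
```

The trade-off is mostly readability: adding another compared column means adding one name to each SELECT list instead of another four-part predicate.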