2014-01-07 60 views
2

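How do I merge entries that hold the same values? My players table has duplicates across two fields (name, wowp_id). How can I merge them? How do I delete duplicate rows in SQLite?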

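I have been looking through related questions, and the code below builds on the answers I found. It runs without errors, but the duplicates are still there. I want no duplicate names. If a name has several different wowp_id values, I would rather drop the extra wowp_ids and keep only one entry.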

import sqlite3

def sql_removeduplicates():
    con = sqlite3.connect('WOWT.sql')
    with con:
        cur = con.cursor()
        # This only lists the groups that occur more than once; nothing is deleted here.
        cur.execute("SELECT name, COUNT(*) FROM players GROUP BY name, team, wowp_id HAVING COUNT(*) > 1")
        rows = cur.fetchall()
        for row in rows:
            print(row)

My players rows:

(id, wowp_id, name, team) 
(108, 501078041, u'prazluges', None) 
(109, 507894244, u'Aidis', None) 
(110, 500742127, u'Aidis', None) 
(111, u'Aidis', u'Aidis', None) 
(112, u'Aidis', u'Aidis', None) 
(113, 500864543, u'prazluges', None) 
(114, u'Aidis', u'Aidis', None) 
(115, u'Aidis', u'Aidis', None) 
(116, u'Aidis', u'Aidis', None) 
(117, 501078041, u'satih', None) 
(118, u'Aidis', u'Aidis', None) 
+0

Also note that sqlite3 has a hidden ROWID column that can be used to delete duplicate rows; see this answer: https://stackoverflow.com/questions/8190541/deleting-duplicate-rows-from-sqlite-database – Gnudiff
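As a rough illustration of the ROWID suggestion, here is a minimal sketch using Python's sqlite3 module. The in-memory database, the table without an explicit id column, the sample rows, and the helper name remove_duplicate_rows are assumptions made for the example; they are not taken from the linked answer.

```python
import sqlite3


def remove_duplicate_rows(conn):
    """Keep the lowest rowid of every (wowp_id, name) pair and delete the rest."""
    conn.execute(
        """
        DELETE FROM players
        WHERE rowid NOT IN (
            SELECT MIN(rowid) FROM players GROUP BY wowp_id, name
        )
        """
    )
    conn.commit()


# Hypothetical table with no declared primary key; SQLite still provides rowid.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE players (wowp_id, name, team)")
conn.executemany(
    "INSERT INTO players VALUES (?, ?, ?)",
    [
        (507894244, "Aidis", None),
        (500742127, "Aidis", None),
        ("Aidis", "Aidis", None),
        ("Aidis", "Aidis", None),  # exact repeat of the previous row
    ],
)

remove_duplicate_rows(conn)
remaining = conn.execute(
    "SELECT wowp_id, name FROM players ORDER BY rowid"
).fetchall()
print(remaining)
```

Only the exact repeat is removed here, because the grouping is on both wowp_id and name.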

Answers

1

Import your results into a set of a custom class that defines __contains__ to do what you want. An example:

class players:
    def __contains__(self, item):
        return self.playersObj.name != item.name
    # Your other methods go here

Then import your rows into players instances and write them back.
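One caveat when trying this idea: Python's built-in set decides membership through __hash__ and __eq__ of its elements, not through __contains__. A minimal sketch of the set-based approach under that assumption follows; the Player class, its fields, and the sample rows are hypothetical and only mirror the columns shown in the question.

```python
class Player:
    """One row of the players table; two players are equal when their names match."""

    def __init__(self, id, wowp_id, name, team):
        self.id = id
        self.wowp_id = wowp_id
        self.name = name
        self.team = team

    def __eq__(self, other):
        if not isinstance(other, Player):
            return NotImplemented
        return self.name == other.name

    def __hash__(self):
        # Must agree with __eq__: equal players need equal hashes.
        return hash(self.name)


rows = [
    (108, 501078041, "prazluges", None),
    (109, 507894244, "Aidis", None),
    (110, 500742127, "Aidis", None),
    (111, "Aidis", "Aidis", None),
    (113, 500864543, "prazluges", None),
    (117, 501078041, "satih", None),
]

# Adding an element equal to one already in the set leaves the set unchanged,
# so the first row seen for each name is the one that survives.
unique_players = set()
for row in rows:
    unique_players.add(Player(*row))

kept_ids = sorted(player.id for player in unique_players)
print(kept_ids)
```

The surviving objects would then still have to be written back to the table, which this sketch leaves out.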

0

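It sounds like you are trying to delete duplicate WOWP_IDs for the same name. I assume you want to keep the largest WOWP_ID for each NAME. If your table has a reliable unique key (such as a primary key), the answer is very simple. If there is no such key, you can try something like this: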

import unittest
import sqlite3

class DaoTest(unittest.TestCase):
    def testDeleteDuplicates(self):
        with sqlite3.connect("WOWT.sql") as conn:
            # Rows whose WOWP_ID is not the maximum for a NAME that has several distinct ids.
            # fetchall() reads them all before any DELETE runs on the same connection.
            rowsToDelete = conn.execute('''
                SELECT PLAYERS.NAME, PLAYERS.TEAM, PLAYERS.WOWP_ID FROM PLAYERS INNER JOIN
                (
                    SELECT PLAYERS.NAME, MAX(WOWP_ID) AS MAX_ID FROM PLAYERS INNER JOIN
                    (
                        SELECT NAME, COUNT(DISTINCT WOWP_ID) AS DUP FROM PLAYERS
                        GROUP BY NAME
                        HAVING DUP > 1
                    ) DUPTABLE
                    ON PLAYERS.NAME = DUPTABLE.NAME
                    GROUP BY PLAYERS.NAME
                ) RowsToKeep
                ON PLAYERS.NAME = RowsToKeep.NAME AND PLAYERS.WOWP_ID <> MAX_ID
            ''').fetchall()
            # TEAM IS ? also matches rows where TEAM is NULL, which TEAM = ? never does.
            conn.executemany("DELETE FROM PLAYERS WHERE NAME = ? AND TEAM IS ? AND WOWP_ID = ?", rowsToDelete)
+0

This one seems a bit more complicated than the other suggestions. But thank you. – Aidis

2

You can do

DELETE FROM players 
WHERE id NOT IN 
(
    SELECT MIN(id) id 
    FROM players 
    GROUP BY wowp_id, name 
); 

Note: before proceeding with the DELETE, make sure you have a solid backup of your data.
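To see what the statement keeps before touching real data, it can be tried on a throwaway copy. The sketch below loads the rows from the question into an in-memory database and runs the same DELETE; the helper name surviving_ids and its group_by parameter are assumptions added for illustration. Grouping by name alone, instead of wowp_id and name, appears closer to what the question asks for (one entry per name).

```python
import sqlite3

# Rows copied from the question: (id, wowp_id, name, team).
SAMPLE_ROWS = [
    (108, 501078041, "prazluges", None),
    (109, 507894244, "Aidis", None),
    (110, 500742127, "Aidis", None),
    (111, "Aidis", "Aidis", None),
    (112, "Aidis", "Aidis", None),
    (113, 500864543, "prazluges", None),
    (114, "Aidis", "Aidis", None),
    (115, "Aidis", "Aidis", None),
    (116, "Aidis", "Aidis", None),
    (117, 501078041, "satih", None),
    (118, "Aidis", "Aidis", None),
]


def surviving_ids(group_by):
    """Run the DELETE on a scratch copy and return the ids that are left.

    group_by is pasted into the SQL text, so pass only hard-coded column lists.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE players (id INTEGER PRIMARY KEY, wowp_id, name, team)"
    )
    conn.executemany("INSERT INTO players VALUES (?, ?, ?, ?)", SAMPLE_ROWS)
    with conn:  # commits on success, rolls back if the DELETE raises
        conn.execute(
            "DELETE FROM players WHERE id NOT IN "
            "(SELECT MIN(id) FROM players GROUP BY " + group_by + ")"
        )
    ids = [row[0] for row in conn.execute("SELECT id FROM players ORDER BY id")]
    conn.close()
    return ids


print(surviving_ids("wowp_id, name"))  # same grouping as the answer
print(surviving_ids("name"))           # one row per name
```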

After removing the duplicates from the table, create a UNIQUE constraint so that they cannot be inserted again:

CREATE UNIQUE INDEX idx_wowp_id_name ON players(wowp_id, name);

Result:

 
| id  | wowp_id   | name      | team |
|-----|-----------|-----------|------|
| 108 | 501078041 | prazluges | None |
| 109 | 507894244 | Aidis     | None |
| 110 | 500742127 | Aidis     | None |
| 111 | Aidis     | Aidis     | None |
| 113 | 500864543 | prazluges | None |
| 117 | 501078041 | satih     | None |

Here is a SQLFiddle demo

+0

What does the UNIQUE constraint do? Why do I need it? – Aidis
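As a rough illustration of the effect of such an index: once it exists, SQLite refuses to insert a second row with the same (wowp_id, name) pair, so duplicates cannot creep back in. The sketch below uses an in-memory database and a made-up row; the variable names are assumptions for the example only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE players (id INTEGER PRIMARY KEY, wowp_id, name, team)")
conn.execute("CREATE UNIQUE INDEX idx_wowp_id_name ON players(wowp_id, name)")

insert_sql = "INSERT INTO players (wowp_id, name) VALUES (?, ?)"
conn.execute(insert_sql, (507894244, "Aidis"))

# A second insert of the same pair violates the unique index.
try:
    conn.execute(insert_sql, (507894244, "Aidis"))
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

# INSERT OR IGNORE skips the conflicting row instead of raising an error.
conn.execute(
    "INSERT OR IGNORE INTO players (wowp_id, name) VALUES (?, ?)",
    (507894244, "Aidis"),
)
row_count = conn.execute("SELECT COUNT(*) FROM players").fetchone()[0]
print(duplicate_rejected, row_count)
```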