2013-01-23 185 views
2

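I have two tables with the same structure.
How can I check whether all the rows in these two tables are equal?
That is, every row in the first table exists in the other, and vice versa. Check whether two tables are equal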

Answers

0

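This is an interesting one. I don't know whether there is a better or simpler way to do it, but something like this might work:

Suppose you have two tables, t1 and t2, and each of them has two columns, c1 and c2.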

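-- Count how many times each distinct row occurs in each table.
create view t1_counts
as select c1, c2, count(*) as num
from t1
group by c1, c2;

create view t2_counts
as select c1, c2, count(*) as num
from t2
group by c1, c2;

-- Report rows whose counts differ, including rows present in only one table
-- (for those, one side of the full outer join is NULL, so num != num alone would skip them).
select coalesce(t1_counts.c1, t2_counts.c1) as c1,
       coalesce(t1_counts.c2, t2_counts.c2) as c2,
       t1_counts.num as t1_num, t2_counts.num as t2_num
from t1_counts full outer join t2_counts
  on (t1_counts.c1 = t2_counts.c1 and t1_counts.c2 = t2_counts.c2)
where t1_counts.num is null or t2_counts.num is null
   or t1_counts.num != t2_counts.num;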

If the two tables are equal, the output will be empty.
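As a rough, self-contained check of this idea, the sketch below builds two small tables in an in-memory SQLite database from Python. The table contents and the helper names `make_db` and `diff_counts` are made up for illustration. Because SQLite releases before 3.39 lack FULL OUTER JOIN, the comparison is expressed here with UNION ALL and per-table counts instead of the join; it is a swapped-in formulation that should report the same mismatched rows, not the query from the answer itself.

```python
import sqlite3

# For every distinct (c1, c2), count its occurrences in t1 and in t2,
# then keep only the rows where the two counts differ.
DIFF_SQL = """
select c1, c2, num_t1, num_t2
from (
    select c1, c2,
           sum(case when src = 't1' then 1 else 0 end) as num_t1,
           sum(case when src = 't2' then 1 else 0 end) as num_t2
    from (
        select 't1' as src, c1, c2 from t1
        union all
        select 't2' as src, c1, c2 from t2
    ) both_tables
    group by c1, c2
) counts
where num_t1 != num_t2
order by c1, c2
"""


def make_db(rows_t1, rows_t2):
    """Hypothetical helper: create t1 and t2 in memory and fill them."""
    conn = sqlite3.connect(":memory:")
    conn.execute("create table t1 (c1 integer, c2 text)")
    conn.execute("create table t2 (c1 integer, c2 text)")
    conn.executemany("insert into t1 values (?, ?)", rows_t1)
    conn.executemany("insert into t2 values (?, ?)", rows_t2)
    return conn


def diff_counts(conn):
    """Return (c1, c2, count_in_t1, count_in_t2) for each mismatched row."""
    return conn.execute(DIFF_SQL).fetchall()


# Same rows in a different order: nothing is reported.
same = make_db([(1, "a"), (2, "b"), (2, "b")], [(2, "b"), (1, "a"), (2, "b")])
print(diff_counts(same))  # -> []

# (2, 'b') occurs twice in t1 but once in t2; (3, 'c') exists only in t2.
different = make_db([(1, "a"), (2, "b"), (2, "b")], [(1, "a"), (2, "b"), (3, "c")])
print(diff_counts(different))  # -> [(2, 'b', 2, 1), (3, 'c', 0, 1)]
```

An empty result means the two tables hold the same rows with the same multiplicities; a count of 0 on one side marks a row that is missing from that table entirely.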

1

The solution from Jeff's blog is relevant to Hive: http://weblogs.sqlteam.com/jeffs/archive/2004/11/10/2737.aspx

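"The basic idea is: if we GROUP the union of the two tables on all columns, then if the two tables are identical, every group will have a COUNT(*) of 2. But for any row that is not completely matched on every column in the GROUP BY clause, the COUNT(*) will be 1, and those are the rows we want. We also need to add a column to each part of the UNION to indicate which table each row comes from; otherwise there is no way to tell which row came from which table."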
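The quoted idea can be tried out on toy data. The following is only a sketch under stated assumptions: the tables `a` and `b`, their columns `id` and `col1`, and the function `unmatched_rows` are hypothetical, and SQLite (through Python's `sqlite3` module) stands in for whatever engine is actually used. It also assumes neither table contains duplicate rows, which is the case the quoted description covers.

```python
import sqlite3

# Union both tables, tag each row with its source table, group on every
# data column, and keep the groups that occur only once.
UNMATCHED_SQL = """
SELECT MIN(table_name) AS table_name, id, col1
FROM (
    SELECT 'Table A' AS table_name, id, col1 FROM a
    UNION ALL
    SELECT 'Table B' AS table_name, id, col1 FROM b
) tmp
GROUP BY id, col1
HAVING COUNT(*) = 1
ORDER BY id, table_name
"""


def unmatched_rows(rows_a, rows_b):
    """Hypothetical helper: load both row lists and return the unmatched rows."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE a (id INTEGER, col1 TEXT)")
        conn.execute("CREATE TABLE b (id INTEGER, col1 TEXT)")
        conn.executemany("INSERT INTO a VALUES (?, ?)", rows_a)
        conn.executemany("INSERT INTO b VALUES (?, ?)", rows_b)
        return conn.execute(UNMATCHED_SQL).fetchall()
    finally:
        conn.close()


# Row 2 differs in col1, so each table's version of it is reported once.
print(unmatched_rows([(1, "x"), (2, "y")], [(1, "x"), (2, "z")]))
# -> [('Table A', 2, 'y'), ('Table B', 2, 'z')]
```

Note that a row appearing twice in one table and never in the other also produces a COUNT(*) of 2 and would be missed, which is the weakness a duplicate-handling version has to address.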

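An improved version that handles duplicates was posted as a comment: http://weblogs.sqlteam.com/jeffs/archive/2004/11/10/2737.aspx#3155 (the code is reproduced here; it was originally posted in that comment by user "Perry")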

SELECT MIN(TableName) AS TableName, ID, COL1, COL2, COL3 ...
FROM
(
    -- NDUPS is the number of times each distinct row occurs in its table
    SELECT 'Table A' AS TableName, COUNT(*) AS NDUPS, A.ID, A.COL1, A.COL2, A.COL3, ...
    FROM Table1 A
    GROUP BY ID, COL1, COL2, COL3 ...
    UNION ALL
    SELECT 'Table B' AS TableName, COUNT(*) AS NDUPS, B.ID, B.COL1, B.COL2, B.COL3, ...
    FROM Table2 B
    GROUP BY ID, COL1, COL2, COL3 ...
) tmp
-- A row with the same NDUPS in both tables forms a group of 2; anything else is a difference
GROUP BY NDUPS, ID, COL1, COL2, COL3 ...
HAVING COUNT(*) = 1
ORDER BY ID
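To see what the extra NDUPS column buys, here is a minimal sketch run against SQLite from Python. The two-column schema (`id`, `col1`), the sample rows, and the function name `duplicate_aware_diff` are assumptions chosen for illustration, not part of the original comment; the query follows the same shape as the one above with the elided column lists filled in for this toy schema only.

```python
import sqlite3

# Each table is first collapsed to one line per distinct row plus its
# duplicate count (ndups). Grouping the union on ndups and the data columns
# then pairs up rows only when both tables hold them the same number of times.
DUP_AWARE_SQL = """
SELECT MIN(table_name) AS table_name, ndups, id, col1
FROM (
    SELECT 'Table A' AS table_name, COUNT(*) AS ndups, id, col1
    FROM table1 GROUP BY id, col1
    UNION ALL
    SELECT 'Table B' AS table_name, COUNT(*) AS ndups, id, col1
    FROM table2 GROUP BY id, col1
) tmp
GROUP BY ndups, id, col1
HAVING COUNT(*) = 1
ORDER BY id, table_name
"""


def duplicate_aware_diff(rows1, rows2):
    """Hypothetical helper: return (table, ndups, id, col1) for each mismatch."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE table1 (id INTEGER, col1 TEXT)")
        conn.execute("CREATE TABLE table2 (id INTEGER, col1 TEXT)")
        conn.executemany("INSERT INTO table1 VALUES (?, ?)", rows1)
        conn.executemany("INSERT INTO table2 VALUES (?, ?)", rows2)
        return conn.execute(DUP_AWARE_SQL).fetchall()
    finally:
        conn.close()


# (1, 'x') is stored twice in table1 but only once in table2.
print(duplicate_aware_diff([(1, "x"), (1, "x"), (2, "y")], [(1, "x"), (2, "y")]))
# -> [('Table A', 2, 1, 'x'), ('Table B', 1, 1, 'x')]
```

The mismatched row shows up once per table with its own duplicate count, so the output indicates both which row differs and by how much.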
+0

Could you post a summary of each link? That way, if a link breaks, the information won't be lost. – fxm