2012-10-02 18 views
2


SQL Server: joining multiple data sets without duplicating data

Consider three tables Ta, Tb, Tc:

Ta(ID, Field1) 
Tb(ID, Field2) 
Tc(ID, Field3) 

Given data such as:

Ta 
ID Field1 
--------- 
1 A 
1 B 

Tb 
ID Field2 
--------- 
1 C 
1 D 
2 E 

Tc 
ID Field3 
--------- 
1 F 
2 G 
2 H 

Question: how can I join these tables to return the following data:

ID Field1 Field2 Field3 
----------------------- 
1 A  C  F 
1 B  D  NULL 
2 NULL E  G 
2 NULL NULL H 

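I thought I could achieve this with outer joins, but that does not seem to be the case. As long as I get back all the information without duplicate rows, the order of the grouping does not matter.

Just to clarify: I don't mind which combinations are used, as long as the result set returns all the data in the minimum number of rows. Here is a more practical example of what I am trying to do:

Given a person, call him John. He has two phone numbers and three e-mail addresses: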

PID Email 
--------- 
John [email protected] 
John [email protected] 
John [email protected] 

PID Tel 
-------- 
John 011 
John 022 

I want to get back:

PID Email   Tel 
---------------------- 
John [email protected] 011 
John [email protected] 022 
John [email protected] NULL 
+3

Why is the last row assigned to ID = 3? – Lamak

+1

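What about '1 A D F' and '1 B C F'? Why not return them? They are clearly missing from your result, aren't they? Before you respond, stop and think; you may realize that the real problem lies with your requirements... –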

+3

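Your combinations seem to depend on the ordering of the rows in the original tables. In SQL (and SQL Server), the order of rows in a table is unspecified. Do you have a row number, an identity column, a date, or something else that determines the ordering? –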

Answers

3

You can come close with the following:

-- Number the rows within each id, then pair rows that share (id, seqnum).
select coalesce(ta.id, tb.id, tc.id) as id, ta.field1, tb.field2, tc.field3
from (select ta.*, row_number() over (partition by id order by (select NULL)) as seqnum
      from ta
     ) ta full outer join
     (select tb.*, row_number() over (partition by id order by (select NULL)) as seqnum
      from tb
     ) tb
     on ta.id = tb.id and
        ta.seqnum = tb.seqnum full outer join
     (select tc.*, row_number() over (partition by id order by (select NULL)) as seqnum
      from tc
     ) tc
     on coalesce(ta.id, tb.id) = tc.id and
        coalesce(ta.seqnum, tb.seqnum) = tc.seqnum
-- No GROUP BY is needed: each (id, seqnum) pair already yields exactly one row.
order by coalesce(ta.id, tb.id, tc.id),
         coalesce(ta.seqnum, tb.seqnum, tc.seqnum)

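As I said in my comment, though, the order of rows in a table is not guaranteed, so these may not come out in the order you expect. With your sample data, you could use: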

over (partition by id order by field<n>) 

if the fields define the ordering.
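To make the approach easy to try, here is a minimal self-contained sketch that inlines the question's sample data as CTEs and orders each `row_number()` by the field value, as suggested above. The CTE names `a`, `b`, `c` and the alias `v` are illustrative only, and the `VALUES` table constructor assumes SQL Server 2008 or later.

```sql
-- Sample data from the question, inlined so the script runs on its own.
WITH Ta AS (SELECT * FROM (VALUES (1, 'A'), (1, 'B')) AS v(ID, Field1)),
     Tb AS (SELECT * FROM (VALUES (1, 'C'), (1, 'D'), (2, 'E')) AS v(ID, Field2)),
     Tc AS (SELECT * FROM (VALUES (1, 'F'), (2, 'G'), (2, 'H')) AS v(ID, Field3)),
     -- Number the rows within each ID; ordering by the field makes the pairing repeatable.
     a AS (SELECT ID, Field1,
                  ROW_NUMBER() OVER (PARTITION BY ID ORDER BY Field1) AS seqnum
           FROM Ta),
     b AS (SELECT ID, Field2,
                  ROW_NUMBER() OVER (PARTITION BY ID ORDER BY Field2) AS seqnum
           FROM Tb),
     c AS (SELECT ID, Field3,
                  ROW_NUMBER() OVER (PARTITION BY ID ORDER BY Field3) AS seqnum
           FROM Tc)
SELECT COALESCE(a.ID, b.ID, c.ID) AS ID, a.Field1, b.Field2, c.Field3
FROM a
FULL OUTER JOIN b
  ON a.ID = b.ID AND a.seqnum = b.seqnum
-- COALESCE is needed here because either side of the first join may be NULL.
FULL OUTER JOIN c
  ON COALESCE(a.ID, b.ID) = c.ID
 AND COALESCE(a.seqnum, b.seqnum) = c.seqnum
ORDER BY COALESCE(a.ID, b.ID, c.ID),
         COALESCE(a.seqnum, b.seqnum, c.seqnum);
```

With this data the query should return the four rows requested in the question. Each ID gets as many rows as its largest per-table row count, which is the minimum possible.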

3

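Here is an alternative that uses CTEs and a UNION, with MIN to eliminate the NULLs. It does not guarantee any particular pairing order, but as you say, as long as all the IDs are present you don't mind.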

SQL Fiddle here

WITH TaRanked AS 
(
    SELECT ROW_NUMBER() OVER (PARTITION BY ID ORDER BY Field1) as Rnk, ID, Field1 
    FROM Ta 
), 
TbRanked AS 
(
    SELECT ROW_NUMBER() OVER (PARTITION BY ID ORDER BY Field2) as Rnk, ID, Field2 
    FROM Tb 
), 
TcRanked AS 
(
    SELECT ROW_NUMBER() OVER (PARTITION BY ID ORDER BY Field3) as Rnk, ID, Field3 
    FROM Tc 
), 
TUnion AS 
(
    SELECT Rnk, ID, Field1, NULL AS Field2, NULL AS Field3 
     FROM TaRanked 
    UNION ALL 
    SELECT Rnk, ID, NULL, Field2, NULL 
     FROM TbRanked 
    UNION ALL 
    SELECT Rnk, ID, NULL, NULL, Field3 
     FROM TcRanked 
) 
SELECT ID, MIN(Field1), MIN(Field2), MIN(Field3) 
    FROM TUnion 
    GROUP BY ID, Rnk 
    ORDER BY ID, Rnk 

The result is

1 A  C  F 
1 B  D  (null) 
2 (null) E  G 
2 (null) (null) H 
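The same pattern could be applied to the e-mail and phone example from the question. The sketch below is only an illustration: the table names `Emails` and `Phones` and the `example.com` addresses are placeholders I made up, since the question does not give them.

```sql
-- Hypothetical tables and placeholder addresses; only the shape matches the question.
WITH Emails AS (
    SELECT * FROM (VALUES ('John', 'john.a@example.com'),
                          ('John', 'john.b@example.com'),
                          ('John', 'john.c@example.com')) AS v(PID, Email)
),
Phones AS (
    SELECT * FROM (VALUES ('John', '011'), ('John', '022')) AS v(PID, Tel)
),
Stacked AS (
    -- One row per value, tagged with its rank inside the person.
    SELECT PID,
           ROW_NUMBER() OVER (PARTITION BY PID ORDER BY Email) AS Rnk,
           Email,
           CAST(NULL AS varchar(10)) AS Tel
    FROM Emails
    UNION ALL
    SELECT PID,
           ROW_NUMBER() OVER (PARTITION BY PID ORDER BY Tel),
           NULL,
           Tel
    FROM Phones
)
-- Each (PID, Rnk) group holds at most one non-NULL value per column,
-- so MIN simply picks that value and skips the NULLs.
SELECT PID, MIN(Email) AS Email, MIN(Tel) AS Tel
FROM Stacked
GROUP BY PID, Rnk
ORDER BY PID, Rnk;
```

This should give three rows for John, the third one with a NULL phone number. SQL Server may also print a warning that NULL values were eliminated by an aggregate, which is expected here.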
+0

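Since Gordon Linoff's answer worked for me, I have not had a chance to try this one. That said, it seems to do the same thing in a more readable way (+1). I'll let the voters take it from here! :) –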

+0

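Gordon is right: the partition is needed to guarantee the minimum number of rows. It also sets the required ordering. I have updated my answer. – StuartLC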
