2011-03-25 79 views
6

SQL Server - OUTER APPLY vs. subqueries

Consider the following two statements in SQL Server:

This one uses nested subqueries:

WITH cte AS 
(
    SELECT TOP 100 PERCENT * 
    FROM Segments 
    ORDER BY InvoiceDetailID, SegmentID 
) 
SELECT *, ReturnDate = 
       (SELECT TOP 1 cte.DepartureInfo 
        FROM cte 
        WHERE seg.InvoiceDetailID = cte.InvoiceDetailID 
         AND cte.SegmentID > seg.SegmentID), 
      DepartureCityCode = 
       (SELECT TOP 1 cte.DepartureCityCode 
        FROM cte 
        WHERE seg.InvoiceDetailID = cte.InvoiceDetailID 
         AND cte.SegmentID > seg.SegmentID) 
FROM Segments seg 

And this one uses the OUTER APPLY operator:

WITH cte AS 
(
    SELECT TOP 100 PERCENT * 
    FROM Segments 
    ORDER BY InvoiceDetailID, SegmentID 
) 
SELECT seg.*, t.DepartureInfo AS ReturnDate, t.DepartureCityCode 
FROM Segments seg OUTER APPLY (
       SELECT TOP 1 cte.DepartureInfo, cte.DepartureCityCode 
       FROM cte 
       WHERE seg.InvoiceDetailID = cte.InvoiceDetailID 
         AND cte.SegmentID > seg.SegmentID 
      ) t 

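Given that the Segments table can have millions of rows, which of these two statements is likely to perform better?

My intuition is that OUTER APPLY will perform better.

A couple more questions:

  1. I am almost sure of this, but would still like to confirm: in the first statement, the CTE will effectively be executed twice (because it is referenced twice and a CTE is expanded inline like a macro).
  2. When used inside the OUTER APPLY operator, will the CTE be executed once per row? And when used in the nested subqueries of the first statement, will it also be executed for every row?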
+8

Run them and check the query plans. – 2011-03-25 15:53:00
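One way to follow this advice is sketched below. It assumes a `Segments` table with the columns used in the question already exists, and it uses the `SET STATISTICS IO` and `SET STATISTICS TIME` session options so that logical reads and CPU time for each statement appear in the Messages output. Since SQL Server does not materialize a CTE but expands each reference, the reads reported for `Segments` (together with the actual execution plans) should indicate how often the table is really touched in each form.

```sql
-- Sketch only: compare the two statements from the question side by side.
-- Assumes the Segments table from the question exists in the current database.
SET STATISTICS IO ON;
SET STATISTICS TIME ON;

-- Statement 1: two correlated subqueries, each referencing the CTE.
WITH cte AS
(
    SELECT TOP 100 PERCENT *
    FROM Segments
    ORDER BY InvoiceDetailID, SegmentID
)
SELECT seg.*,
       ReturnDate =
           (SELECT TOP 1 cte.DepartureInfo
            FROM cte
            WHERE seg.InvoiceDetailID = cte.InvoiceDetailID
              AND cte.SegmentID > seg.SegmentID),
       NextDepartureCityCode =   -- hypothetical alias, avoids a duplicate column name
           (SELECT TOP 1 cte.DepartureCityCode
            FROM cte
            WHERE seg.InvoiceDetailID = cte.InvoiceDetailID
              AND cte.SegmentID > seg.SegmentID)
FROM Segments seg;

-- Statement 2: a single OUTER APPLY referencing the CTE once.
WITH cte AS
(
    SELECT TOP 100 PERCENT *
    FROM Segments
    ORDER BY InvoiceDetailID, SegmentID
)
SELECT seg.*, t.DepartureInfo AS ReturnDate, t.DepartureCityCode
FROM Segments seg
OUTER APPLY (
    SELECT TOP 1 cte.DepartureInfo, cte.DepartureCityCode
    FROM cte
    WHERE seg.InvoiceDetailID = cte.InvoiceDetailID
      AND cte.SegmentID > seg.SegmentID
) t;

SET STATISTICS IO OFF;
SET STATISTICS TIME OFF;
```

Comparing the "logical reads" figures for the two statements on a realistic data volume is likely to be more informative than guessing.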

+1

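“TOP 100 PERCENT ... ORDER BY” is optimized away and has no effect. I agree that the second one should perform better. You could also look at 'ROW_NUMBER' and 'PARTITION BY' to get the 'TOP 1' for each group. – 2011-03-25 16:00:44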
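Because the ordering inside the CTE is discarded, `TOP 1` in the question's statements is free to return any later segment. A minimal sketch of the OUTER APPLY form with the `ORDER BY` moved into the applied subquery follows. The CTE is dropped, the alias `nxt` and the index name are made up for illustration, and the index itself is an assumption: it only makes sense if no comparable index on `(InvoiceDetailID, SegmentID)` already exists.

```sql
-- Assumption: no suitable index exists yet. The index name is hypothetical.
-- It lets each per-row TOP (1) lookup become a short ordered seek.
CREATE INDEX IX_Segments_InvoiceDetailID_SegmentID
    ON Segments (InvoiceDetailID, SegmentID)
    INCLUDE (DepartureInfo, DepartureCityCode);

-- For each segment, fetch the next segment (lowest higher SegmentID)
-- on the same invoice detail. ORDER BY sits next to TOP, so the
-- choice of row is deterministic.
SELECT seg.*, t.DepartureInfo AS ReturnDate, t.DepartureCityCode
FROM Segments AS seg
OUTER APPLY (
    SELECT TOP (1) nxt.DepartureInfo, nxt.DepartureCityCode
    FROM Segments AS nxt
    WHERE nxt.InvoiceDetailID = seg.InvoiceDetailID
      AND nxt.SegmentID > seg.SegmentID
    ORDER BY nxt.SegmentID
) AS t;
```

Whether the optimizer actually picks a seek per outer row depends on the data and statistics, so the plan is still worth checking.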

Answers

4

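First, get rid of the Top 100 Percent in the CTE. TOP serves no purpose here, and if you want the results sorted, you should add an Order By at the end of the whole statement. Second, to address your question about performance: if forced to guess, my bet would be on the second form, simply because it has a single subquery instead of two. Third, another form you could try would be: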

With RankedSegments As 
    (
    Select S1.SegmentId, ... 
     , Row_Number() Over(Partition By S1.SegmentId Order By S2.SegmentId) As Num 
    From Segments As S1 
     Left Join Segments As S2 
      On S2.InvoiceDetailId = S1.InvoiceDetailId 
       And S2.SegmentId > S1.SegmentID 
    ) 
Select ... 
From RankedSegments 
Where Num = 1 

Another possibility:

With MinSegments As 
    (
    Select S1.SegmentId, Min(S2.SegmentId) As MinSegmentId 
    From Segments As S1 
     Join Segments As S2 
      On S2.InvoiceDetailId = S1.InvoiceDetailId 
       And S2.SegmentId > S1.SegmentID 
    Group By S1.SegmentId 
    ) 
Select ... 
From Segments As S1 
    Left Join (MinSegments As MS1 
     Join Segments As S2 
      On S2.SegmentId = MS1.MinSegmentId) 
     On MS1.SegmentId = S1.SegmentId 
+0

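@Thomas: The ORDER BY is there because the OUTER APPLY/nested query needs to run against a sorted right-hand table. As you can see, I need the TOP 1 row, and it has to come from the sorted table, which is why the TOP 100 PERCENT is there along with the ORDER BY. Hmm... I think ROW_NUMBER is also a good option, not sure how I missed it myself :( I will check and get back... – 2011-03-27 08:33:54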

+0

@Thomas: The CTE in the second query is missing a 'GROUP BY'. – 2011-03-27 11:17:51

+0

@Andriy M - D'oh. Fixed. Thanks. – Thomas 2011-03-27 16:22:57

1

Maybe I would use this variation of Thomas's query:

WITH cte AS 
(
SELECT *, Row_Number() Over(Partition By SegmentId Order By InvoiceDetailID, SegmentId) As Num 
FROM Segments) 
SELECT seg.*, t.DepartureInfo AS ReturnDate, t.DepartureCityCode 
FROM Segments seg LEFT JOIN cte t ON seg.InvoiceDetailID = t.InvoiceDetailID AND t.SegmentID > seg.SegmentID AND t.Num = 1 
+0

If SegmentId is the PK, then Num will be 1 for every row. – Thomas 2011-03-27 16:26:34

+0

Hmm.. thanks for pointing this out... – 2011-04-07 15:28:21

+0

@Thomas: Yes, SegmentId is the PK. I think any solution based on PARTITION BY or the OVER clause would not work in this case, including the one you posted. – 2011-04-07 15:38:01
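As far as I can tell, the PK concern applies to the variant that partitions the base table directly, where every partition holds exactly one row. In Thomas's first form the window function runs over the result of the self-join, so each S1.SegmentId partition holds one row per later segment of the same invoice detail. The sketch below checks this with a table variable and made-up rows; the sample data and the column types are assumptions.

```sql
-- Made-up sample data; SegmentID is the primary key, as in the question.
DECLARE @Segments TABLE (
    SegmentID         int      PRIMARY KEY,
    InvoiceDetailID   int      NOT NULL,
    DepartureInfo     datetime NOT NULL,
    DepartureCityCode char(3)  NOT NULL
);

INSERT INTO @Segments (SegmentID, InvoiceDetailID, DepartureInfo, DepartureCityCode)
VALUES (1, 10, '20110401', 'LHR'),
       (2, 10, '20110405', 'JFK'),
       (3, 10, '20110412', 'LHR'),
       (4, 20, '20110501', 'CDG');

WITH RankedSegments AS
(
    SELECT S1.SegmentID,
           S1.InvoiceDetailID,
           S2.DepartureInfo     AS ReturnDate,
           S2.DepartureCityCode AS NextDepartureCityCode,
           -- Numbers the later segments joined to each S1 row, nearest first.
           ROW_NUMBER() OVER (PARTITION BY S1.SegmentID
                              ORDER BY S2.SegmentID) AS Num
    FROM @Segments AS S1
    LEFT JOIN @Segments AS S2
           ON S2.InvoiceDetailID = S1.InvoiceDetailID
          AND S2.SegmentID > S1.SegmentID
)
SELECT SegmentID, InvoiceDetailID, ReturnDate, NextDepartureCityCode
FROM RankedSegments
WHERE Num = 1
ORDER BY SegmentID;

-- Result for the sample rows:
--   SegmentID 1 -> ReturnDate 2011-04-05, NextDepartureCityCode JFK
--   SegmentID 2 -> ReturnDate 2011-04-12, NextDepartureCityCode LHR
--   SegmentID 3 -> NULL, NULL (no later segment on invoice detail 10)
--   SegmentID 4 -> NULL, NULL (only segment on invoice detail 20)
```

Because of the LEFT JOIN, segments without a later sibling still appear once, with NULLs, which matches what OUTER APPLY returns for them. How this form performs on millions of rows is a separate question, since the join produces every (segment, later segment) pair before the numbering is applied.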