2009-08-08 78 views
1

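SQL: Select the maximum value for each unique key?

Sorry, I'm not sure how to phrase this, and I'm not very good with SQL. The database engine is SQL Server Compact. I currently have this query: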

SELECT * 
FROM Samples 
WHERE FunctionId NOT IN 
(SELECT CalleeId FROM Callers) 
ORDER BY ThreadId, HitCount DESC 

This gives me:

ThreadId  Function  HitCount
       1       164      6945
       1      3817         1
       4      1328      7053

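Now, I only want the row with the largest HitCount for each unique value of ThreadId. In other words, the second row should be dropped. I'm not sure how to pull this off.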

[Edit] If it helps, here is another form of the same query:

SELECT * 
FROM Samples s1 
LEFT OUTER JOIN Callers c1 
    ON s1.ThreadId = c1.ThreadId AND s1.FunctionId = c1.CalleeId 
WHERE c1.ThreadId IS NULL 
ORDER BY ThreadId 

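[Edit] I ended up making a schema change to avoid doing this, because the suggested queries looked rather expensive. Thanks for all the help.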

+0

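Is it possible for two ThreadIds to share the same FunctionId, with one (ThreadId, FunctionId) pair present in Callers but not the other? I ask because the two queries above do not say the same thing. – 2009-08-08 04:54:47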
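To make the difference raised in this comment concrete, here is a minimal sketch. It uses Python's built-in sqlite3 module as a stand-in for SQL Server Compact, and the two sample rows and the single Callers row are hypothetical data invented for the illustration; only the two queries come from the question.

```python
import sqlite3

# Hypothetical data: threads 1 and 2 both sample FunctionId 10,
# but only thread 1 records it as a callee.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Samples (ThreadId INTEGER, FunctionId INTEGER, HitCount INTEGER);
CREATE TABLE Callers (ThreadId INTEGER, CalleeId INTEGER);
INSERT INTO Samples VALUES (1, 10, 5), (2, 10, 7);
INSERT INTO Callers VALUES (1, 10);
""")

# First form: a function is excluded if it is a callee in ANY thread.
not_in_rows = conn.execute("""
SELECT ThreadId, FunctionId, HitCount
FROM Samples
WHERE FunctionId NOT IN (SELECT CalleeId FROM Callers)
ORDER BY ThreadId
""").fetchall()

# Second form: a function is excluded only in the thread where it is a callee.
left_join_rows = conn.execute("""
SELECT s1.ThreadId, s1.FunctionId, s1.HitCount
FROM Samples s1
LEFT OUTER JOIN Callers c1
    ON s1.ThreadId = c1.ThreadId AND s1.FunctionId = c1.CalleeId
WHERE c1.ThreadId IS NULL
ORDER BY s1.ThreadId
""").fetchall()

print(not_in_rows)     # []
print(left_join_rows)  # [(2, 10, 7)]
```

With this data the NOT IN form drops both rows, while the LEFT OUTER JOIN form keeps the row for thread 2, so the two queries only agree when a FunctionId never appears under more than one ThreadId.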

Answers

2

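Does SQL Server Compact support window functions?

Alternative 1 - Will include all rows that tie. Will not include any row for a thread whose rows all have a null HitCount: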

SELECT ThreadId, FunctionId, HitCount 
FROM (SELECT ThreadId, FunctionId, HitCount, 
             MAX(HitCount) OVER (PARTITION BY ThreadId) AS MaxHitCount 
      FROM Samples 
      WHERE FunctionId NOT IN 
            (SELECT CalleeId FROM Callers)) t 
WHERE HitCount = MaxHitCount 
ORDER BY ThreadId, HitCount DESC 

Alternative 2 - Will include all rows that tie. If a given thread has no rows with a non-null HitCount, will return all rows for that thread:

SELECT ThreadId, FunctionId, HitCount 
FROM (SELECT ThreadId, FunctionId, HitCount, 
             RANK() OVER (PARTITION BY ThreadId ORDER BY HitCount DESC) AS R 
      FROM Samples 
      WHERE FunctionId NOT IN 
            (SELECT CalleeId FROM Callers)) t 
WHERE R = 1 
ORDER BY ThreadId, HitCount DESC 

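Alternative 3 - Will non-deterministically pick one row in case of a tie and discard the others. Will include a row even if all rows in a thread have a null HitCount: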

SELECT ThreadId, FunctionId, HitCount 
FROM (SELECT ThreadId, FunctionId, HitCount, 
             ROW_NUMBER() OVER (PARTITION BY ThreadId ORDER BY HitCount DESC) AS R 
      FROM Samples 
      WHERE FunctionId NOT IN 
            (SELECT CalleeId FROM Callers)) t 
WHERE R = 1 
ORDER BY ThreadId, HitCount DESC 
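The tie and null behaviour described for alternatives 1 to 3 can be checked with a small sketch. It runs the three window expressions through Python's sqlite3 module (SQLite 3.25 or later is needed for window functions), which is only a stand-in for the real engine. The sample rows, the column names ThreadId and FunctionId, and the helper `run` are assumptions made for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Samples (ThreadId INTEGER, FunctionId INTEGER, HitCount INTEGER);
CREATE TABLE Callers (ThreadId INTEGER, CalleeId INTEGER);
INSERT INTO Samples VALUES
    (1, 164, 6945), (1, 3817, 6945),  -- two rows tied for the thread maximum
    (1, 42, 9000),                    -- a callee, filtered out before ranking
    (4, 1328, 7053),
    (7, 500, NULL), (7, 501, NULL);   -- thread with only NULL HitCount
INSERT INTO Callers VALUES (1, 42);
""")

TEMPLATE = """
SELECT ThreadId, FunctionId, HitCount
FROM (SELECT ThreadId, FunctionId, HitCount, {expr} AS w
      FROM Samples
      WHERE FunctionId NOT IN (SELECT CalleeId FROM Callers)) t
WHERE {pred}
ORDER BY ThreadId, FunctionId
"""

def run(expr, pred):
    """Run one alternative: a window expression plus the filter applied to it."""
    return conn.execute(TEMPLATE.format(expr=expr, pred=pred)).fetchall()

alt1 = run("MAX(HitCount) OVER (PARTITION BY ThreadId)", "HitCount = w")
alt2 = run("RANK() OVER (PARTITION BY ThreadId ORDER BY HitCount DESC)", "w = 1")
alt3 = run("ROW_NUMBER() OVER (PARTITION BY ThreadId ORDER BY HitCount DESC)", "w = 1")

print(alt1)  # [(1, 164, 6945), (1, 3817, 6945), (4, 1328, 7053)]
print(alt2)  # the rows of alt1 plus (7, 500, None) and (7, 501, None)
print([row[0] for row in alt3])  # [1, 4, 7]: exactly one row per thread
```

Which of the two tied rows ROW_NUMBER keeps for thread 1 is not specified, so the sketch only looks at the thread ids for that alternative.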

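Alternatives 4 & 5 - Use older constructs, in case window functions are not available, and say what is meant a little more cleanly than using joins. Benchmark if speed is a priority. Both return all rows that participate in a tie. Alternative 4 will return rows with a null HitCount when no non-null values are available for HitCount. Alternative 5 will not return rows with a null HitCount.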

WITH maxHits AS (
    -- Highest hitcount per thread among samples that are not callees
    SELECT s.threadid, 
           MAX(s.hitcount) AS maxhits 
    FROM SAMPLES s 
    LEFT JOIN CALLERS c ON c.threadid = s.threadid AND c.calleeid = s.functionid 
    WHERE c.threadid IS NULL 
    GROUP BY s.threadid 
) 
SELECT t.* 
FROM SAMPLES t 
LEFT JOIN CALLERS c ON c.threadid = t.threadid AND c.calleeid = t.functionid 
JOIN maxHits mh ON mh.threadid = t.threadid AND mh.maxhits = t.hitcount 
WHERE c.threadid IS NULL 

Works on any database:

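SELECT * 
FROM Samples s1 
WHERE FunctionId NOT IN 
    (SELECT CalleeId FROM Callers) 
AND NOT EXISTS 
    -- compare only against other non-callee rows of the same thread
    (SELECT * 
    FROM Samples s2 
    WHERE s1.ThreadId = s2.ThreadId 
    AND s2.FunctionId NOT IN 
        (SELECT CalleeId FROM Callers) 
    AND s1.HitCount < s2.HitCount) 
ORDER BY ThreadId, HitCount DESC 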

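SELECT * 
FROM Samples s1 
WHERE FunctionId NOT IN 
    (SELECT CalleeId FROM Callers) 
AND HitCount = 
    -- maximum over the non-callee rows of the same thread
    (SELECT MAX(HitCount) 
    FROM Samples s2 
    WHERE s1.ThreadId = s2.ThreadId 
    AND s2.FunctionId NOT IN 
        (SELECT CalleeId FROM Callers)) 
ORDER BY ThreadId, HitCount DESC 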
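The null handling that separates alternatives 4 and 5 can be seen in a minimal sketch. It runs through Python's sqlite3 module as a stand-in engine. The sample rows are hypothetical, and the subqueries are assumed to correlate on ThreadId and to skip callee rows, which is what "maximum per thread" requires.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Samples (ThreadId INTEGER, FunctionId INTEGER, HitCount INTEGER);
CREATE TABLE Callers (ThreadId INTEGER, CalleeId INTEGER);
INSERT INTO Samples VALUES
    (1, 164, 6945), (1, 3817, 1), (4, 1328, 7053),
    (1, 42, 9000),   -- a callee, must be ignored
    (7, 500, NULL);  -- thread with only a NULL HitCount
INSERT INTO Callers VALUES (1, 42);
""")

# Alternative 4: keep a row when no other non-callee row of the thread beats it.
not_exists_rows = conn.execute("""
SELECT s1.ThreadId, s1.FunctionId, s1.HitCount
FROM Samples s1
WHERE s1.FunctionId NOT IN (SELECT CalleeId FROM Callers)
AND NOT EXISTS
    (SELECT *
     FROM Samples s2
     WHERE s2.ThreadId = s1.ThreadId
     AND s2.FunctionId NOT IN (SELECT CalleeId FROM Callers)
     AND s1.HitCount < s2.HitCount)
ORDER BY s1.ThreadId, s1.HitCount DESC
""").fetchall()

# Alternative 5: keep a row when its HitCount equals the thread maximum.
max_rows = conn.execute("""
SELECT s1.ThreadId, s1.FunctionId, s1.HitCount
FROM Samples s1
WHERE s1.FunctionId NOT IN (SELECT CalleeId FROM Callers)
AND s1.HitCount =
    (SELECT MAX(s2.HitCount)
     FROM Samples s2
     WHERE s2.ThreadId = s1.ThreadId
     AND s2.FunctionId NOT IN (SELECT CalleeId FROM Callers))
ORDER BY s1.ThreadId, s1.HitCount DESC
""").fetchall()

print(not_exists_rows)  # [(1, 164, 6945), (4, 1328, 7053), (7, 500, None)]
print(max_rows)         # [(1, 164, 6945), (4, 1328, 7053)]
```

The NOT EXISTS form keeps the NULL row because `NULL < x` is never true, so no "better" row is ever found for it; the same reasoning means it keeps any NULL-HitCount row, even in a thread that also has non-null values.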
2

Here's how I would do it:

SELECT s1.* 
FROM Samples s1 
LEFT JOIN Samples s2 
    ON (s1.ThreadId = s2.ThreadId AND s1.HitCount < s2.HitCount) 
WHERE s1.FunctionId NOT IN (SELECT CalleeId FROM Callers) 
    AND s2.ThreadId IS NULL 
ORDER BY s1.ThreadId, s1.HitCount DESC 

In other words, return each row s1 for which no other row s2 exists with the same ThreadId and a larger HitCount.

+0

There's a nasty subtlety here - the nested subclause (SELECT CalleeId FROM Callers) also has to be applied to the other half of the join. I reworked the initial query a bit: SELECT s1.* FROM Samples s1 LEFT OUTER JOIN Callers c1 ON s1.ThreadId = c1.ThreadId AND s1.FunctionId = c1.CalleeId WHERE c1.ThreadId IS NULL – Promit 2009-08-08 01:02:24
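The subtlety in this comment shows up as soon as the highest HitCount in a thread belongs to a callee. The sketch below uses Python's sqlite3 module as a stand-in engine and hypothetical sample rows. The "fixed" query is one possible way to apply the callee filter to s2, not necessarily the form the commenter ended up using.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Samples (ThreadId INTEGER, FunctionId INTEGER, HitCount INTEGER);
CREATE TABLE Callers (ThreadId INTEGER, CalleeId INTEGER);
INSERT INTO Samples VALUES
    (1, 42, 9000),   -- callee with the highest HitCount in thread 1
    (1, 164, 6945), (1, 3817, 1),
    (4, 1328, 7053);
INSERT INTO Callers VALUES (1, 42);
""")

# Self-join as posted: s2 may be a callee, so it can "beat" every
# non-callee row of its thread and make the whole thread disappear.
naive_rows = conn.execute("""
SELECT s1.ThreadId, s1.FunctionId, s1.HitCount
FROM Samples s1
LEFT JOIN Samples s2
    ON s1.ThreadId = s2.ThreadId AND s1.HitCount < s2.HitCount
WHERE s1.FunctionId NOT IN (SELECT CalleeId FROM Callers)
    AND s2.ThreadId IS NULL
ORDER BY s1.ThreadId
""").fetchall()

# Same join, but s2 is also restricted to non-callee rows.
fixed_rows = conn.execute("""
SELECT s1.ThreadId, s1.FunctionId, s1.HitCount
FROM Samples s1
LEFT JOIN Samples s2
    ON s1.ThreadId = s2.ThreadId AND s1.HitCount < s2.HitCount
    AND s2.FunctionId NOT IN (SELECT CalleeId FROM Callers)
WHERE s1.FunctionId NOT IN (SELECT CalleeId FROM Callers)
    AND s2.ThreadId IS NULL
ORDER BY s1.ThreadId
""").fetchall()

print(naive_rows)  # [(4, 1328, 7053)]
print(fixed_rows)  # [(1, 164, 6945), (4, 1328, 7053)]
```

In the first query thread 1 vanishes from the result because its only unbeaten row is the callee, which the WHERE clause then removes.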

1

Will work with SQL Server 2005+:

SELECT t.* 
FROM SAMPLES t 
LEFT JOIN CALLERS c ON c.threadid = t.threadid AND c.calleeid = t.functionid 
JOIN (-- Highest hitcount per thread among samples that are not callees
      SELECT s.threadid, 
             MAX(s.hitcount) AS maxhits 
      FROM SAMPLES s 
      LEFT JOIN CALLERS c ON c.threadid = s.threadid AND c.calleeid = s.functionid 
      WHERE c.threadid IS NULL 
      GROUP BY s.threadid) mh ON mh.threadid = t.threadid AND mh.maxhits = t.hitcount 
WHERE c.threadid IS NULL 
+0

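What if (SELECT s.ThreadId, MAX(s.HitCount) FROM Samples s GROUP BY s.ThreadId) returns a maximum that belongs only to a row whose FunctionId is in Callers.CalleeId? That row ends up being dropped, and the Max(HitCount) of the rows whose FunctionId is not in Callers.CalleeId is excluded from the result. – 2009-08-08 01:49:57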

+0

Good point. Easily corrected by adding a JOIN to CALLERS in the CTE or inline view - see the update. – 2009-08-08 01:58:21

+0

@rexem. Noticed another problem: nothing in the OP leads me to believe that Callers has a ThreadId column. – 2009-08-08 02:43:14