2010-06-14 36 views
4

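SQL Server: a nasty grouping problem

I've been working with SQL Server for the better part of a decade, and this grouping (or partitioning, or ranking... I'm not sure what the answer is!) problem has me stumped. It feels like it should be an easy one, too. I'll generalize my problem: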

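Let's say I have 3 employees (don't worry about them quitting or anything... there are always 3), and I keep track of how I distribute their salaries each month.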

Month Employee PercentOfTotal 
-------------------------------- 
1  Alice  25% 
1  Barbara 65% 
1  Claire 10% 

2  Alice  25% 
2  Barbara 50% 
2  Claire 25% 

3  Alice  25% 
3  Barbara 65% 
3  Claire 10% 

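As you can see, I've paid them the same percentages in months 1 and 3, but in month 2 I gave Alice the same 25%, while Barbara got 50% and Claire got 25%.

What I want to know is all the distinct distributions I've ever given. In this case there would be two: one for months 1 and 3, and one for month 2.

I'd expect the results to look something like this (NOTE: the ID, or sequencer, or whatever, doesn't matter):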

ID  Employee PercentOfTotal 
-------------------------------- 
X  Alice  25% 
X  Barbara 65% 
X  Claire 10% 

Y  Alice  25% 
Y  Barbara 50% 
Y  Claire 25% 

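Seems easy, right? I'm stumped! Does anyone have an elegant solution? I put the following together while writing this question, and it seems to work, but I'm wondering if there's a better way. Or maybe a different way that I'll learn something from.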

WITH temp_ids (Month) 
AS 
(
    SELECT DISTINCT MIN(Month) 
    FROM employees_paid 
    GROUP BY PercentOfTotal 
) 
SELECT EMP.Month, EMP.Employee, EMP.PercentOfTotal 
    FROM employees_paid EMP 
     JOIN temp_ids IDS ON EMP.Month = IDS.Month 
GROUP BY EMP.Month, EMP.Employee, EMP.PercentOfTotal 

Thanks, all! -Ricky

Answers

2

I assume the performance won't be great (because of the subquery):

SELECT * FROM employees_paid where Month not in (
    SELECT 
      a.Month 
    FROM 
      employees_paid a 
      INNER JOIN employees_paid b ON 
       (a.Employee = b.Employee AND 
       a.PercentOfTotal = b.PercentOfTotal AND 
       a.Month > b.Month) 
    GROUP BY 
      a.Month, 
      b.Month 
    HAVING 
      Count(*) = (SELECT COUNT(*) FROM employees_paid c 
       where c.Month = a.Month) 
    ) 
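
  1. The inner SELECT does a self-join to identify matching employee and percentage combinations (excluding those from the same month). The > in the JOIN ensures that only one set of matches is taken, i.e. if the Month1 entries equal the Month3 entries, we get only the Month3-Month1 combination rather than Month1-Month3, Month3-Month1 and Month3-Month3.
  2. We then GROUP BY each month-to-month combination of matched entries.
  3. The HAVING then excludes months that do not have as many matches as the month has entries (the correlated COUNT).
  4. The outer SELECT gets all entries except those returned by the inner query (the months whose full set of entries matched an earlier month).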
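
The four steps above can also be traced outside the database. The following is a minimal Python sketch of the same logic, not a replacement for the query; the variable names (`entries_per_month`, `matches`, `duplicate_months`) are made up for illustration, and percentages are written as whole numbers (25 for 25%).

```python
from collections import defaultdict

# Sample rows from the question: (month, employee, percent_of_total)
rows = [
    (1, "Alice", 25), (1, "Barbara", 65), (1, "Claire", 10),
    (2, "Alice", 25), (2, "Barbara", 50), (2, "Claire", 25),
    (3, "Alice", 25), (3, "Barbara", 65), (3, "Claire", 10),
]

# Entries per month: the correlated COUNT(*) used in the HAVING clause.
entries_per_month = defaultdict(int)
for month, _emp, _pct in rows:
    entries_per_month[month] += 1

# Steps 1 and 2: self-join on employee and percentage, keep only
# later-month vs. earlier-month pairs, and count matches per month pair.
matches = defaultdict(int)
for a_month, a_emp, a_pct in rows:
    for b_month, b_emp, b_pct in rows:
        if a_emp == b_emp and a_pct == b_pct and a_month > b_month:
            matches[(a_month, b_month)] += 1

# Step 3: a later month is a duplicate when all of its rows matched
# the rows of one earlier month.
duplicate_months = {
    a_month
    for (a_month, _b_month), count in matches.items()
    if count == entries_per_month[a_month]
}

# Step 4: keep every row except those belonging to duplicate months.
distinct_rows = [row for row in rows if row[0] not in duplicate_months]

print(sorted(duplicate_months))
print(distinct_rows)
```

With the sample data, only month 3 ends up in `duplicate_months`, because it is the only month whose three rows all match an earlier month.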
+0

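Hey, thanks. Elegant, works in the general sense, and well explained. Performance isn't that important for me, since this is a one-off data conversion script rather than production-level code. – user366729 2010-06-15 15:14:00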

4

This gives you an answer in a slightly different format from the one you asked for:

SELECT DISTINCT 
    T1.PercentOfTotal AS Alice, 
    T2.PercentOfTotal AS Barbara, 
    T3.PercentOfTotal AS Claire 
FROM employees_paid T1 
JOIN employees_paid T2 
    ON T1.Month = T2.Month AND T1.Employee = 'Alice' AND T2.Employee = 'Barbara' 
JOIN employees_paid T3 
    ON T2.Month = T3.Month AND T3.Employee = 'Claire' 

Result:

Alice Barbara Claire 
25%  50%  25% 
25%  65%  10% 

If you want, you can use UNPIVOT to turn this result set into the form you asked for:

SELECT rn AS ID, Employee, PercentOfTotal 
FROM (
    SELECT *, ROW_NUMBER() OVER (ORDER BY Alice) AS rn 
    FROM (
     SELECT DISTINCT 
      T1.PercentOfTotal AS Alice, 
      T2.PercentOfTotal AS Barbara, 
      T3.PercentOfTotal AS Claire 
     FROM employees_paid T1 
     JOIN employees_paid T2 ON T1.Month = T2.Month AND T1.Employee = 'Alice' 
                 AND T2.Employee = 'Barbara' 
     JOIN employees_paid T3 ON T2.Month = T3.Month AND T3.Employee = 'Claire' 
    ) T1 
) p UNPIVOT (PercentOfTotal FOR Employee IN (Alice, Barbara, Claire)) AS unpvt 

Result:

ID Employee PercentOfTotal 
1 Alice  25% 
1 Barbara 50%  
1 Claire 25%    
2 Alice  25%    
2 Barbara 65%    
2 Claire 10%    
+0

Thanks for the UNPIVOT suggestion. That's something I haven't used before. – user366729 2010-06-15 14:56:37

2

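If I've understood you correctly then, for a general solution, I think you would need to concatenate each whole group together, e.g. to produce Alice:0.25, Barbara:0.50, Claire:0.25, and then select the distinct groups. Something like the following would do it (rather clunkily).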

WITH EmpSalaries 
AS 
(

SELECT 1 AS Month, 'Alice' AS Employee, 0.25 AS PercentOfTotal UNION ALL 
SELECT 1 AS Month, 'Barbara' AS Employee, 0.65 UNION ALL 
SELECT 1 AS Month, 'Claire' AS Employee, 0.10 UNION ALL 

SELECT 2 AS Month, 'Alice' AS Employee, 0.25 UNION ALL 
SELECT 2 AS Month, 'Barbara' AS Employee, 0.50 UNION ALL 
SELECT 2 AS Month, 'Claire' AS Employee, 0.25 UNION ALL 

SELECT 3 AS Month, 'Alice' AS Employee, 0.25 UNION ALL 
SELECT 3 AS Month, 'Barbara' AS Employee, 0.65 UNION ALL 
SELECT 3 AS Month, 'Claire' AS Employee, 0.10 
), 
Months AS 
(
SELECT DISTINCT Month FROM EmpSalaries 
), 
MonthlySummary AS 
(
SELECT Month, 
Stuff(
      (
      Select ', ' + S1.Employee + ':' + cast(PercentOfTotal as varchar(20)) 
      From EmpSalaries As S1 
      Where S1.Month = Months.Month 
      Order By S1.Employee 
      For Xml Path('') 
      ), 1, 2, '') As Summary 
FROM Months 
) 
SELECT * FROM EmpSalaries 
WHERE Month IN (SELECT MIN(Month) 
       FROM MonthlySummary 
       GROUP BY Summary) 
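
For readers less familiar with the FOR XML PATH concatenation trick, here is a rough Python sketch of the same idea: build one summary string per month, then keep the earliest month for each distinct string. It is an illustration under assumptions rather than a translation of the query; the names `summaries` and `first_month` are invented, and percentages are formatted with two decimals.

```python
from collections import defaultdict

# Same sample data as the EmpSalaries CTE: (month, employee, percent_of_total)
rows = [
    (1, "Alice", 0.25), (1, "Barbara", 0.65), (1, "Claire", 0.10),
    (2, "Alice", 0.25), (2, "Barbara", 0.50), (2, "Claire", 0.25),
    (3, "Alice", 0.25), (3, "Barbara", 0.65), (3, "Claire", 0.10),
]

# Collect each month's (employee, percent) pairs.
by_month = defaultdict(list)
for month, emp, pct in rows:
    by_month[month].append((emp, pct))

# One summary string per month, ordered by employee name
# (the role played by MonthlySummary in the query).
summaries = {
    month: ", ".join(f"{emp}:{pct:.2f}" for emp, pct in sorted(entries))
    for month, entries in by_month.items()
}

# Earliest month for each distinct summary (MIN(Month) ... GROUP BY Summary).
first_month = {}
for month in sorted(summaries):
    first_month.setdefault(summaries[month], month)

keep = set(first_month.values())
distinct_rows = [row for row in rows if row[0] in keep]

print(summaries[1])
print(sorted(keep))
```

Because the employee name is part of the string, two months only compare equal when the same people received the same percentages.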
+0

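Right, this is similar to how my client's system currently pulls these numbers: it builds the string and then parses it. I'm moving their old data into our new system, which normalizes this and eliminates the need. I thought there might be a "simple" solution that returns table values. It looks like this isn't as common a problem as I thought! – user366729 2010-06-15 15:10:53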

3

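What you want is to treat each month's distribution as a signature, or pattern of values, which you then look for in the other months. What's not clear is whether the employee a value belongs to matters as much as the breakdown of percentages. For example, in your data, would Alice = 65%, Barbara = 25%, Claire = 10% be the same as month 3? In my example, I presumed it would not be the same. Similar to Martin Smith's solution, I build the signature by multiplying each percentage by a power of 10 determined by the row's rank within the month, and summing the results. This assumes that all percentage values are less than 1. If someone could have a percentage of 110%, for example, that would create problems for this solution.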

With Employees As 
    (
    Select 1 As Month, 'Alice' As Employee, .25 As PercentOfTotal 
    Union All Select 1, 'Barbara', .65 
    Union All Select 1, 'Claire', .10 
    Union All Select 2, 'Alice', .25 
    Union All Select 2, 'Barbara', .50 
    Union All Select 2, 'Claire', .25 
    Union All Select 3, 'Alice', .25 
    Union All Select 3, 'Barbara', .65 
    Union All Select 3, 'Claire', .10 
    ) 
    , EmployeeRanks As 
    (
    Select Month, Employee, PercentOfTotal 
     , Row_Number() Over (Partition By Month Order By Employee, PercentOfTotal) As ItemRank 
    From Employees 
    ) 
    , Signatures As 
    (
    Select Month 
     , Sum(PercentOfTotal * Cast(Power(10, ItemRank) As bigint)) As SignatureValue 
    From EmployeeRanks 
    Group By Month 
    ) 
    , DistinctSignatures As 
    (
    Select Min(Month) As MinMonth, SignatureValue 
    From Signatures 
    Group By SignatureValue 
    ) 
Select E.Month, E.Employee, E.PercentOfTotal 
From Employees As E 
    Join DistinctSignatures As D 
     On D.MinMonth = E.Month 
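
One thing that may be worth checking before relying on a numeric signature: with a base of 10 and two-digit percentages, the contributions of neighbouring ranks overlap, so two different distributions can sum to the same value. The Python sketch below is an illustration only and is not part of the answer above. It uses whole-number percentages (25 for 25%) to avoid floating-point noise, which scales every signature by 100 without changing which ones are equal; the function names `signature` and `wide_signature` are hypothetical.

```python
def signature(percents_in_rank_order, base=10):
    """Sum of percent * base**rank, with rank starting at 1 (like ItemRank)."""
    return sum(
        pct * base ** rank
        for rank, pct in enumerate(percents_in_rank_order, start=1)
    )


def wide_signature(percents_in_rank_order):
    """Same sum with base 1000, assuming every percentage is below 1000."""
    return signature(percents_in_rank_order, base=1000)


month_1 = [25, 65, 10]  # Alice, Barbara, Claire in months 1 and 3
month_2 = [25, 50, 25]  # Alice, Barbara, Claire in month 2
other = [35, 54, 11]    # a made-up distribution that also sums to 100%

# The question's own data is told apart correctly...
print(signature(month_1), signature(month_2))

# ...but the made-up distribution collides with month 1 when the base is 10.
print(signature(month_1) == signature(other))

# With a wider base the ranks no longer overlap.
print(wide_signature(month_1) == wide_signature(other))
```

For the three months in the question the base-10 signature is enough; the collision only appears for other data, so whether it matters depends on the values actually stored.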
+0

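Thanks a lot. I think this is the answer that works in the most general sense. For me, months 1 and 3 are the same. In the end, I don't need to know which month each distribution came from, just that there are 2 distinct distributions and what those distributions are. – user366729 2010-06-15 15:00:26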

2

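"I just put together this solution while writing this question, which seems to work"

I don't think it works. Here I've added two more groups (month = 4 and month = 5 respectively) which I would consider distinct, yet the result is unchanged, i.e. month = 1 and 2 only: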

WITH employees_paid (Month, Employee, PercentOfTotal) 
AS 
(
SELECT 1, 'Alice', 0.25 
UNION ALL 
SELECT 1, 'Barbara', 0.65 
UNION ALL 
SELECT 1, 'Claire', 0.1 
UNION ALL 
SELECT 2, 'Alice', 0.25 
UNION ALL 
SELECT 2, 'Barbara', 0.5 
UNION ALL 
SELECT 2, 'Claire', 0.25 
UNION ALL 
SELECT 3, 'Alice', 0.25 
UNION ALL 
SELECT 3, 'Barbara', 0.65 
UNION ALL 
SELECT 3, 'Claire', 0.1 
UNION ALL 
SELECT 4, 'Barbara', 0.25 
UNION ALL 
SELECT 4, 'Claire', 0.65 
UNION ALL 
SELECT 4, 'Alice', 0.1 
UNION ALL 
SELECT 5, 'Diana', 0.25 
UNION ALL 
SELECT 5, 'Emma', 0.65 
UNION ALL 
SELECT 5, 'Fiona', 0.1 
), 
temp_ids (Month) 
AS 
(
SELECT DISTINCT MIN(Month) 
    FROM employees_paid 
    GROUP 
    BY PercentOfTotal 
) 
SELECT EMP.Month, EMP.Employee, EMP.PercentOfTotal 
    FROM employees_paid AS EMP 
     INNER JOIN temp_ids AS IDS 
      ON EMP.Month = IDS.Month 
GROUP 
    BY EMP.Month, EMP.Employee, EMP.PercentOfTotal; 
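
To see why only months 1 and 2 come back, it can help to trace the temp_ids step by hand. The Python sketch below mimics that step (the minimum month per individual PercentOfTotal value) and contrasts it with comparing whole distributions. It is an illustration with invented variable names, not a suggested replacement query.

```python
from collections import defaultdict

# Same five months as the employees_paid CTE above.
rows = [
    (1, "Alice", 0.25), (1, "Barbara", 0.65), (1, "Claire", 0.1),
    (2, "Alice", 0.25), (2, "Barbara", 0.5), (2, "Claire", 0.25),
    (3, "Alice", 0.25), (3, "Barbara", 0.65), (3, "Claire", 0.1),
    (4, "Barbara", 0.25), (4, "Claire", 0.65), (4, "Alice", 0.1),
    (5, "Diana", 0.25), (5, "Emma", 0.65), (5, "Fiona", 0.1),
]

# What temp_ids computes: MIN(Month) for each individual percentage value.
min_month_per_pct = {}
for month, _emp, pct in rows:
    min_month_per_pct[pct] = min(month, min_month_per_pct.get(pct, month))
temp_ids = set(min_month_per_pct.values())

# What was intended: the first month for each whole distribution,
# i.e. the full set of (employee, percentage) pairs in a month.
pairs_by_month = defaultdict(set)
for month, emp, pct in rows:
    pairs_by_month[month].add((emp, pct))

first_month_per_distribution = {}
for month in sorted(pairs_by_month):
    first_month_per_distribution.setdefault(frozenset(pairs_by_month[month]), month)
distinct_months = set(first_month_per_distribution.values())

print(sorted(temp_ids))
print(sorted(distinct_months))
```

Every percentage used in months 4 and 5 (0.25, 0.65, 0.1) already appears in month 1, so grouping by the individual value never selects those months, even though their distributions differ once the employee is taken into account.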
+0

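Good point. However, in my case there is always a fixed number of employees. Every distribution will have the same 3 employees, no more and no fewer. I can take shortcuts based on that assumption, but in the general sense you are right: this wouldn't work when new employees are introduced. – user366729 2010-06-15 15:04:55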

+0

Well, if your solution works for you, then it looks to me like the best one here ;) – onedaywhen 2010-06-16 07:14:03