2016-01-21
-1

Hi Stack Overflow community, how can I add the most recent line item in a group alongside an aggregate?

I have a table, Sales, which looks like the following.

Customer  Revenue  State  Date
David     $100     NY     2016-01-01
David     $500     NJ     2016-01-03
Fred      $200     CA     2016-01-01
Fred      $200     CA     2016-01-02

I wrote a simple query for the revenue generated by each customer. The output comes back like this:

David  $600 
Fred  $400 
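
The query itself is not shown in the post. A minimal sketch of what it presumably looks like, assuming the table is called sales and Revenue is stored as a plain number without the $ sign. It is run here through Python's sqlite3 module against an in-memory database purely so the example is reproducible; the post does not say which database is in use.

```python
import sqlite3

# In-memory database holding the sample rows from the post (revenue as integers: an assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (customer TEXT, revenue INTEGER, state TEXT, date TEXT)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?, ?)",
    [
        ("David", 100, "NY", "2016-01-01"),
        ("David", 500, "NJ", "2016-01-03"),
        ("Fred", 200, "CA", "2016-01-01"),
        ("Fred", 200, "CA", "2016-01-02"),
    ],
)

# Plain aggregate: one row per customer with the summed revenue.
revenue_rows = conn.execute(
    """
    SELECT customer, SUM(revenue) AS total_revenue
    FROM sales
    GROUP BY customer
    ORDER BY customer
    """
).fetchall()

for row in revenue_rows:
    print(row)
```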

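What I want to do now is add the most recent purchase date to each row, along with the state associated with that most recent purchase.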

Desired result:

David  $600  2016-01-03  NJ 
Fred  $400  2016-01-02  CA 

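I want to keep the SQL code as clean as possible. I would also like to avoid JOINing to a new query, because this query may start to get complex. Any ideas on how to do this?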

+0

What if a customer has two transactions in different states on the same most recent date?

+0

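@vkp Sorry, my actual use case will have unique timestamps, which are more granular than dates. So in the case where 'Dave' buys two items at the same time, he will always be buying from the same location. – mlh351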

+1

Possible duplicate of [Select first row in each GROUP BY group?](http://stackoverflow.com/questions/3800551/select-first-row-in-each-group-by-group)

Answers

1

You can do this using row_number() (or first_value()) and conditional aggregation:

select customer, sum(revenue), max(date),
       max(case when seqnum = 1 then state end) as mostRecentState
from (select s.*,
             -- seqnum = 1 marks each customer's most recent row
             row_number() over (partition by customer order by date desc) as seqnum
      from sales s
     ) s
group by customer;
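
The first_value() variant mentioned above is not spelled out, so here is a minimal sketch of how it might look. It assumes the table is named sales as in the question and that revenue is stored as a plain number. It runs against an in-memory SQLite database through Python's sqlite3 module, which is an illustration choice only (the question does not name a database), and it needs SQLite 3.25 or newer for window functions.

```python
import sqlite3

# Sample rows from the question, loaded into an in-memory SQLite database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (customer TEXT, revenue INTEGER, state TEXT, date TEXT)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?, ?)",
    [
        ("David", 100, "NY", "2016-01-01"),
        ("David", 500, "NJ", "2016-01-03"),
        ("Fred", 200, "CA", "2016-01-01"),
        ("Fred", 200, "CA", "2016-01-02"),
    ],
)

# first_value() stamps every row with the state of that customer's latest row,
# so the outer MAX() only collapses identical values within each group.
query = """
SELECT customer,
       SUM(revenue)           AS total_revenue,
       MAX(date)              AS most_recent_date,
       MAX(most_recent_state) AS most_recent_state
FROM (SELECT s.*,
             FIRST_VALUE(state) OVER (PARTITION BY customer
                                      ORDER BY date DESC) AS most_recent_state
      FROM sales s) s
GROUP BY customer
ORDER BY customer
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

With the sample data this gives David 600 / 2016-01-03 / NJ and Fred 400 / 2016-01-02 / CA, matching the desired result. Compared with row_number(), no CASE expression is needed, at the cost of a slightly less obvious outer MAX().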
0

Hi, try this:

SELECT
    S.CUSTOMER
    ,S.TOTAL
    ,M.DT
    ,M.STATE
FROM
(
    --SUM
    SELECT
        CUSTOMER
        ,SUM(REVENUE) AS TOTAL
    FROM TB
    GROUP BY
        CUSTOMER
) S
INNER JOIN (
    --MAX DATE
    SELECT
        CUSTOMER
        ,STATE
        ,DT
    FROM TB
    WHERE (CUSTOMER, DT) IN (
        SELECT AUX.CUSTOMER, MAX(AUX.DT)
        FROM TB AUX
        GROUP BY AUX.CUSTOMER
    )
) M ON (S.CUSTOMER = M.CUSTOMER);
0
SELECT Customer 
    , (SELECT SUM(Revenue) FROM #t WHERE Customer = xx.Customer) AS TotalRevenue 
    , Dt, STATE 
FROM #t xx 
WHERE Dt = (SELECT MAX(Dt) FROM #t WHERE Customer = xx.Customer) 
ORDER BY Customer 
0

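For large datasets, I would avoid the ROW_NUMBER() function because of performance issues. Here is a sample of an approach that has performed well for me in the past: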

SELECT DISTINCT
    t.customer,
    totals.totalRevenue,
    t.state,
    t.date
FROM
    @test AS t INNER JOIN (
        SELECT
            customer,
            SUM(revenue) AS totalRevenue,
            MAX(date) AS maxDate
        FROM
            @test AS tSub
        GROUP BY
            customer
    ) AS totals ON t.customer = totals.customer AND t.date = totals.maxDate

If the combination of customer and date is unique for every record, you can remove the DISTINCT clause above.
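
To illustrate why the DISTINCT matters, here is a small sketch that runs the same join with and without it. It assumes an ordinary table named test in place of the @test table variable, adds one hypothetical extra purchase for Fred on his latest date, and uses Python's sqlite3 module only to make the example runnable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE test (customer TEXT, revenue INTEGER, state TEXT, date TEXT)")
conn.executemany(
    "INSERT INTO test VALUES (?, ?, ?, ?)",
    [
        ("David", 100, "NY", "2016-01-01"),
        ("David", 500, "NJ", "2016-01-03"),
        ("Fred", 200, "CA", "2016-01-01"),
        ("Fred", 200, "CA", "2016-01-02"),
        ("Fred", 50, "CA", "2016-01-02"),  # hypothetical second purchase on the same latest date
    ],
)

# Same shape as the join above; {distinct} is filled with "" or "DISTINCT".
JOIN_QUERY = """
SELECT {distinct}
    t.customer, totals.totalRevenue, t.state, t.date
FROM test AS t
INNER JOIN (
    SELECT customer, SUM(revenue) AS totalRevenue, MAX(date) AS maxDate
    FROM test
    GROUP BY customer
) AS totals ON t.customer = totals.customer AND t.date = totals.maxDate
ORDER BY t.customer
"""

without_distinct = conn.execute(JOIN_QUERY.format(distinct="")).fetchall()
with_distinct = conn.execute(JOIN_QUERY.format(distinct="DISTINCT")).fetchall()

print(len(without_distinct), "rows without DISTINCT")
print(len(with_distinct), "rows with DISTINCT")
```

Without DISTINCT, Fred appears once per row that shares his latest date. DISTINCT collapses those rows only because both purchases are in the same state; if the tied rows had different states, each state would still produce its own output row.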

Try running SET STATISTICS IO ON and SET STATISTICS TIME ON before executing your queries; fewer logical disk reads usually mean better performance.
