2015-04-06 23 views
1

Count rows and get the latest row by date from multiple tables

I have 2 tables, Customer and CustomerActivity, as shown in the picture below:

enter image description here

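I want an output table that:

  • has all columns from the Customer table where CustomerType = 'Existing Customer', plus 2 more columns:
  • totalActivity (count of activityID) - shows the total number of activities for each customer.
  • latestActivity (max checkinTime) - shows the date and time of the most recent activity.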

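So far I have these 2 queries, but I don't know how to combine/join and filter them to get what I need. Can anyone help with a single query? (Some explanation would be perfect.)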

SELECT customerId, firstName, birthDate, customerType 
FROM Customer 
WHERE Customer.customerType = 'Existing Customer' 

SELECT t1.activityId, t1.checkinTime, t1.customerId 
FROM CustomerActivity t1 
inner join (
    SELECT customerId, max(checkinTime) as Lastest 
    FROM CustomerActivity 
    group by customerId 
) t2 on t1.customerId = t2.customerId and t1.checkinTime = t2.Lastest 

Answers

2

You are actually close. This is what your query should look like:

SELECT 
    c.customerId, 
    c.firstName, 
    c.lastName, 
    c.birthDate, 
    c.customerType, 
    ca.totalActivity, 
    ca.latestActivity 
FROM Customer c 
INNER JOIN(
    SELECT 
     customerId, 
     latestActivity = MAX(checkinTime), 
     totalActivity = COUNT(*) 
    FROM CustomerActivity 
    GROUP BY customerId 
) ca 
    ON ca.customerId = c.customerId 
WHERE 
    c.customerType = 'Existing Customer' 

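The subquery (inside the INNER JOIN) computes, for each customer, the total number of activities with COUNT(*) and the latest activity with MAX(checkinTime). After that, you join it to the Customer table on customerId. Then you just add a WHERE clause to filter for 'Existing Customer'.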
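To see the shape of the result, here is a minimal sketch that runs the same derived-table join against an in-memory SQLite database through Python's sqlite3 module. SQLite is only a stand-in for SQL Server here, the sample rows are invented for illustration, and the aliases use standard `AS` syntax instead of the T-SQL `alias = expression` form.

```python
import sqlite3

# In-memory database with made-up sample rows (assumption, not the asker's data).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Customer (customerId INTEGER PRIMARY KEY, firstName TEXT, customerType TEXT);
CREATE TABLE CustomerActivity (activityId INTEGER PRIMARY KEY, customerId INTEGER, checkinTime TEXT);
INSERT INTO Customer VALUES
    (1, 'Ann', 'Existing Customer'),
    (2, 'Bob', 'New Customer'),
    (3, 'Cy',  'Existing Customer');
INSERT INTO CustomerActivity VALUES
    (10, 1, '2015-04-01 09:00'),
    (11, 1, '2015-04-03 18:30'),
    (12, 2, '2015-04-02 12:00');
""")

# Aggregate per customer in a derived table, then join and filter.
rows = conn.execute("""
SELECT c.customerId, c.firstName, ca.totalActivity, ca.latestActivity
FROM Customer c
INNER JOIN (
    SELECT customerId,
           COUNT(*)         AS totalActivity,
           MAX(checkinTime) AS latestActivity
    FROM CustomerActivity
    GROUP BY customerId
) ca ON ca.customerId = c.customerId
WHERE c.customerType = 'Existing Customer'
ORDER BY c.customerId
""").fetchall()

print(rows)
```

With this sample data only Ann comes back: Bob is filtered out by the WHERE clause, and Cy has no activity rows, so the INNER JOIN drops him.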

+0

It works perfectly! – 2015-04-06 02:19:27

+1

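As I noted in the comment on my answer, I believe this approach will be less efficient because of the subquery. I'd suggest looking at the execution plans for both, but I'd be willing to bet that this approach aggregates total activity and latest activity for every customer, whereas joining the tables directly computes them only for the 'Existing' customers. The only thing I'm not sure about is whether this gets optimized away, but I don't think it will. If you have 2 million customers and only 200,000 of them are existing customers, this could make a huge difference in performance. – 2015-04-06 02:31:18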

+0

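@ScottSmith, agreed. I don't know whether the optimizer is smart enough to rewrite it so that only existing customers get aggregated. I'd suggest the OP look at the execution plans. – 2015-04-06 02:36:13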

1

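I haven't tested this against an actual schema, but something like this should work (this approach will show customers even if they have no activity; if you only want customers with activity, just change the LEFT JOIN to an INNER JOIN):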

SELECT c.CustomerID 
    , c.FirstName 
    , c.BirthDate 
    , c.CustomerType 
    , COUNT(ca.ActivityID) AS TotalActivity 
    , MAX(ca.CheckinTime) AS MostRecentActivity 
FROM Customer c 
LEFT JOIN CustomerActivity ca ON c.CustomerID = ca.CustomerID 
WHERE c.CustomerType = 'Existing Customer' 
GROUP BY c.CustomerID 
    , c.FirstName 
    , c.BirthDate 
    , c.CustomerType 
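
As a rough check of the "customers with no activity still show up" behaviour, the sketch below runs the same LEFT JOIN plus GROUP BY shape in an in-memory SQLite database from Python. SQLite stands in for SQL Server, and the three customers and their activities are invented sample data.

```python
import sqlite3

# Made-up sample rows: Cy is an existing customer with no activity at all.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Customer (customerId INTEGER PRIMARY KEY, firstName TEXT, customerType TEXT);
CREATE TABLE CustomerActivity (activityId INTEGER PRIMARY KEY, customerId INTEGER, checkinTime TEXT);
INSERT INTO Customer VALUES
    (1, 'Ann', 'Existing Customer'),
    (2, 'Bob', 'New Customer'),
    (3, 'Cy',  'Existing Customer');
INSERT INTO CustomerActivity VALUES
    (10, 1, '2015-04-01 09:00'),
    (11, 1, '2015-04-03 18:30'),
    (12, 2, '2015-04-02 12:00');
""")

# COUNT(ca.activityId) counts only non-NULL ids, so unmatched customers get 0.
rows = conn.execute("""
SELECT c.customerId, c.firstName,
       COUNT(ca.activityId) AS totalActivity,
       MAX(ca.checkinTime)  AS mostRecentActivity
FROM Customer c
LEFT JOIN CustomerActivity ca ON ca.customerId = c.customerId
WHERE c.customerType = 'Existing Customer'
GROUP BY c.customerId, c.firstName
ORDER BY c.customerId
""").fetchall()

print(rows)
```

Note that counting `ca.activityId` rather than `*` matters here: `COUNT(*)` would report 1 for a customer whose only joined row is the NULL-padded one.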
+0

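I believe the other proposed solution, because of the subquery in the join, will compute both activity aggregates for every customer in the system before filtering them down to existing customers. With this approach, you only incur the computation cost for the existing-customer records. – 2015-04-06 02:18:16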

1

You can get what you want without GROUP BY, using row_number() and window functions instead:

SELECT c.*, ca.numActivities, ca.activityId as LastActivity 
FROM Customer c JOIN 
    (select ca.*, 
      count(*) over (partition by ca.CustomerId) as numActivities,
      row_number() over (partition by ca.CustomerId order by checkinTime desc) as seqnum 
     from CustomerActivity ca 
    ) ca 
    on c.customerId = ca.customerId and ca.seqnum = 1 
WHERE c.customerType = 'Existing Customer'; 

This version lets you pull whatever columns you like from the most recent activity row.
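
For a concrete look at that, here is a minimal sketch of the row_number() version run against an in-memory SQLite database from Python. It assumes an SQLite build with window-function support (3.25 or later), uses SQLite only as a stand-in for SQL Server, and relies on invented sample rows.

```python
import sqlite3

# Made-up sample rows; requires SQLite 3.25+ for window functions (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Customer (customerId INTEGER PRIMARY KEY, firstName TEXT, customerType TEXT);
CREATE TABLE CustomerActivity (activityId INTEGER PRIMARY KEY, customerId INTEGER, checkinTime TEXT);
INSERT INTO Customer VALUES
    (1, 'Ann', 'Existing Customer'),
    (2, 'Bob', 'New Customer');
INSERT INTO CustomerActivity VALUES
    (10, 1, '2015-04-01 09:00'),
    (11, 1, '2015-04-03 18:30'),
    (12, 2, '2015-04-02 12:00');
""")

# seqnum = 1 marks each customer's newest activity row; keep only that row.
rows = conn.execute("""
SELECT c.customerId, c.firstName, ca.numActivities,
       ca.activityId AS lastActivityId, ca.checkinTime
FROM Customer c
JOIN (
    SELECT a.*,
           COUNT(*)     OVER (PARTITION BY a.customerId) AS numActivities,
           ROW_NUMBER() OVER (PARTITION BY a.customerId
                              ORDER BY a.checkinTime DESC) AS seqnum
    FROM CustomerActivity a
) ca ON ca.customerId = c.customerId AND ca.seqnum = 1
WHERE c.customerType = 'Existing Customer'
""").fetchall()

print(rows)
```

Because the joined row is a real CustomerActivity row, both its activityId and its checkinTime are available, which a plain MAX(checkinTime) aggregate would not give you.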

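EDIT:

From your original question, I thought you wanted the latest activity row. If you just want the latest datetime, then plain aggregation works: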

SELECT c.*, ca.numActivities, ca.lastActivityDateTime 
FROM Customer c JOIN 
    (select ca.customerId, 
      count(*) as numActivities, 
      max(checkinTime) as lastActivityDateTime 
     from CustomerActivity ca 
     group by ca.customerId 
    ) ca 
    on c.customerId = ca.customerId 
WHERE c.customerType = 'Existing Customer'; 
+0

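Thanks for both versions, really helpful! Yes, I want the latest datetime, but could you explain the 'over (partition by ca.CustomerId)' part in a bit more detail? What does it do? – 2015-04-06 02:18:50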

+0

These are window functions. A good place to start is the documentation: https://msdn.microsoft.com/en-us/library/ms189461.aspx. – 2015-04-06 02:24:06
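
As a small illustration of what `over (partition by ...)` does, the sketch below compares a window query with a GROUP BY query on the same rows. It runs in an in-memory SQLite database from Python (assuming SQLite 3.25 or later, as a stand-in for SQL Server), and the activity rows are invented.

```python
import sqlite3

# Made-up activity rows: customer 1 has two check-ins, customer 2 has one.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE CustomerActivity (activityId INTEGER PRIMARY KEY, customerId INTEGER, checkinTime TEXT);
INSERT INTO CustomerActivity VALUES
    (10, 1, '2015-04-01 09:00'),
    (11, 1, '2015-04-03 18:30'),
    (12, 2, '2015-04-02 12:00');
""")

# Window version: every input row is kept, and each row is annotated with
# values computed over its partition (all rows sharing the same customerId).
windowed = conn.execute("""
SELECT activityId, customerId,
       COUNT(*)     OVER (PARTITION BY customerId) AS numActivities,
       ROW_NUMBER() OVER (PARTITION BY customerId
                          ORDER BY checkinTime DESC) AS seqnum
FROM CustomerActivity
ORDER BY customerId, seqnum
""").fetchall()

# GROUP BY version: each customer's rows collapse into a single output row.
grouped = conn.execute("""
SELECT customerId, COUNT(*) AS numActivities
FROM CustomerActivity
GROUP BY customerId
ORDER BY customerId
""").fetchall()

print(windowed)
print(grouped)
```

The window query returns three rows (one per activity) while the grouped query returns two (one per customer); the per-customer count is the same in both, but only the window version still has the individual activity rows to pick from.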

0
Select c.customerId, c.firstName, c.lastName, c.birthDate, c.customerType, gca.latestCheckIn, gca.checkinCount 
from customer as c, 
    (select ca.customerId, max(ca.checkInTime) as latestCheckIn, count(*) as checkinCount 
    from customerActivity as ca 
    group by ca.customerId) as gca 
where gca.customerId = c.customerId AND c.customerType = 'Existing Customer' 

If you clarify what should happen for customers with no activity, the query can be changed to use a left join.
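
One possible shape for that left-join variant is sketched below, run in an in-memory SQLite database from Python. This is an assumption about what the asker might want, not something confirmed in the thread: customers with no activity get a count of 0 (via COALESCE) and a NULL latest check-in. The sample rows are invented, and SQLite is a stand-in for SQL Server.

```python
import sqlite3

# Made-up sample rows: Cy is an existing customer with no activity.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Customer (customerId INTEGER PRIMARY KEY, firstName TEXT, customerType TEXT);
CREATE TABLE CustomerActivity (activityId INTEGER PRIMARY KEY, customerId INTEGER, checkinTime TEXT);
INSERT INTO Customer VALUES
    (1, 'Ann', 'Existing Customer'),
    (3, 'Cy',  'Existing Customer');
INSERT INTO CustomerActivity VALUES
    (10, 1, '2015-04-01 09:00'),
    (11, 1, '2015-04-03 18:30');
""")

# LEFT JOIN keeps customers without a matching aggregate row;
# COALESCE turns their NULL count into 0.
rows = conn.execute("""
SELECT c.customerId, c.firstName,
       gca.latestCheckIn,
       COALESCE(gca.checkinCount, 0) AS checkinCount
FROM Customer AS c
LEFT JOIN (
    SELECT customerId,
           MAX(checkinTime) AS latestCheckIn,
           COUNT(*)         AS checkinCount
    FROM CustomerActivity
    GROUP BY customerId
) AS gca ON gca.customerId = c.customerId
WHERE c.customerType = 'Existing Customer'
ORDER BY c.customerId
""").fetchall()

print(rows)
```

The filter stays on the Customer side; putting a condition on a gca column in the WHERE clause would discard the NULL-padded rows and quietly turn the left join back into an inner join.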