2016-06-30

PostgreSQL 9.5.3: querying for customers who made purchases on a common but non-contiguous set of dates

I have a table ("activity") of the following form:

customerID | date       | purchaseID
------------------------------------
1          | 2016-01-01 | 1
2          | 2016-01-01 | 2
3          | 2016-01-01 | 3
2          | 2016-01-02 | 4
1          | 2016-01-03 | 5
2          | 2016-01-03 | 6
3          | 2016-01-03 | 7
1          | 2016-01-04 | 8
2          | 2016-01-04 | 9
3          | 2016-01-05 | 10

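From this table, I want to find all customers who made purchases on the same dates as customerID 1. A customer's purchase history must overlap completely with that of customerID 1, but it does not have to be limited to it: additional purchases outside those dates are fine, but they should not be returned in the final result.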

For the data above, the result should be:

customerID | date       | purchaseID
------------------------------------
2          | 2016-01-01 | 2
2          | 2016-01-03 | 6
2          | 2016-01-04 | 9

Currently, I solve this with a loop in application code and then discard all NULL results, so the actual SQL is:

SELECT customerID, 
     date, 
     purchaseID 
FROM activity 
WHERE customerID <> 1 
    AND date = %date% 

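where %date% is an iteration variable over all the dates on which customerID 1 made a purchase. This is not an elegant solution, and it is very slow for large numbers of purchases (millions) or customers (tens of thousands). Any suggestions are welcome.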

Thanks for reading.

Answers

0

One method is to use a self-join and aggregation:

select a.customerid
from activity a join
    activity a1
    on a1.date = a.date and a1.customerid = 1
where a1.customerid <> a.customerid
group by a.customerID
having count(distinct a1.date) = (select count(distinct date) from activity where customerID = 1)

If you want the original records, you can use:

select a.*
from activity a
where a.customerId in (select a.customerid
                       from activity a join
                            activity a1
                            on a1.date = a.date and a1.customerid = 1
                       where a1.customerid <> a.customerid
                       group by a.customerID
                       having count(distinct a1.date) = (select count(distinct date) from activity where customerID = 1)
                      );
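
Note that `select a.*` as written returns every purchase made by the matching customers, including purchases on dates when customer 1 bought nothing. If only the overlapping rows are wanted, one possible variation (a sketch, not part of the original answer; the aliases `b` and `b1` are just illustrative) adds a date filter to the outer query:

```sql
-- Sketch: keep only the matching customers' purchases that fall on
-- one of customer 1's purchase dates.
select a.*
from activity a
where a.customerid <> 1
  and a.date in (select date from activity where customerid = 1)
  and a.customerid in (select b.customerid
                       from activity b
                       join activity b1
                         on b1.date = b.date and b1.customerid = 1
                       where b.customerid <> 1
                       group by b.customerid
                       having count(distinct b1.date) =
                              (select count(distinct date) from activity where customerid = 1))
order by a.customerid, a.date;
```

On the sample data this should return customer 2's purchases 2, 6 and 9 and leave out purchase 4 (2016-01-02).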
+0

Thanks for the reply. This does return the correct customer IDs, but because of the GROUP BY I'm not sure how to return the purchaseID and date. Any suggestions?

+0

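I spent some time testing it and can verify that it works, but I don't understand how the HAVING clause operates: does it match on the count alone, or does it also take the values into account? Any clarification is appreciated.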

+0

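@H.Corbett . . . For each customer, the `having` clause compares the number of that customer's dates that match customer 1's dates with the total number of dates for customer 1.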
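
To make that comparison visible, a small diagnostic sketch (the column aliases `matched_dates` and `customer1_dates` are made up for illustration) can list both numbers side by side. The values are matched by the join condition `a1.date = a.date`; the `having` clause then only compares the two counts:

```sql
-- Sketch: show, per customer, how many of customer 1's dates they share
-- and how many distinct dates customer 1 has in total.
select a.customerid,
       count(distinct a1.date) as matched_dates,
       (select count(distinct date)
        from activity
        where customerid = 1) as customer1_dates
from activity a
join activity a1
  on a1.date = a.date and a1.customerid = 1
where a.customerid <> 1
group by a.customerid
order by a.customerid;
```

On the sample data, customer 2 should show 3 matched dates out of 3, and customer 3 should show 2 out of 3 (2016-01-04 is missing), so only customer 2 passes the `having` filter.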

0

You can use the "contains" (@>) array operator:

with activity (customerID, date, purchaseID) AS (
    values (1, '2016-01-01'::date, 1), (2, '2016-01-01', 2), (3, '2016-01-01', 3), 
      (2, '2016-01-02', 4), (1, '2016-01-03', 5), (2, '2016-01-03', 6), 
      (3, '2016-01-03', 7), (1, '2016-01-04', 8), (2, '2016-01-04', 9), 
      (3, '2016-01-05', 10)) 
select customerID 
from activity 
group by customerID 
having customerID <> 1 AND 
     array_agg(date) @> array(select date from activity where customerID = 1)
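
As a quick illustration of what `@>` checks, here is a sketch using literal arrays built by hand from the sample data (the column aliases are made up for the example). The left array "contains" the right one when every element on the right also appears on the left:

```sql
-- Sketch: compare each customer's dates against customer 1's dates
-- (2016-01-01, 2016-01-03, 2016-01-04) with the array "contains" operator.
select array['2016-01-01', '2016-01-02', '2016-01-03', '2016-01-04']::date[]
         @> array['2016-01-01', '2016-01-03', '2016-01-04']::date[] as customer2_contains,
       array['2016-01-01', '2016-01-03', '2016-01-05']::date[]
         @> array['2016-01-01', '2016-01-03', '2016-01-04']::date[] as customer3_contains;
```

This should give true for customer 2 and false for customer 3, which is why the query above returns only customerID 2.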