2013-05-08 40 views
2

Latest value per customer and month

I have a table that tracks changes in customer distribution. Here's a simplified version:

CREATE TABLE HISTORY (
    CUSTOMER_ID NUMBER(9,0), 
    DATE_CHANGED DATE, 
    ACCOUNT_TYPE VARCHAR2(20), 

    CONSTRAINT HISTORY_PK PRIMARY KEY (CUSTOMER_ID, DATE_CHANGED) 
); 

INSERT INTO HISTORY (CUSTOMER_ID, DATE_CHANGED, ACCOUNT_TYPE) VALUES (200, TO_DATE('05/01/2013 00:00:00','DD/MM/RRRR HH24:MI:SS'), 'Premium'); 
INSERT INTO HISTORY (CUSTOMER_ID, DATE_CHANGED, ACCOUNT_TYPE) VALUES (300, TO_DATE('17/02/2013 00:00:00','DD/MM/RRRR HH24:MI:SS'), 'Free'); 
INSERT INTO HISTORY (CUSTOMER_ID, DATE_CHANGED, ACCOUNT_TYPE) VALUES (100, TO_DATE('05/03/2013 00:00:00','DD/MM/RRRR HH24:MI:SS'), 'Free'); 
INSERT INTO HISTORY (CUSTOMER_ID, DATE_CHANGED, ACCOUNT_TYPE) VALUES (100, TO_DATE('12/03/2013 00:00:00','DD/MM/RRRR HH24:MI:SS'), 'Standard'); 
INSERT INTO HISTORY (CUSTOMER_ID, DATE_CHANGED, ACCOUNT_TYPE) VALUES (200, TO_DATE('22/03/2013 00:00:00','DD/MM/RRRR HH24:MI:SS'), 'Standard'); 
INSERT INTO HISTORY (CUSTOMER_ID, DATE_CHANGED, ACCOUNT_TYPE) VALUES (100, TO_DATE('29/03/2013 00:00:00','DD/MM/RRRR HH24:MI:SS'), 'Premium'); 

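The data is maintained by a third party. My ultimate goal is to get the total number of customers per account type and month within a given time range, but for now I want to start with something simpler: showing the latest account type for each month/customer combination in which a change was recorded: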

YEAR MONTH CUSTOMER_ID ACCOUNT_TYPE
==== ===== =========== ============
2013     1         200 Premium
2013     2         300 Free
2013     3         100 Premium
2013     3         200 Standard

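Here, customer 100 made three changes in March; we show 'Premium' because it has the latest date in March.

A query that fetches all rows would look like this: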

SELECT EXTRACT(YEAR FROM DATE_CHANGED) AS YEAR, 
EXTRACT(MONTH FROM DATE_CHANGED) AS MONTH, 
CUSTOMER_ID, ACCOUNT_TYPE 
FROM HISTORY 
ORDER BY YEAR, MONTH, CUSTOMER_ID, DATE_CHANGED 

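Is it possible to filter out the unwanted rows using aggregate functions? Does it make more sense to use analytic functions?

(And, in either case, which function would be appropriate?)

Edit: I've been asked for an example of unwanted rows. March has three rows for customer 100:

'05/03/2013 00:00:00', 'Free'
'12/03/2013 00:00:00', 'Standard'
'29/03/2013 00:00:00', 'Premium'

The unwanted rows are 'Free' and 'Standard', because they are not the latest ones for that month.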

+0

Can you give an example of "unwanted rows"? – Lokesh 2013-05-08 09:07:26

+0

@loki - OK, see my edit. – 2013-05-08 09:14:29

Answers

2
SELECT YEAR 
     ,MONTH 
     ,customer_id 
     ,max(ACCOUNT_TYPE) keep(dense_rank FIRST ORDER BY date_changed DESC) LAST_ACC 
FROM (
    SELECT EXTRACT(YEAR FROM DATE_CHANGED) AS YEAR, 
     EXTRACT(MONTH FROM DATE_CHANGED) AS MONTH, 
     CUSTOMER_ID, 
     date_changed, 
     account_type 
    FROM HISTORY 
) 
GROUP BY YEAR, MONTH, customer_id 
ORDER BY YEAR, MONTH, CUSTOMER_ID 


| YEAR | MONTH | CUSTOMER_ID | LAST_ACC |
-----------------------------------------
| 2013 |     1 |         200 | Premium  |
| 2013 |     2 |         300 | Free     |
| 2013 |     3 |         100 | Premium  |
| 2013 |     3 |         200 | Standard |

http://sqlfiddle.com/#!4/e493a/15
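
The question mentions an end goal of counting customers per account type and month. A rough sketch of how the query above could feed that count is shown below. It rests on an assumption the thread does not confirm: a customer is counted only in months where a change was recorded, with no carry-forward of the previous account type into months without changes. The `CUSTOMERS` alias is made up for illustration.

```sql
-- Sketch only: counts customers per month and latest account type.
-- Assumption: months without a recorded change are not filled in.
SELECT year,
       month,
       account_type,
       COUNT(*) AS customers
FROM (
    -- Latest account type per customer and month, as in the answer above
    SELECT EXTRACT(YEAR FROM date_changed)  AS year,
           EXTRACT(MONTH FROM date_changed) AS month,
           customer_id,
           MAX(account_type)
               KEEP (DENSE_RANK FIRST ORDER BY date_changed DESC) AS account_type
    FROM history
    GROUP BY EXTRACT(YEAR FROM date_changed),
             EXTRACT(MONTH FROM date_changed),
             customer_id
)
GROUP BY year, month, account_type
ORDER BY year, month, account_type;
```

If customers should also be counted in months where nothing changed, a calendar of months would have to be joined in first; that is beyond this sketch.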

+0

Looks good. If I understand your code, 'max(ACCOUNT_TYPE)' gets the 'ACCOUNT_TYPE' with the greatest 'DATE_CHANGED' (not the greatest 'ACCOUNT_TYPE'). Am I right? – 2013-05-08 09:18:05

+0

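Yes, it takes the maximum only over the rows with the greatest date_changed in the current year/month/customer group; the remaining rows are ignored. – 2013-05-08 09:21:51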

+0

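And since there is only one maximum date per group (per the PK), i.e. only one row survives in each group, we could also use 'MIN()' and get the same result. Thanks, I had these ideas but couldn't put them together. – 2013-05-08 09:28:27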
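
To illustrate the point made in these comments, here is a minimal sketch that computes the value both ways side by side. The column aliases `VIA_MAX` and `VIA_MIN` are made up for illustration. Because the primary key guarantees a single row with the latest `DATE_CHANGED` in each group, the outer aggregate only ever sees one value, so both columns should agree.

```sql
-- Sketch: MAX ... KEEP FIRST (descending) vs MIN ... KEEP LAST (ascending).
-- Both pick the ACCOUNT_TYPE of the row with the latest DATE_CHANGED per group.
SELECT EXTRACT(YEAR FROM date_changed)  AS year,
       EXTRACT(MONTH FROM date_changed) AS month,
       customer_id,
       MAX(account_type)
           KEEP (DENSE_RANK FIRST ORDER BY date_changed DESC) AS via_max,
       MIN(account_type)
           KEEP (DENSE_RANK LAST ORDER BY date_changed)       AS via_min
FROM history
GROUP BY EXTRACT(YEAR FROM date_changed),
         EXTRACT(MONTH FROM date_changed),
         customer_id
ORDER BY year, month, customer_id;
```

The choice between MAX and MIN would only matter if several rows could tie on the latest date, which the primary key on (CUSTOMER_ID, DATE_CHANGED) rules out here.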

1
SELECT YEAR, MONTH, CUSTOMER_ID, ACCOUNT_TYPE 
FROM 
(
SELECT EXTRACT(YEAR FROM DATE_CHANGED) AS YEAR, 
     EXTRACT(MONTH FROM DATE_CHANGED) AS MONTH, 
     CUSTOMER_ID, 
     ACCOUNT_TYPE, 
     ROW_NUMBER() OVER (PARTITION BY CUSTOMER_ID, 
             EXTRACT(YEAR FROM DATE_CHANGED), 
             EXTRACT(MONTH FROM DATE_CHANGED) 
          ORDER BY EXTRACT(YEAR FROM DATE_CHANGED) DESC, 
            EXTRACT(MONTH FROM DATE_CHANGED) DESC, 
            DATE_CHANGED DESC) RN 
FROM HISTORY 
) 
WHERE RN = 1 
ORDER BY YEAR, MONTH, CUSTOMER_ID 

OUTPUT

╔══════╦═══════╦═════════════╦══════════════╗
║ YEAR ║ MONTH ║ CUSTOMER_ID ║ ACCOUNT_TYPE ║
╠══════╬═══════╬═════════════╬══════════════╣
║ 2013 ║     1 ║         200 ║ Premium      ║
║ 2013 ║     2 ║         300 ║ Free         ║
║ 2013 ║     3 ║         100 ║ Premium      ║
║ 2013 ║     3 ║         200 ║ Standard     ║
╚══════╩═══════╩═════════════╩══════════════╝
+0

If I understand the code, 'PARTITION BY' defines a subset of rows in which 'DATE_CHANGED' is unique (per the PK). So 'ORDER BY DATE_CHANGED DESC' alone in the 'OVER' clause would be enough, wouldn't it? – 2013-05-08 10:16:39
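
A sketch of the simplification this comment suggests follows. Within a partition the year and month are constant, so ordering by them adds nothing and `DATE_CHANGED DESC` alone decides the ranking. Partitioning by `TRUNC(date_changed, 'MM')` instead of two `EXTRACT` expressions is my own shorthand, not something from the answer.

```sql
-- Sketch: same ROW_NUMBER approach with a minimal OVER clause.
SELECT year, month, customer_id, account_type
FROM (
    SELECT EXTRACT(YEAR FROM date_changed)  AS year,
           EXTRACT(MONTH FROM date_changed) AS month,
           customer_id,
           account_type,
           ROW_NUMBER() OVER (
               -- TRUNC(..., 'MM') maps every date to the first day of its month
               PARTITION BY customer_id, TRUNC(date_changed, 'MM')
               ORDER BY date_changed DESC
           ) AS rn
    FROM history
)
WHERE rn = 1   -- keep only the latest change per customer and month
ORDER BY year, month, customer_id;
```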

2
SELECT DISTINCT
    CUSTOMER_ID,
    EXTRACT(YEAR FROM DATE_CHANGED) AS YEAR,
    EXTRACT(MONTH FROM DATE_CHANGED) AS MONTH,
    LAST_VALUE(ACCOUNT_TYPE) OVER (
        PARTITION BY CUSTOMER_ID, TO_CHAR(DATE_CHANGED,'YYYY-MM')
        ORDER BY DATE_CHANGED
        ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
    ) AS ACCOUNT_TYPE
FROM HISTORY



CUSTOMER_ID YEAR MONTH ACCOUNT_TYPE
        200 2013     1 Premium
        300 2013     2 Free
        100 2013     3 Premium
        200 2013     3 Standard

http://www.sqlfiddle.com/#!4/fab60/13

+0

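'LAST_VALUE()' has exactly the semantics I'm after, "returns the last value in an ordered set of values", and I like it. It's a pity it doesn't remove rows by itself :) – 2013-05-08 10:22:49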

+0

@ÁlvaroG.Vicario Yes, it can't remove rows by itself, but perhaps this function will be useful in other situations. :) – Gentlezerg 2013-05-08 12:49:40
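
For context on why the rows are not removed: an analytic function returns one result per input row, so customer 100 produces three identical rows for March, and `DISTINCT` is what collapses them. A possible variant, sketched here as an assumption rather than as part of the answer, uses `FIRST_VALUE` with a descending sort. The first row of the partition is always inside the default window, so no explicit `ROWS BETWEEN` clause is needed.

```sql
-- Sketch: FIRST_VALUE over a descending order instead of LAST_VALUE
-- with an explicit full-partition window. DISTINCT is still required.
SELECT DISTINCT
       customer_id,
       EXTRACT(YEAR FROM date_changed)  AS year,
       EXTRACT(MONTH FROM date_changed) AS month,
       FIRST_VALUE(account_type) OVER (
           PARTITION BY customer_id, TRUNC(date_changed, 'MM')
           ORDER BY date_changed DESC
       ) AS account_type
FROM history
ORDER BY year, month, customer_id;
```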