PostgreSQL: How to select the last balance for each account within a given date range?

I'm running PostgreSQL 9.3 and have a table that looks like this:

     entry_date      | account_id | balance
---------------------+------------+---------
 2016-02-01 00:00:00 |        123 |     100
 2016-02-01 06:00:00 |        123 |     200
 2016-02-01 12:00:00 |        123 |     300
 2016-02-01 18:00:00 |        123 |     250
 2016-02-01 00:00:00 |        456 |     400
 2016-02-01 06:00:00 |        456 |     300
 2016-02-01 12:00:00 |        456 |     200
 2016-02-01 18:00:00 |        456 |     299
 2016-02-02 00:00:00 |        123 |     250
 2016-02-02 06:00:00 |        123 |     300
 2016-02-02 12:00:00 |        123 |     400
 2016-02-02 18:00:00 |        123 |     450
 2016-02-02 00:00:00 |        456 |     299
 2016-02-02 06:00:00 |        456 |     200
 2016-02-02 12:00:00 |        456 |     100
 2016-02-02 18:00:00 |        456 |       0
(16 rows)

My goal is to retrieve the final balance of each account for each day within a specified date range. So the result I expect is:

     entry_date      | account_id | balance
---------------------+------------+---------
 2016-02-01 18:00:00 |        123 |     250
 2016-02-01 18:00:00 |        456 |     299
 2016-02-02 18:00:00 |        123 |     450
 2016-02-02 18:00:00 |        456 |       0
(4 rows)

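Note that the timestamps in my example are tidier than in reality... I can't always rely on 18:00 being the last entry of each day.

How do I write this SQL query?

I have tried variations of this: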

SELECT max(entry_date), account_id, max(balance) 
FROM ledger 
WHERE entry_date BETWEEN '2016-02-01'::timestamp AND '2016-02-02'::timestamp 
GROUP BY account_id, entry_date; 

Here is the schema:

CREATE TABLE ledger (
    entry_date timestamp(3), 
    account_id int, 
    balance  int 
); 

INSERT INTO ledger VALUES ('2016-02-01T00:00:00.000Z', 123, 100); 
INSERT INTO ledger VALUES ('2016-02-01T06:00:00.000Z', 123, 200); 
INSERT INTO ledger VALUES ('2016-02-01T12:00:00.000Z', 123, 300); 
INSERT INTO ledger VALUES ('2016-02-01T18:00:00.000Z', 123, 250); 

INSERT INTO ledger VALUES ('2016-02-01T00:00:00.000Z', 456, 400); 
INSERT INTO ledger VALUES ('2016-02-01T06:00:00.000Z', 456, 300); 
INSERT INTO ledger VALUES ('2016-02-01T12:00:00.000Z', 456, 200); 
INSERT INTO ledger VALUES ('2016-02-01T18:00:00.000Z', 456, 299); 

INSERT INTO ledger VALUES ('2016-02-02T00:00:00.000Z', 123, 250); 
INSERT INTO ledger VALUES ('2016-02-02T06:00:00.000Z', 123, 300); 
INSERT INTO ledger VALUES ('2016-02-02T12:00:00.000Z', 123, 400); 
INSERT INTO ledger VALUES ('2016-02-02T18:00:00.000Z', 123, 450); 

INSERT INTO ledger VALUES ('2016-02-02T00:00:00.000Z', 456, 299); 
INSERT INTO ledger VALUES ('2016-02-02T06:00:00.000Z', 456, 200); 
INSERT INTO ledger VALUES ('2016-02-02T12:00:00.000Z', 456, 100); 
INSERT INTO ledger VALUES ('2016-02-02T18:00:00.000Z', 456, 0); 

Here is a SQL Fiddle: http://sqlfiddle.com/#!15/56886

Thanks in advance!

Answers

You can use ROW_NUMBER with PARTITION BY:

SELECT entry_date, account_id, balance 
FROM (
    SELECT entry_date, account_id, balance, 
     ROW_NUMBER() OVER (PARTITION BY account_id, entry_date::date 
          ORDER BY entry_date DESC) AS rn 
    FROM ledger 
    -- Exclusive upper bound, so every entry on 2016-02-02 is included
    WHERE entry_date >= '2016-02-01'::timestamp 
      AND entry_date < '2016-02-03'::timestamp) AS t 
WHERE t.rn = 1;

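PARTITION BY creates one slice per account_id per day, because entry_date, cast to a date, is also used in the same clause. Each slice is ordered by entry_date in descending order, so ROW_NUMBER = 1 corresponds to the last record of that day.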
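
To see the numbering the window function assigns, the inner query can be run on its own. This is a minimal sketch against the `ledger` table from the question. The half-open date range and the final ORDER BY are my own choices for readability, not part of the answer above.

```sql
-- Show the row number assigned inside each (account_id, day) slice.
-- rn = 1 marks the latest entry of that account on that day.
SELECT entry_date,
       account_id,
       balance,
       ROW_NUMBER() OVER (PARTITION BY account_id, entry_date::date
                          ORDER BY entry_date DESC) AS rn
FROM ledger
-- Assumed range: all of 2016-02-01 and 2016-02-02 (upper bound exclusive).
WHERE entry_date >= '2016-02-01'::timestamp
  AND entry_date <  '2016-02-03'::timestamp
ORDER BY account_id, entry_date::date, rn;
```

With the sample data, each of the four slices holds four rows numbered 1 to 4, and the outer filter on rn = 1 keeps only the first row of each slice.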

In Postgres, I think the simplest method is DISTINCT ON:

SELECT DISTINCT ON (account_id) l.* 
FROM ledger l 
WHERE entry_date >= '2016-02-01'::timestamp AND entry_date < '2016-02-03'::timestamp 
ORDER BY account_id, entry_date DESC; 

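DISTINCT ON sorts the data by the ORDER BY keys. It then finds the unique values of the keys in the ON list and, for each of them, keeps the first row encountered.

EDIT:

Exactly the same idea works for one record per day. I just misunderstood the original requirement: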

SELECT DISTINCT ON (account_id, date_trunc('day', entry_date)) l.* 
FROM ledger l 
WHERE entry_date >= '2016-02-01'::timestamp AND entry_date < '2016-02-03'::timestamp 
ORDER BY account_id, date_trunc('day', entry_date), entry_date DESC; 
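
DISTINCT ON requires its expressions to lead the ORDER BY, so the rows come back grouped by account rather than by day. A sketch of one way to get the order shown in the question is to wrap the query in a subquery and sort again outside. The alias `last_per_day` and the inclusive end date written as `'2016-02-02'::date + 1` are illustrative assumptions, not part of the original answer.

```sql
-- One row per account per day: the latest entry of each day,
-- re-sorted by day and account in the outer query.
SELECT entry_date, account_id, balance
FROM (
    SELECT DISTINCT ON (account_id, entry_date::date)
           entry_date, account_id, balance
    FROM ledger
    WHERE entry_date >= '2016-02-01'::date
      -- date + integer adds days, giving an exclusive upper bound of
      -- 2016-02-03 so that every entry on 2016-02-02 is considered
      AND entry_date <  '2016-02-02'::date + 1
    ORDER BY account_id, entry_date::date, entry_date DESC
) AS last_per_day   -- hypothetical alias
ORDER BY entry_date, account_id;
```

With the sample data from the question, this should return the four expected rows, one per account for each of the two days.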
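I dig how concise this is, but unfortunately it only returns rows for the last day of the time range. I need one row per account for each day in the date range (expected four rows, got two).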