SQL consecutive days

I found many Stack Overflow Q&As about consecutive days.
Still, the answers are too short, and I don't understand what is going on in them.

To be concrete, I'll set up a model (or table).
(I'm using PostgreSQL, if that makes a difference.)

CREATE TABLE work (
    id serial NOT NULL,  -- serial, so the INSERTs below can omit id
    user_id integer NOT NULL,
    arrived_at timestamp with time zone NOT NULL
);

INSERT INTO work (user_id, arrived_at) VALUES (1, '2011-01-03');
INSERT INTO work (user_id, arrived_at) VALUES (1, '2011-01-04');

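(Simplest form) For a given user, I want to find the last consecutive date range.
(My ultimate goal) For a given user, I want to find his consecutive working days.
If he came to work yesterday, he still (as of today) has a chance to extend the streak, so I show him the consecutive days up to yesterday.
But if he missed yesterday, his consecutive days are either 0 or 1, depending on whether he came today.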

Say today is the 8th.

3 * 5 6 7 * = 3 days (5 to 7) 
3 * 5 6 7 8 = 4 days (5 to 8) 
3 4 5 * 7 * = 1 day (7 to 7) 
3 * * * * * = 0 day 
3 * * * * 8 = 1 day (8 to 8)
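The rule behind these five cases can be written down as a small reference function, which is handy for checking any SQL solution against. This is only a sketch in Python rather than SQL; the function name `current_streak`, the helper `feb`, and the choice of February 2014 as the month are assumptions made for illustration.

```python
from datetime import date, timedelta

ONE_DAY = timedelta(days=1)

def current_streak(days_worked, today):
    """Return (length, first_day, last_day) of the streak still alive as of `today`."""
    worked = set(days_worked)
    # The streak ends today if the user came today, otherwise yesterday.
    if today in worked:
        last = today
    elif today - ONE_DAY in worked:
        last = today - ONE_DAY
    else:
        return (0, None, None)  # missed both today and yesterday
    # Walk backwards while the previous day was also worked.
    first = last
    while first - ONE_DAY in worked:
        first -= ONE_DAY
    return ((last - first).days + 1, first, last)

def feb(*days):
    """Hypothetical helper: day numbers -> dates in an arbitrary month."""
    return [date(2014, 2, d) for d in days]

today = date(2014, 2, 8)
for worked in (feb(3, 5, 6, 7), feb(3, 5, 6, 7, 8), feb(3, 4, 5, 7), feb(3), feb(3, 8)):
    print(current_streak(worked, today)[0])  # prints 3, 4, 1, 0, 1
```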

2014-03-03 eugene

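Interesting question... could you please add the table schema? –

Schema and sample data (as 'CREATE TABLE' and 'INSERT's) and expected results, please. –

Please add real DDL + sample data. Please don't use shorthand notation. – joop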

Here is how I solved this problem using a CTE:

WITH RECURSIVE CTE(attendanceDate) 
AS 
(
    SELECT * FROM 
    (
     SELECT attendanceDate FROM attendance WHERE attendanceDate = current_date 
     OR attendanceDate = current_date - INTERVAL '1 day' 
     ORDER BY attendanceDate DESC 
     LIMIT 1 
    ) tab 
    UNION ALL 

    SELECT a.attendanceDate FROM attendance a 
    INNER JOIN CTE c 
    ON a.attendanceDate = c.attendanceDate - INTERVAL '1 day' 
) 
SELECT COUNT(*) FROM CTE;

Check the code on SQL Fiddle.

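Here is how the query works:

It selects today's record from the attendance table. If today's record is not available, it selects yesterday's record instead.
Then it recursively keeps adding the record for the day before the earliest date found so far.

If you want the latest consecutive date range regardless of when the user's most recent attendance was (today, yesterday, or x days ago), then the initialization part of the CTE must be replaced by the following snippet: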

SELECT MAX(attendanceDate) FROM attendance
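To make the "anchor, then walk back one day at a time" logic of the recursive CTE easier to follow, here is a rough Python model of it. It is a sketch, not a translation of the query planner's behaviour: the function name `cte_count` and the flag `anchor_on_latest` are made up for illustration, it assumes at most one row per day, and it treats an empty table as a count of 0.

```python
from datetime import date, timedelta

ONE_DAY = timedelta(days=1)

def cte_count(attendance_dates, today, anchor_on_latest=False):
    """Model of the recursive CTE: pick an anchor row, then keep adding the previous day."""
    present = set(attendance_dates)
    if not present:
        return 0  # simplification: an empty table counts as 0 here
    if anchor_on_latest:
        # Variant: SELECT MAX(attendanceDate) FROM attendance
        anchor = max(present)
    else:
        # Non-recursive term: today's row, else yesterday's (ORDER BY ... DESC LIMIT 1)
        candidates = [d for d in (today, today - ONE_DAY) if d in present]
        if not candidates:
            return 0
        anchor = candidates[0]
    rows = [anchor]
    # Recursive term: join on attendanceDate = previous row's date - 1 day
    while rows[-1] - ONE_DAY in present:
        rows.append(rows[-1] - ONE_DAY)
    return len(rows)  # SELECT COUNT(*) FROM CTE

dates = [date(2014, 2, d) for d in (3, 5, 6, 7)]
print(cte_count(dates, date(2014, 2, 8)))                          # 3
print(cte_count(dates, date(2014, 2, 10)))                         # 0
print(cte_count(dates, date(2014, 2, 10), anchor_on_latest=True))  # 3
```

The only difference between the two variants is where the walk starts; the backward walk itself is identical.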

[Edit] Here is an SQL Fiddle that solves your problem #1: SQL Fiddle

2014-03-03 09:01:44

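Could you give me the original fiddle that seemed to solve my problem #1 (without the today/yesterday consideration), so that I can first understand the basics of your query? – eugene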

http://www.sqlfiddle.com/#!15/7016f/1 –

If a user can attend more than once per day, see the edit. –

-- some data 
CREATE table dayworked (
     id SERIAL NOT NULL PRIMARY KEY 
     , user_id INTEGER NOT NULL 
     , arrived_at DATE NOT NULL 
     , UNIQUE (user_id, arrived_at) 
     ); 

INSERT INTO dayworked(user_id, arrived_at) VALUES 
(1, '2014-02-03') 
,(1, '2014-02-05') 
,(1, '2014-02-06') 
,(1, '2014-02-07') 
     -- 
,(2, '2014-02-03') 
,(2, '2014-02-05') 
,(2, '2014-02-06') 
,(2, '2014-02-07') 
,(2, '2014-02-08') 
     -- 
,(3, '2014-02-03') 
,(3, '2014-02-04') 
,(3, '2014-02-05') 
,(3, '2014-02-07') 
     -- 
,(5, '2014-02-08') 
     ; 

-- The query 
WITH RECURSIVE stretch AS (
     SELECT dw.user_id AS user_id 
       , dw.arrived_at AS first_day 
       , dw.arrived_at AS last_day 
       , 1::INTEGER AS nday 
     FROM dayworked dw 
     WHERE NOT EXISTS (-- Find start of chain: no previous day 
       SELECT * FROM dayworked nx 
       WHERE nx.user_id = dw.user_id 
       AND nx.arrived_at = dw.arrived_at - 1
       ) 
     UNION ALL 
     SELECT dw.user_id AS user_id 
       , st.first_day AS first_day 
       , dw.arrived_at AS last_day 
       , 1+st.nday AS nday 
     FROM dayworked dw -- connect to chain: previous day := day before this day 
     JOIN stretch st ON st.user_id = dw.user_id AND st.last_day = dw.arrived_at -1 
     ) 
SELECT * FROM stretch st 
WHERE (st.nday > 1 OR st.first_day = NOW()::date) -- either more than one consecutive day, or starting today
AND NOT EXISTS (-- Only the most recent stretch 
     SELECT * FROM stretch nx 
     WHERE nx.user_id = st.user_id
     AND nx.first_day > st.first_day 
     ) 
AND NOT EXISTS (-- omit partial chains 
     SELECT * FROM stretch nx 
     WHERE nx.user_id = st.user_id
     AND nx.first_day = st.first_day 
     AND nx.last_day > st.last_day 
     ) 
     ;

Result:

CREATE TABLE 
INSERT 0 14 
user_id | first_day | last_day | nday 
---------+------------+------------+------ 
     1 | 2014-02-05 | 2014-02-07 | 3 
     2 | 2014-02-05 | 2014-02-08 | 4 
(2 rows)
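As a cross-check of the result above, the same "split into stretches, keep the most recent one per user" idea can be modelled outside the database. The following is only a Python sketch under assumptions: the function names are invented, and `today` is set to 2014-03-03 (the date of this answer) because the actual run date is not stated.

```python
from datetime import date, timedelta

def stretches(days):
    """Split dates into maximal runs of consecutive days: [(first_day, last_day, nday), ...]."""
    runs = []
    for d in sorted(set(days)):
        if runs and d - runs[-1][1] == timedelta(days=1):
            first, _, n = runs[-1]
            runs[-1] = (first, d, n + 1)   # extend the current run
        else:
            runs.append((d, d, 1))         # start of a chain: no previous day
    return runs

def latest_stretch_per_user(rows, today):
    """Keep each user's most recent stretch if it is longer than 1 day or starts today."""
    by_user = {}
    for user_id, day in rows:
        by_user.setdefault(user_id, []).append(day)
    result = []
    for user_id in sorted(by_user):
        first, last, n = stretches(by_user[user_id])[-1]
        if n > 1 or first == today:
            result.append((user_id, first, last, n))
    return result

# Same sample data as the INSERT above (all dates in February 2014).
sample = {1: (3, 5, 6, 7), 2: (3, 5, 6, 7, 8), 3: (3, 4, 5, 7), 5: (8,)}
rows = [(u, date(2014, 2, d)) for u, ds in sample.items() for d in ds]

for row in latest_stretch_per_user(rows, today=date(2014, 3, 3)):
    print(row)  # users 1 (3 days) and 2 (4 days), as in the result table
```

Users 3 and 5 drop out because their most recent stretch is a single day that does not start "today".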

2014-03-03 09:15:17 joop

You can create an aggregate based on range types:

Create function sfunc (tstzrange, timestamptz) 
    returns tstzrange 
    language sql strict as $$ 
     select case when $2 - upper($1) <= '1 day'::interval 
       then tstzrange(lower($1), $2, '[]') 
       else tstzrange($2, $2, '[]') end 
    $$; 

Create aggregate consecutive (timestamptz) (
     sfunc = sfunc, 
     stype = tstzrange, 
     initcond = '[,]' 
);

Use the aggregate with the correct ordering to get the consecutive-day range that ends at the last arrived_at:

Select user_id, consecutive(arrived_at order by arrived_at) 
    from work 
    group by user_id; 

    ┌─────────┬─────────────────────────────────────────────────────┐
    │ user_id │                     consecutive                     │
    ├─────────┼─────────────────────────────────────────────────────┤
    │       1 │ ["2011-01-03 00:00:00+02","2011-01-05 00:00:00+02"] │
    │       2 │ ["2011-01-06 00:00:00+02","2011-01-06 00:00:00+02"] │
    └─────────┴─────────────────────────────────────────────────────┘
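The aggregate is essentially a fold: the state is the current range, and each new timestamp either extends it or restarts it. A minimal Python model of that state transition might look like the sketch below. It uses naive `datetime` values and a `(lower, upper)` tuple in place of `tstzrange`, so time zones are ignored, and `None` stands in for the unbounded initial state; these are simplifications, not part of the SQL above.

```python
from datetime import datetime, timedelta

def sfunc(state, arrived_at):
    """State transition: extend the range if the gap is at most 1 day, else start over."""
    if state is not None and arrived_at - state[1] <= timedelta(days=1):
        return (state[0], arrived_at)
    return (arrived_at, arrived_at)

def consecutive(timestamps):
    """Fold sfunc over the timestamps in ascending order (the ORDER BY in the aggregate call)."""
    state = None
    for ts in sorted(timestamps):
        state = sfunc(state, ts)
    return state

user_1 = [datetime(2011, 1, 3), datetime(2011, 1, 4), datetime(2011, 1, 5)]
user_2 = [datetime(2011, 1, 6)]
print(consecutive(user_1))  # range from 2011-01-03 to 2011-01-05
print(consecutive(user_2))  # range from 2011-01-06 to 2011-01-06
```

Note that the comparison is on elapsed time, not calendar days: arrivals on adjacent days that are more than 24 hours apart would start a new range, so truncating the timestamps to dates first may be worth considering.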

Use the aggregate in a window function:

Select *, 
     consecutive(arrived_at) 
       over (partition by user_id order by arrived_at) 
    from work; 

    ┌────┬─────────┬────────────────────────┬─────────────────────────────────────────────────────┐
    │ id │ user_id │       arrived_at       │                     consecutive                     │
    ├────┼─────────┼────────────────────────┼─────────────────────────────────────────────────────┤
    │  1 │       1 │ 2011-01-03 00:00:00+02 │ ["2011-01-03 00:00:00+02","2011-01-03 00:00:00+02"] │
    │  2 │       1 │ 2011-01-04 00:00:00+02 │ ["2011-01-03 00:00:00+02","2011-01-04 00:00:00+02"] │
    │  3 │       1 │ 2011-01-05 00:00:00+02 │ ["2011-01-03 00:00:00+02","2011-01-05 00:00:00+02"] │
    │  4 │       2 │ 2011-01-06 00:00:00+02 │ ["2011-01-06 00:00:00+02","2011-01-06 00:00:00+02"] │
    └────┴─────────┴────────────────────────┴─────────────────────────────────────────────────────┘

Query the results to find what you need:

With work_detail as (select *, 
      consecutive(arrived_at) 
        over (partition by user_id order by arrived_at) 
     from work) 
    select arrived_at, upper(consecutive) - lower(consecutive) as days 
     from work_detail 
      where user_id = 1 and upper(consecutive) != lower(consecutive) 
      order by arrived_at desc 
       limit 1; 

    ┌────────────────────────┬────────┐
    │       arrived_at       │  days  │
    ├────────────────────────┼────────┤
    │ 2011-01-05 00:00:00+02 │ 2 days │
    └────────────────────────┴────────┘

2014-03-03 10:03:00

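You can even do this without a recursive CTE:
with generate_series(), LEFT JOIN, the window function count() and a final LIMIT 1:

1 for "today", plus the number of consecutive days up until "yesterday":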

SELECT count(*) -- 1/0 for "today" 
    + COALESCE((-- + optional count of consecutive days up until "yesterday" 
     SELECT ct 
     FROM (
      SELECT d.ct, count(w.arrived_at) OVER (ORDER BY d.ct) AS day_ct 
      FROM generate_series(1, 8) AS d(ct) -- maximum = 8 
      LEFT JOIN work w ON w.arrived_at >= current_date - d.ct 
          AND w.arrived_at < current_date - (d.ct - 1) 
          AND w.user_id = 1 -- given user 
     ) sub 
     WHERE ct = day_ct 
     ORDER BY ct DESC 
     LIMIT 1 
     ), 0) AS total 
FROM work 
WHERE arrived_at >= current_date -- today's row (assumes there are no future timestamps)
AND user_id = 1;                 -- given user

This assumes 0 or 1 entries per day. It should be fast.
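The trick in the query is that a running count of worked days equals the number of days looked back only as long as no day has been missed. A Python sketch of that idea, with each step mapped to the SQL in a comment, could look as follows; the function name `total_days` and the parameter `max_lookback` are illustrative assumptions.

```python
from datetime import date, timedelta

def total_days(days_worked, today, max_lookback=8):
    """Model of the query: 1/0 for today plus the unbroken run ending yesterday."""
    worked = set(days_worked)              # assumes 0 or 1 entries per day
    total = 1 if today in worked else 0    # outer count(*): 1/0 for "today"
    running = 0                            # count(w.arrived_at) OVER (ORDER BY d.ct)
    best = 0                               # COALESCE(..., 0)
    for ct in range(1, max_lookback + 1):  # generate_series(1, 8)
        if today - timedelta(days=ct) in worked:
            running += 1
        if running == ct:                  # WHERE ct = day_ct: no gap so far
            best = ct                      # ORDER BY ct DESC LIMIT 1 keeps the largest
    return total + best

today = date(2014, 2, 8)
cases = [(3, 5, 6, 7), (3, 5, 6, 7, 8), (3, 4, 5, 7), (3,), (3, 8)]
for days in cases:
    print(total_days([date(2014, 2, d) for d in days], today))  # prints 3, 4, 1, 0, 1
```

One consequence of the fixed series is that a streak longer than the lookback window is capped at that window (8 days before today in the query as written).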

For best performance (with this or the CTE solution alike), you would need a multicolumn index like:

CREATE INDEX foo_idx ON work (user_id,arrived_at);

2014-03-03 11:27:26

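Would this be faster than the CTE solution? – eugene

@eugene: Probably yes. Consider the simplified update. Can you run 'EXPLAIN ANALYZE' on your data? –

I don't have a large enough data set yet. And it took quite some time to adapt the answers to my actual schema. :( – eugene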
