2014-03-03
7

I have found many Stack Overflow Q&As about finding consecutive days in SQL.
The answers are still too short for me to understand what is going on.

To be concrete, I'll define a model (or table).
(I'm using PostgreSQL, if that makes a difference.)

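CREATE TABLE work (
    id serial PRIMARY KEY, -- generated automatically; the inserts below do not supply it
    user_id integer NOT NULL,
    arrived_at timestamp with time zone NOT NULL
);

-- ISO 8601 date literals (January 3 and 4, 2011) are unambiguous regardless of DateStyle
insert into work(user_id, arrived_at) values(1, '2011-01-03');
insert into work(user_id, arrived_at) values(1, '2011-01-04');

  1. (Simplest form) For a given user, I want to find the last consecutive date range.

  2. (My ultimate goal) For a given user, I want to find his consecutive working days.
    If he came to work yesterday, he still (as of today) has a chance to extend his streak, so I show him his consecutive days up to yesterday.
    But if he missed yesterday, his consecutive days are either 0 or 1, depending on whether he came today.

Say today is the 8th.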

3 * 5 6 7 * = 3 days (5 to 7) 
3 * 5 6 7 8 = 4 days (5 to 8) 
3 4 5 * 7 * = 1 day (7 to 7) 
3 * * * * * = 0 day 
3 * * * * 8 = 1 day (8 to 8) 
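
The five rows above can be checked against a small model of the rule outside the database. This is only a sketch in Python rather than SQL: `current_streak` and the helper `d` are hypothetical names, March 2014 is an arbitrary month chosen so that "the 8th" is a real date, and attendance is assumed to be available as plain dates.

```python
from datetime import date, timedelta

def current_streak(days, today):
    """Count consecutive attended days ending today or, failing that, yesterday."""
    attended = set(days)
    one_day = timedelta(days=1)
    # The streak is still alive if the user came today or yesterday.
    end = today if today in attended else today - one_day
    count = 0
    while end in attended:
        count += 1
        end -= one_day
    return count

def d(n):
    # Hypothetical helper: day n of an arbitrary month.
    return date(2014, 3, n)

today = d(8)
rows = [
    [d(3), d(5), d(6), d(7)],        # 3 * 5 6 7 *
    [d(3), d(5), d(6), d(7), d(8)],  # 3 * 5 6 7 8
    [d(3), d(4), d(5), d(7)],        # 3 4 5 * 7 *
    [d(3)],                          # 3 * * * * *
    [d(3), d(8)],                    # 3 * * * * 8
]
for row in rows:
    print(current_streak(row, today))
```

Each printed value should match the day count on the corresponding row of the table.
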
+1

Interesting question... Could you please add the table schema? –

+2

Schema and sample data (as 'CREATE TABLE' and 'INSERT's) and expected results, please. –

+0

Please add real DDL + sample data. Please don't use shorthand notation. – joop

Answers

2

Here is my solution to this problem using a CTE:

WITH RECURSIVE CTE(attendanceDate) 
AS 
(
    SELECT * FROM 
    (
     SELECT attendanceDate FROM attendance WHERE attendanceDate = current_date 
     OR attendanceDate = current_date - INTERVAL '1 day' 
     ORDER BY attendanceDate DESC 
     LIMIT 1 
    ) tab 
    UNION ALL 

    SELECT a.attendanceDate FROM attendance a 
    INNER JOIN CTE c 
    ON a.attendanceDate = c.attendanceDate - INTERVAL '1 day' 
) 
SELECT COUNT(*) FROM CTE; 

Check the code on SQL Fiddle.

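Here is how the query works:

  1. It selects today's record from the attendance table. If today's record is not available, it selects yesterday's record instead.
  2. It then keeps recursively adding the record for the day before the earliest date found so far.

If you want to select the latest consecutive date range regardless of when the user's latest attendance was (today, yesterday or x days ago), the initialization part of the CTE must be replaced by the following snippet: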

SELECT MAX(attendanceDate) FROM attendance 
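
To make the effect of that anchor concrete, here is a minimal Python model of the same idea (not the SQL itself): start from the latest attendance date and keep stepping back one day while a record exists. The function name `last_consecutive_range` and the sample dates are illustrative assumptions.

```python
from datetime import date, timedelta

def last_consecutive_range(days):
    """Return (first_day, last_day) of the most recent run of consecutive dates."""
    attended = set(days)
    if not attended:
        return None
    # Anchor: plays the role of SELECT MAX(attendanceDate).
    last_day = max(attended)
    first_day = last_day
    # Recursive step: extend the run while the previous day is also present.
    while first_day - timedelta(days=1) in attended:
        first_day -= timedelta(days=1)
    return first_day, last_day

# Illustrative data: one isolated day followed by a three-day run.
days = [date(2014, 2, 3), date(2014, 2, 5), date(2014, 2, 6), date(2014, 2, 7)]
first_day, last_day = last_consecutive_range(days)
print(first_day, last_day, (last_day - first_day).days + 1)
```

The number of days in the run is the difference between the two ends plus one, which corresponds to the COUNT(*) over the CTE rows.
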

[EDIT] Here is a SQL Fiddle with a query that solves your problem #1: SQL Fiddle

+0

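Could you give me the original fiddle that seemed to solve my problem #1 (without the today/yesterday consideration), so that I can first understand the basics of your query? – eugene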

+0

http://www.sqlfiddle.com/#!15/7016f/1 –

+0

See the edit for the case where a user can have more than one attendance record per day. –

0
-- some data 
CREATE table dayworked (
     id SERIAL NOT NULL PRIMARY KEY 
     , user_id INTEGER NOT NULL 
     , arrived_at DATE NOT NULL 
     , UNIQUE (user_id, arrived_at) 
     ); 

INSERT INTO dayworked(user_id, arrived_at) VALUES 
(1, '2014-02-03') 
,(1, '2014-02-05') 
,(1, '2014-02-06') 
,(1, '2014-02-07') 
     -- 
,(2, '2014-02-03') 
,(2, '2014-02-05') 
,(2, '2014-02-06') 
,(2, '2014-02-07') 
,(2, '2014-02-08') 
     -- 
,(3, '2014-02-03') 
,(3, '2014-02-04') 
,(3, '2014-02-05') 
,(3, '2014-02-07') 
     -- 
,(5, '2014-02-08') 
     ; 

-- The query 
WITH RECURSIVE stretch AS (
     SELECT dw.user_id AS user_id 
       , dw.arrived_at AS first_day 
       , dw.arrived_at AS last_day 
       , 1::INTEGER AS nday 
     FROM dayworked dw 
     WHERE NOT EXISTS (-- Find start of chain: no previous day 
       SELECT * FROM dayworked nx 
       WHERE nx.user_id = dw.user_id 
       AND nx.arrived_at = dw.arrived_at -1 
       ) 
     UNION ALL 
     SELECT dw.user_id AS user_id 
       , st.first_day AS first_day 
       , dw.arrived_at AS last_day 
       , 1+st.nday AS nday 
     FROM dayworked dw -- connect to chain: previous day := day before this day 
     JOIN stretch st ON st.user_id = dw.user_id AND st.last_day = dw.arrived_at -1 
     ) 
SELECT * FROM stretch st 
WHERE (st.nday > 1 OR st.first_day = NOW()::date) -- either more than one consecutive day or starting today 
AND NOT EXISTS (-- Only the most recent stretch 
     SELECT * FROM stretch nx 
     WHERE nx.user_id = st.user_id 
     AND nx.first_day > st.first_day 
     ) 
AND NOT EXISTS (-- omit partial chains 
     SELECT * FROM stretch nx 
     WHERE nx.user_id = st.user_id 
     AND nx.first_day = st.first_day 
     AND nx.last_day > st.last_day 
     ) 
     ; 

Result:

CREATE TABLE 
INSERT 0 14 
user_id | first_day | last_day | nday 
---------+------------+------------+------ 
     1 | 2014-02-05 | 2014-02-07 | 3 
     2 | 2014-02-05 | 2014-02-08 | 4 
(2 rows) 
0

You can create an aggregate based on a range type:

Create function sfunc (tstzrange, timestamptz) 
    returns tstzrange 
    language sql strict as $$ 
     select case when $2 - upper($1) <= '1 day'::interval 
       then tstzrange(lower($1), $2, '[]') 
       else tstzrange($2, $2, '[]') end 
    $$; 

Create aggregate consecutive (timestamptz) (
     sfunc = sfunc, 
     stype = tstzrange, 
     initcond = '[,]' 
); 

Use the aggregate with the appropriate ordering to get the consecutive-day range ending at the last arrived_at:

Select user_id, consecutive(arrived_at order by arrived_at) 
    from work 
    group by user_id; 

    ┌─────────┬─────────────────────────────────────────────────────┐
    │ user_id │                     consecutive                     │
    ├─────────┼─────────────────────────────────────────────────────┤
    │       1 │ ["2011-01-03 00:00:00+02","2011-01-05 00:00:00+02"] │
    │       2 │ ["2011-01-06 00:00:00+02","2011-01-06 00:00:00+02"] │
    └─────────┴─────────────────────────────────────────────────────┘
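
How the state function folds over the ordered rows may be easier to see outside SQL. The following Python sketch models the aggregate with `functools.reduce`; the tuple `(lower, upper)` stands in for the `tstzrange` state and `None` stands in for the unbounded initial range. These are modelling assumptions, not PostgreSQL behavior reproduced exactly.

```python
from datetime import datetime, timedelta
from functools import reduce

def sfunc(state, ts):
    """Extend the range if ts is within one day of its upper bound, else restart at ts."""
    if state is not None and ts - state[1] <= timedelta(days=1):
        return (state[0], ts)
    return (ts, ts)

def consecutive(timestamps):
    # The aggregate's ORDER BY arrived_at is modelled by sorting before folding.
    return reduce(sfunc, sorted(timestamps), None)

stamps = [datetime(2011, 1, 3), datetime(2011, 1, 4), datetime(2011, 1, 5)]
lower, upper = consecutive(stamps)
print(lower, upper)
```

Because the comparison is made on timestamps rather than dates, two arrivals on adjacent calendar days that are more than 24 hours apart (for example 09:00 and then 10:00 the next day) start a new range in this model. Truncating to dates before aggregating would avoid that.
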

Use the aggregate as a window function:

Select *, 
     consecutive(arrived_at) 
       over (partition by user_id order by arrived_at) 
    from work; 

    ┌────┬─────────┬────────────────────────┬─────────────────────────────────────────────────────┐
    │ id │ user_id │       arrived_at       │                     consecutive                     │
    ├────┼─────────┼────────────────────────┼─────────────────────────────────────────────────────┤
    │  1 │       1 │ 2011-01-03 00:00:00+02 │ ["2011-01-03 00:00:00+02","2011-01-03 00:00:00+02"] │
    │  2 │       1 │ 2011-01-04 00:00:00+02 │ ["2011-01-03 00:00:00+02","2011-01-04 00:00:00+02"] │
    │  3 │       1 │ 2011-01-05 00:00:00+02 │ ["2011-01-03 00:00:00+02","2011-01-05 00:00:00+02"] │
    │  4 │       2 │ 2011-01-06 00:00:00+02 │ ["2011-01-06 00:00:00+02","2011-01-06 00:00:00+02"] │
    └────┴─────────┴────────────────────────┴─────────────────────────────────────────────────────┘

Query the results to find what you need:

With work_detail as (select *, 
      consecutive(arrived_at) 
        over (partition by user_id order by arrived_at) 
     from work) 
    select arrived_at, upper(consecutive) - lower(consecutive) as days 
     from work_detail 
      where user_id = 1 and upper(consecutive) != lower(consecutive) 
      order by arrived_at desc 
       limit 1; 

    ┌────────────────────────┬────────┐
    │       arrived_at       │  days  │
    ├────────────────────────┼────────┤
    │ 2011-01-05 00:00:00+02 │ 2 days │
    └────────────────────────┴────────┘
0

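You can even do this without a recursive CTE, using generate_series(), a LEFT JOIN, a window count() and a final LIMIT 1.

1 for "today" plus the number of consecutive days up until "yesterday":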

SELECT count(*) -- 1/0 for "today" 
    + COALESCE((-- + optional count of consecutive days up until "yesterday" 
     SELECT ct 
     FROM (
      SELECT d.ct, count(w.arrived_at) OVER (ORDER BY d.ct) AS day_ct 
      FROM generate_series(1, 8) AS d(ct) -- maximum = 8 
      LEFT JOIN work w ON w.arrived_at >= current_date - d.ct 
          AND w.arrived_at < current_date - (d.ct - 1) 
          AND w.user_id = 1 -- given user 
     ) sub 
     WHERE ct = day_ct 
     ORDER BY ct DESC 
     LIMIT 1 
     ), 0) AS total 
FROM work 
WHERE arrived_at >= current_date -- no future timestamps 
AND user_id = 1     -- given user 

Assuming 0 or 1 entries per day. Should be fast.
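
The core trick in that query is that the running count of attended days equals the day offset only while no day has been missed. A small Python model of the same logic, under the same assumption of at most one entry per day, might look like this; `streak_without_recursion` and `max_days` are hypothetical names, with `max_days` playing the role of the upper bound in generate_series(1, 8).

```python
from datetime import date, timedelta

def streak_without_recursion(days, today, max_days=8):
    """Today's 0/1 plus the longest unbroken run of days ending yesterday."""
    attended = set(days)
    best = 0
    running = 0  # models count(w.arrived_at) OVER (ORDER BY d.ct)
    for ct in range(1, max_days + 1):  # models generate_series(1, max_days)
        if today - timedelta(days=ct) in attended:  # LEFT JOIN found a row
            running += 1
        if running == ct:  # models WHERE ct = day_ct
            best = ct      # ORDER BY ct DESC LIMIT 1 keeps the largest match
    return (1 if today in attended else 0) + best

today = date(2014, 3, 8)  # arbitrary month, chosen only for illustration
days = [date(2014, 3, n) for n in (3, 5, 6, 7, 8)]
print(streak_without_recursion(days, today))
```

Note that the run before today is capped at `max_days`, so a longer streak would be under-reported unless the bound is raised.
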

For best performance (with this solution or the CTE solution alike), you would need a multicolumn index such as:

CREATE INDEX foo_idx ON work (user_id,arrived_at); 
+0

Would this be faster than the CTE solution? – eugene

+0

@eugene: Probably, yes. Consider the simplified update. Can you run 'EXPLAIN ANALYZE' on your data? –

+0

I don't have a large enough data set yet. And it took quite a while to adapt the answers to my actual schema. :( – eugene