2017-04-18 119 views
0

Select the record with the max date and the record dated before that max date from the same table

R_ID DATE Col_A Col_B Col_C 
158 20161008 01  01  99 
158 20161012 01  01  99 
158 20161019 01  02  10 
158 20161022 99  01  10 

I want to select from the table above so that I get the following result:

R_ID DATE Col_A Col_B Col_C 
158 20161008 01  01  99 
158 20161022 99  01  10 

The logic here is:

1. 'select max date' for record with Col_C = '10' for a particular R_ID and 
2. When Col_A or Col_B = '01' and Col_C <> '10' select the minimum Date which is < Max_date used in 1st condition 

I am using a union query like the one below:

Select * from tbl1 T 
where 
T.Col_C = '10' and 
T.DATE = (select max(T2.DATE) from tbl1 T2 
           where 
           T2.Col_C = '10' and 
           T2.R_ID = T.R_ID 
     ) 

union 

Select * from tbl1 K 
where 
(K.Col_A = '01' or K.Col_B = '01') and 
K.Col_C <> '10' and 
K.DATE = (select min(K2.DATE) from tbl1 K2 where 
         (K2.Col_A = '01' or K2.Col_B = '01') and 
         K2.Col_C <> '10' and 
         K2.R_ID = K.R_ID 
     ) and 

--K.DATE < T.DATE-- How do I use this condition within union? 

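I want to be able to apply the condition shown in the comment, but I can't figure out the correct syntax.

I have updated the conditions.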
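One possible way to express the commented-out condition is to repeat the "max date where Col_C = '10'" lookup as a correlated subquery inside the second branch, because the two branches of a union cannot see each other's aliases. The following is only a sketch under assumptions: the date column is renamed to dt (a hypothetical name, since DATE is a reserved word in Oracle), dates are stored as fixed-width 'YYYYMMDD' strings, and the sample rows are inlined so the statement runs on its own.

```sql
-- Sample rows from the question, inlined for a self-contained run.
-- dt is an assumed column name standing in for the question's DATE column.
WITH tbl1 AS (
    SELECT 158 r_id, '20161008' dt, '01' col_a, '01' col_b, '99' col_c FROM dual UNION ALL
    SELECT 158, '20161012', '01', '01', '99' FROM dual UNION ALL
    SELECT 158, '20161019', '01', '02', '10' FROM dual UNION ALL
    SELECT 158, '20161022', '99', '01', '10' FROM dual
)
-- Rule 1: latest row with Col_C = '10' per R_ID.
SELECT t.*
FROM   tbl1 t
WHERE  t.col_c = '10'
AND    t.dt = (SELECT MAX(t2.dt)
               FROM   tbl1 t2
               WHERE  t2.col_c = '10'
               AND    t2.r_id = t.r_id)
UNION ALL
-- Rule 2: earliest qualifying row that is dated before the rule 1 date.
SELECT k.*
FROM   tbl1 k
WHERE  (k.col_a = '01' OR k.col_b = '01')
AND    k.col_c <> '10'
AND    k.dt = (SELECT MIN(k2.dt)
               FROM   tbl1 k2
               WHERE  (k2.col_a = '01' OR k2.col_b = '01')
               AND    k2.col_c <> '10'
               AND    k2.r_id = k.r_id
               -- The "K.DATE < T.DATE" condition, restated as a subquery.
               AND    k2.dt < (SELECT MAX(m.dt)
                               FROM   tbl1 m
                               WHERE  m.col_c = '10'
                               AND    m.r_id = k2.r_id));
```

On the four sample rows this should return the 20161022 row and the 20161008 row. If an R_ID has no row with Col_C = '10', the innermost subquery yields NULL and the second branch returns nothing for that R_ID.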

+2

Your rules and your example results don't match.

Answers

0

Here is a brute-force approach:

WITH aset 
    AS (SELECT 158 rid, DATE '2016-10-08' d, '01' col_a, '01' col_b, '99' col_c FROM DUAL 
     UNION ALL 
     SELECT 158, DATE '2016-10-12', '01', '01', '99' FROM DUAL 
     UNION ALL 
     SELECT 158, DATE '2016-10-19', '01', '02', '10' FROM DUAL 
     UNION ALL 
     SELECT 158, DATE '2016-10-22', '99', '01', '10' FROM DUAL) 
    , bset 
    AS ( SELECT rid, MAX (d) maxd_10 
      FROM aset 
      WHERE col_c = '10' 
     GROUP BY rid) 
    , cset 
    AS ( SELECT aset.rid, MIN (d) mind 
      FROM aset INNER JOIN bset ON aset.rid = bset.rid 
      WHERE aset.col_c <> '10' 
     GROUP BY aset.rid) 
SELECT aset.* 
    FROM aset 
     INNER JOIN bset 
      ON aset.rid = bset.rid 
      AND aset.d = bset.maxd_10 
      AND col_c = '10' 
UNION ALL 
SELECT aset.* 
    FROM aset 
     INNER JOIN cset 
      ON aset.rid = cset.rid 
      AND aset.d = cset.mind 
      AND aset.col_c <> '10'; 

Result:

RID D   COL_A COL_B COL_C 
158 10/22/2016 99 01 10  
158 10/08/2016 01 01 99  
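
Note that cset in the query above joins to bset but filters only on col_c <> '10'; it neither compares against maxd_10 nor checks Col_A/Col_B. The sample data happens to give the same result either way. A variant that applies both parts of the question's second rule might look like the sketch below; it reuses the answer's own names (aset, bset, cset, maxd_10, mind) and changes only the cset filter and the final join.

```sql
WITH aset AS (
    SELECT 158 rid, DATE '2016-10-08' d, '01' col_a, '01' col_b, '99' col_c FROM dual UNION ALL
    SELECT 158, DATE '2016-10-12', '01', '01', '99' FROM dual UNION ALL
    SELECT 158, DATE '2016-10-19', '01', '02', '10' FROM dual UNION ALL
    SELECT 158, DATE '2016-10-22', '99', '01', '10' FROM dual
),
-- Latest date with col_c = '10' per rid (rule 1).
bset AS (
    SELECT rid, MAX(d) maxd_10
    FROM   aset
    WHERE  col_c = '10'
    GROUP  BY rid
),
-- Earliest qualifying date strictly before maxd_10 (rule 2).
cset AS (
    SELECT aset.rid, MIN(aset.d) mind
    FROM   aset
           INNER JOIN bset ON aset.rid = bset.rid
    WHERE  aset.col_c <> '10'
    AND    '01' IN (aset.col_a, aset.col_b)
    AND    aset.d < bset.maxd_10
    GROUP  BY aset.rid
)
SELECT aset.*
FROM   aset
       INNER JOIN bset
          ON  aset.rid = bset.rid
          AND aset.d = bset.maxd_10
          AND aset.col_c = '10'
UNION ALL
SELECT aset.*
FROM   aset
       INNER JOIN cset
          ON  aset.rid = cset.rid
          AND aset.d = cset.mind
          AND aset.col_c <> '10'
          AND '01' IN (aset.col_a, aset.col_b);
```

Against a real table, the aset block would be replaced by the table itself; the inline rows are there only to keep the example runnable.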
+0

This won't help with a larger data set. I am dealing with more than a million records. – Hemansh

+0

What error do you get when you try it?

0

Analytic functions can help with this.

with dataset as (select 158 r_id, date '2016-10-08' value_date, '01' col_a, '01' col_b, '99' col_c from dual 
       union all 
       select 158, date '2016-10-12', '01', '01', '99' from dual 
       union all 
       select 158, date '2016-10-19', '01', '02', '10' from dual 
       union all 
       select 158, date '2016-10-22', '99', '01', '10' from dual) 

select d3.* 
from (select d2.*, 
       min(case when d2.col_c <> '10' and 
          '01' in (d2.col_a, d2.col_b) and 
          d2.value_date < d2.ceiling_date then d2.value_date 
         else null 
        end) over (partition by d2.r_id 
           order by d2.value_date 
           rows between unbounded preceding and unbounded following) smallest_date 
     from (select d.*, 
         max(case when d.col_c = '10' then d.value_date else null end) 
         over (partition by d.r_id 
          order by d.value_date 
          rows between unbounded preceding and unbounded following) ceiling_date 
       from dataset d) d2) d3 
where d3.ceiling_date is not null  -- ceiling_date contains null if we were unable to find row with '10' in COL_C for specified r_id - remove such rows from result set 
    and (d3.value_date = d3.ceiling_date -- here we presume that combination of r_id + value_date gives us unique row 
    or d3.value_date = d3.smallest_date); -- otherwise we should add some column(s) that help us decide what rows do we need (row_number() with custom order by clause may be helpful) 
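
The closing comment mentions row_number() with a custom order by clause for the case where r_id + value_date is not unique. A minimal sketch of that idea follows, assuming the same dataset columns as above; the names flagged, categorized, ranked, category and rn are hypothetical, and how to break ties is left as an assumption for the reader to fill in.

```sql
WITH dataset AS (
    SELECT 158 r_id, DATE '2016-10-08' value_date, '01' col_a, '01' col_b, '99' col_c FROM dual UNION ALL
    SELECT 158, DATE '2016-10-12', '01', '01', '99' FROM dual UNION ALL
    SELECT 158, DATE '2016-10-19', '01', '02', '10' FROM dual UNION ALL
    SELECT 158, DATE '2016-10-22', '99', '01', '10' FROM dual
),
-- Attach the latest col_c = '10' date to every row of the same r_id.
flagged AS (
    SELECT d.*,
           MAX(CASE WHEN d.col_c = '10' THEN d.value_date END)
               OVER (PARTITION BY d.r_id) AS ceiling_date
    FROM   dataset d
),
-- Label each row by the rule it can satisfy; other rows get NULL.
categorized AS (
    SELECT f.*,
           CASE
               WHEN f.col_c = '10' THEN 'CEILING'
               WHEN f.col_c <> '10'
                    AND '01' IN (f.col_a, f.col_b)
                    AND f.value_date < f.ceiling_date THEN 'FLOOR'
           END AS category
    FROM   flagged f
),
-- Rank inside each (r_id, category): newest first for CEILING,
-- oldest first for FLOOR. Append more ORDER BY columns to break ties.
ranked AS (
    SELECT c.*,
           ROW_NUMBER() OVER (
               PARTITION BY c.r_id, c.category
               ORDER BY CASE c.category WHEN 'CEILING' THEN c.value_date END DESC,
                        c.value_date ASC
           ) AS rn
    FROM   categorized c
    WHERE  c.category IS NOT NULL
)
SELECT r_id, value_date, col_a, col_b, col_c
FROM   ranked
WHERE  rn = 1;
```

Because ROW_NUMBER assigns exactly one row the value 1 per partition, this returns at most one row per rule and r_id even when dates repeat; which duplicate wins is arbitrary unless extra ordering columns are added.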