2017-03-02
-1

SQL/SAS: create table where not exists?

I have a table like this:

ID GROUP VALUE 
201540 1 1000 
201540 2 1111 
201540 5 2000 
201550 1 200 
201550 8 400 
201610 4 990 
201610 5 400 
201610 6 777 
201610 7 222 
201610 8 6666 

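What I need to do is expand the table so that every ID has groups 1 through 8. I want to create a table containing the missing groups for each ID, like this: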

ID GROUP VALUE 
201540 3 -1 
201540 4 -1 
201540 6 -1 
201540 7 -1 
201540 8 -1 
201550 2 -1 
201550 3 -1 
201550 4 -1 
201550 5 -1 
201550 6 -1 
201550 7 -1 
201610 1 -1 
201610 2 -1 
201610 3 -1 

I tried using

CREATE TABLE TMP AS 
SELECT ID, GROUP, -1 from table where not exists 
(SELECT * FROM table where ....) 

but I don't know how to write the where-clause ...

Any hints? Thank you, dbdb

+0

Which DBMS are you using? Postgres? Oracle?

Answers

0

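This is what I tried in Oracle. The logic is:

  1. Generate the range 1-8. (seq)
  2. Get the distinct IDs in a derived table. (dist)
  3. Cross join the two outputs above to get every possible combination of group and ID.
  4. Join these combinations to the existing table and keep only the records that are missing from it. You can do this with a left join.

Here is sample code in Oracle. You can use a similar approach in SAS.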

with tbl(ID,GRP,VAL) as (
    select 201540,1,1000 from dual union all 
    select 201540,2,1111 from dual union all 
    select 201540,5,2000 from dual union all 
    select 201550,1,200 from dual union all 
    select 201550,8,400 from dual union all 
    select 201610,4,990 from dual union all 
    select 201610,5,400 from dual union all 
    select 201610,6,777 from dual union all 
    select 201610,7,222 from dual union all 
    select 201610,8,6666 from dual) 
    ,seq (seqno) as (
    select 1 from dual union all 
    select 2 from dual union all 
    select 3 from dual union all 
    select 4 from dual union all 
    select 5 from dual union all 
    select 6 from dual union all 
    select 7 from dual union all 
    select 8 from dual) 
    ,all_seq as 
    (select * from seq cross join (select distinct ID from tbl)) 
    select s.id,s.seqno as grp,-1 as val from all_seq s 
    left join tbl t 
    on s.id=t.id and s.seqno=t.grp 
    where t.id is null 

Output

ID  GRP VAL 
201540 4 -1 
201540 7 -1 
201540 6 -1 
201540 3 -1 
201540 8 -1 
201550 7 -1 
201550 4 -1 
201550 3 -1 
201550 5 -1 
201550 6 -1 
201550 2 -1 
201610 1 -1 
201610 3 -1 
201610 2 -1 
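
Since the question asked about a where-clause with NOT EXISTS, the same result could also be obtained without the left join. This is only a sketch under assumptions: it expects a real table named tbl with columns ID, GRP and VAL (the names used in the sample CTE above, not necessarily yours), and it swaps the eight UNION ALL rows for Oracle's CONNECT BY LEVEL row generator.

```sql
-- Assumption: tbl(ID, GRP, VAL) is an existing table; rename to match your schema.
with seq (seqno) as (
    -- row generator: one row for each group number 1..8
    select level from dual connect by level <= 8
)
select d.id, s.seqno as grp, -1 as val
from (select distinct id from tbl) d
cross join seq s
-- keep only the ID/group pairs that have no matching row in tbl
where not exists (
    select 1
    from tbl t
    where t.id = d.id
      and t.grp = s.seqno
)
order by d.id, s.seqno
```

The correlated subquery plays the role of the "where t.id is null" filter in the left join version, and the order by only makes the result easier to compare with the expected table.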
0

If this is SAS, it is probably easiest to break the problem down into a few steps:

  1. Get a list of all the unique IDs (skip this step if you already have such a dataset).

Proc sort data = original out = ID_list nodupkey;
by ID;
run;

  2. Expand the ID list dataset so that it contains every group for each ID.

    data expanded;
    set ID_list;
    do i = 1 to 8;
    group = i;
    value = - 1;
    output;
    end;
    run;

  3. Find the ID*group combinations that do not exist in the original.

    proc sql;
    create table final as
    select L.*
    from expanded as L left join original as R
    on L.ID=R.ID and L.group = R.group
    where not(L.ID = R.ID and L.group = R.group);
    quit;

This should give you all the ID*group combinations that do not exist in the original dataset, with a value of -1.
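
As the question was about the where-clause for NOT EXISTS, the last step could also be written with a correlated subquery instead of the left join. A minimal sketch, assuming the dataset names expanded and original from the steps above:

```sas
proc sql;
/* keep each expanded ID*group row only if original has no matching row */
create table final as
select L.*
from expanded as L
where not exists
    (select 1
     from original as R
     where R.ID = L.ID and R.group = L.group);
quit;
```

Both forms are meant to select the same rows; which one reads better is largely a matter of taste.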

+0

Thank you very much. Yes, it is in SAS. Your solution works for me. – derbestederbesten