2017-01-12 167 views

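Merging SAS datasets on multiple columns with different column names

I have two datasets that I want to merge by territory number. The first dataset holds territory information, including the territory number. The second dataset also has territory numbers, but they are spread across 4 different columns: drug_terr1, drug_terr2, drug_terr3 and drug_Terr4. I need to merge on all 4 columns, because each one holds a different territory number, and I want all of those numbers included when I merge with the territory-information dataset. I tried renaming, but that did not work because it only changed the first column. Is there a way to combine all of these columns and rename them to the territory number so that I can do the merge?

Ultimately I want it to look like this, but I need the 4 columns from 'terrfile' to become 1 column named territory_nbr so that I can merge.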

%let output = E:\Horizon\Adhoc\AH\; 
%let terrs =\\uslsasas1\E$\Horizon\IMS Processing\Weekly Data\20161230\Demo\; 
libname terrs "&terrs."; 
%let curr_process_wk = '12-30-2016'; 
%let curr_quarter =_q1; 
**0 Grab pskw; 
data pskw_data; 
set PSKW.PSKWMaster ; 
where week in ('12-16-2016','12-23-2016','12-30-2016','01-06-2017') and CopayType ="FBD" and FNRX=1 and pme_id in (46,42,55,38) and product in ('DUEXIS','VIMOVO','PENNSAID') 
and 
(COBPrimaryRejectCode1 in ('75','76') or COBPrimaryRejectCode2 in ('75', '76') or COBPrimaryRejectCode3 in ('75' , '76')); 
run; 
proc sort data=pskw_data; 
by imsid; 
run; 

** 01 Grab tbl HCP; 
proc sort data=ims.tblhcp (where = (week = &curr_process_wk.) keep = week imsid first_name last_name address1 address2 city state zip spec) 
      out = IMS_demo (drop = week); 
     by IMSID; 
run; 

** 02 Grab tbl terrs_by_imsid; 
data terrfile; 
set terrs.wd2_terrs_by_imsid&curr_quarter.; 
run; 

proc sort data = terrfile; 
by imsid; 
run; 
** 03 Grab tbl roster; 
data roster (keep = territorycode repname territoryname teamname); 
set ims.tblRoster; 
    repname = trim(left(FirstName))||" "||trim(left(LastName)); 
run; 
**04 link ; 
data combine_dbs; 
merge pskw_data (in=in1) 
ims.tblhcp (in=in2); 
by imsid; 
if in1; 
run; 
data territories; /* cannot merge: territory code is not in terrfile, just the 4 columns mentioned above */
merge terrfile (in=in1) 
roster (in=in2); 
by territorycode; 
if in2; 
run; 
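Can you show what your data looks like atm? –

Which fields do you want to pull from the territory master file (the file with one record per territory)? Since you want to combine it with your fact table (which has up to four territory codes) four times, you will need four names for each field to store up to four different values. – Tom

I want to pull something called IMS_ID from terrfiles so that I can eventually add it to my roster dataset. – SQUISH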

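Answer

You need to join the fact table to the lookup table four times. Suppose the territory identifier in your lookup table is called ID and the field you want to pull from it is IMS_ID. Let's also assume that the four fields in your fact table are named ID1-ID4.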

proc sql ; 
    create table want as 
    select a.* 
      , b.ims_id as ims_id1 
      , c.ims_id as ims_id2 
      , d.ims_id as ims_id3 
      , e.ims_id as ims_id4 
    from FACT a 
    left join LU b on a.id1=b.id 
    left join LU c on a.id2=c.id 
    left join LU d on a.id3=d.id 
    left join LU e on a.id4=e.id 
    ; 
quit; 

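In your example, it looks like ROSTER is your FACT table and TERRFILES is your LU table. Your ID variable appears to be named TERRITORYCODE, at least in your lookup file. It is hard to tell what the four variables are named in ROSTER.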
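
If the goal is literally the single territory_nbr column described in the question, an alternative to joining four times is to reshape first and merge afterwards. The following is only a sketch under assumptions taken from the question's wording, not a tested solution: it assumes terrfile contains imsid plus drug_terr1-drug_terr4, that those four columns share one type (and, if character, one length) with territorycode in roster, and that roster has one row per territorycode. The names terr_long and roster_sorted are made up for illustration.

```sas
/* Sketch only: stack drug_terr1-drug_terr4 into one column, territory_nbr. */
/* Assumes all four columns have the same type and (if character) length.   */
data terr_long (keep = imsid territory_nbr);
    set terrfile;
    array terr{4} drug_terr1 - drug_terr4;
    do i = 1 to 4;
        /* Write one output row per non-missing territory number. */
        if not missing(terr{i}) then do;
            territory_nbr = terr{i};
            output;
        end;
    end;
run;

proc sort data = terr_long;
    by territory_nbr;
run;

/* Rename the roster key on input so both datasets share the BY variable. */
proc sort data = roster (rename = (territorycode = territory_nbr))
          out = roster_sorted;
    by territory_nbr;
run;

/* Same merge as step 04 in the question, now with a common key.          */
/* Assumes roster_sorted has at most one row per territory_nbr; with      */
/* duplicates on both sides a DATA step merge does not pair rows the way  */
/* a SQL join would.                                                      */
data territories;
    merge terr_long (in = in1)
          roster_sorted (in = in2);
    by territory_nbr;
    if in2;
run;
```

The long layout produces one row per imsid and territory, so there is no need for four suffixed copies of every lookup field. The PROC SQL approach above keeps one row per fact record instead. Which shape is preferable depends on how the result will be used downstream.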