2017-07-07

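Break multiple LEFT JOINs into multiple steps in PROC SQL

I have a large piece of code that LEFT JOINs many tables. When I run it, it takes more than an hour and finally fails with a "sort execution failure" error. So I would like to break the left joins into multiple steps, but I don't know how to do that and need your help.

The code is: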

Proc sql; 
create table newlib.Final_test as 
SELECT 
POpener.Name as Client, 
Popener.PartyId as Account_Number, 
Case 
    When BalLoc.ConvertedRefNo NE '' then BalLoc.ConvertedRefNo 
else BalLoc.Ourreferencenum 
End as LC_Number, 
BalLoc.OurReferenceNum , 
BalLoc.CnvLiabilityCode as Liability_Code, 
POfficer.PartyID as Officer_Num, 
POfficer.Name as Officer_Name, 
POpener.ExpenseCode, 
BalLoc.IssueDate as Issue_Date format=mmddyy10., 
BalLoc.ExpirationDate AS Expiry format=mmddyy10., 
BalLoc.LiabilityAmountBase as Total_LC_Balance, 
Case 
    When BalLoc.Syndicated = 0 Then BalLoc.LiabilityAmountBase 
    else 0 
End as SunTrust_Non_Syndicated_Exposure, 
Case 
    When BalLoc.Syndicated = 1 and BalLoc.PartOutGroupPkey NE 0 
        Then BalLoc.LiabilityAmountBase 
    else 0 
End as SunTrust_Syndicated_Exposure, 
Case 
    When BalLoc.Syndicated = 1 and BalLoc.PartOutGroupPkey NE 0 
        Then (BalLoc.LiabilityAmountBase - (BalLoc.LiabilityAmountBase * 
             (PParty.ParticipationPercent/100))) 
    Else BalLoc.LiabilityAmountBase 
End as SunTrust_Exposure, 
Case 
    When BalLoc.Syndicated = 1 and BalLoc.PartOutGroupPkey <> 0 
        Then (BalLoc.LiabilityAmountBase * PParty.ParticipationPercent/100) 
    Else 0 
End as Exposure_Held_By_Other_Banks, 
PBene.Name as Beneficiary_Trustee, 
cat(put(input(POpener.ObligorNumber,best10.),z10.),
    put(input(BalLoc.CommitmentNumber,best10.),z10.)) as Key, 
case 
    when BalLoc.BeneCusip2 NE ' ' then catx('|',BalLoc.BeneCusip,BalLoc.BeneCusip2) 
    else BalLoc.BeneCusip 
End as Cusip, 
Case 
    when BalLoc.OKtoExpire = 1 then '0' 
    when BalLoc.OKtoExpire = 0 and BalLoc.AutoExtTermDays NE 0 then put(BalLoc.AutoExtTermDays,z3.) 
    when BalLoc.OKtoExpire = 0 and BalLoc.AutoExtTermsMonth NE 0 then put(BalLoc.AutoExtTermsMonth,z3.) 
    else '000' 
End as Evergreen, 
Case 
    when blf.AnnualRate NE 0 then put(blf.AnnualRate,z7.) 
    when blf.Amount NE 0 then cats('F',put(blf.amount,z7.)) 
    else 'WAIVE' 
End as Pricing 

FROM BalLocPrimary BalLoc 
Left JOIN Party POpener on POpener.Pkey = BalLoc.OpenerPkey 
Left join PartGroup PGroup on BallOC.PartOutGroupPkey = PGroup.pKey 
Left join PartParties PParty ON PGroup.pKey = PParty.PartGroupPkey and 
PParty.ParticipationPercent > 0 and 
PParty.combined in 
(select PPartParties.All_combined 
from PPartParties /*group by PartGroupPkey, PartyPkey*/) 

Left Join MemExpenseCodes ExpCodes on POpener.ExpenseCode = ExpCodes.Code 
Left JOIN Party PBene on PBene.Pkey = BalLoc.BenePkey 
Left join Party POfficer on POfficer.Pkey = BalLoc.AccountOfficerPkey 
left join maxfee on maxfee.LocPrimaryPkey = BalLoc.LocPrimaryPkey 
left join BalLocFee BLF on BLF.Pkey = maxfee.pkey 
Where BalLoc.LetterType not in ('STBA','EXPA', 'FEE',' ') and 
BalLoc.LiabilityAmountBase > 0 and BalLoc.irdb = 1 
; 
quit; 

Thanks,

Shankar

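It is hard to say how to improve this without some statistics. How big are these tables? Are they indexed on the join keys? Where is the SELECT part of the query? – Joe

@Joe: I have just added the entire code, including the SELECT statement. The tables have between 75,000 and 650,000 rows and between 10 and 40 columns. – shankar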

Answer

There are a few things I would suggest:

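1. For each dataset you reference, keep only the variables you need for the join or use in the SELECT statement. For example, the PBene join to your Party dataset only needs the Pkey and Name fields. So when you join to that dataset, you should use: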

Left JOIN Party(keep=Pkey Name) PBene on PBene.Pkey = BalLoc.BenePkey 

2. Move your WHERE clause into the FROM clause as a dataset option, like this:

FROM BalLocPrimary(where=(LetterType not in ('STBA','EXPA', 'FEE',' ') and 
LiabilityAmountBase > 0 and irdb = 1)) BalLoc 

and pay attention to the order of the conditions (setting aside any indexes that may exist on those 3 fields).

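3. You are driving off the BalLocPrimary dataset and left-joining everything else to it. Is that really what you want? Is it acceptable for the result set to return rows without a Client or an Account_Number? Left joins can be computationally expensive, and the more of them you can eliminate, the better.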
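
One way to answer that question is to count how many driving rows actually have no matching opener. The query below is only a sketch: it reuses the table and column names from the question and assumes Pkey is never missing in Party, so a missing POpener.Pkey means "no match".

```sas
/* Sketch: how many filtered BalLocPrimary rows have no matching opener? */
/* Assumes Party.Pkey is never missing, so a missing value here means    */
/* the left join found no row.                                           */
proc sql;
    select count(*) as rows_without_opener
    from BalLocPrimary BalLoc
    left join Party POpener
        on POpener.Pkey = BalLoc.OpenerPkey
    where BalLoc.LetterType not in ('STBA','EXPA','FEE',' ')
      and BalLoc.LiabilityAmountBase > 0
      and BalLoc.irdb = 1
      and POpener.Pkey is missing;
quit;
```

If the count is zero and is expected to stay zero, that join could be written as an inner join without changing the result.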

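4. Joe asked about indexes on the join fields. You should probably have some. I find myself referring to this SUGI paper often enough that I have it bookmarked. Likewise, you could look at the EXPLAIN PLAN for your query to see where the bottleneck might be. Another SUGI paper would be a good place to start.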
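
As a rough illustration of both ideas in SAS terms: PROC SQL can create simple indexes, and its `_METHOD` option writes the chosen join methods to the log, which is the closest PROC SQL counterpart to an explain plan. The sketch below assumes Party and BalLocFee are ordinary, writable SAS datasets that do not already have these indexes; `work.fee_check` is a hypothetical output name.

```sas
/* Ask SAS to write notes about index usage to the log. */
options msglevel=i;

/* A simple index in PROC SQL must have the same name as its column. */
proc sql;
    create index Pkey on Party(Pkey);
    create index Pkey on BalLocFee(Pkey);
quit;

/* _METHOD prints the execution tree to the log, e.g. sqxjm (merge join), */
/* sqxjhsh (hash join), sqxjndx (index join), sqxsort (sort).             */
proc sql _method;
    create table work.fee_check as
    select BalLoc.LocPrimaryPkey, blf.AnnualRate
    from BalLocPrimary BalLoc
    left join maxfee
        on maxfee.LocPrimaryPkey = BalLoc.LocPrimaryPkey
    left join BalLocFee blf
        on blf.Pkey = maxfee.pkey;
quit;
```

Running a few joins at a time this way makes it easier to see which ones trigger the large sorts behind an error like the one in the question.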

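5. You are right that this can (should?) be broken into multiple steps. That is a good instinct. However, the best places to break it will depend on the underlying data, indexes, and join paths, so it is hard to prescribe from the other side of the screen. I think the second paper I linked may give you some good tips for optimizing your specific case.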
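
Purely as a sketch of what such a breakdown could look like, and not a tuned solution: filter the driving table once, then attach the lookups a few at a time. The intermediate table names (`work.base`, `work.with_parties`, `work.with_fees`) are hypothetical, and the column list is trimmed for brevity; add the remaining BalLocPrimary columns your final SELECT needs.

```sas
/* Step 1: filter the driving table once and keep only the needed columns. */
proc sql;
    create table work.base as
    select LocPrimaryPkey, OpenerPkey, BenePkey, AccountOfficerPkey,
           PartOutGroupPkey, OurReferenceNum, ConvertedRefNo,
           LiabilityAmountBase, Syndicated
    from BalLocPrimary
    where LetterType not in ('STBA','EXPA','FEE',' ')
      and LiabilityAmountBase > 0
      and irdb = 1;
quit;

/* Step 2: attach the three Party lookups, each with only the columns used. */
proc sql;
    create table work.with_parties as
    select b.*,
           opener.Name     as Client,
           opener.PartyId  as Account_Number,
           opener.ExpenseCode,
           officer.PartyID as Officer_Num,
           officer.Name    as Officer_Name,
           bene.Name       as Beneficiary_Trustee
    from work.base b
    left join Party(keep=Pkey Name PartyId ExpenseCode) opener
        on opener.Pkey = b.OpenerPkey
    left join Party(keep=Pkey Name PartyId) officer
        on officer.Pkey = b.AccountOfficerPkey
    left join Party(keep=Pkey Name) bene
        on bene.Pkey = b.BenePkey;
quit;

/* Step 3: attach the fee row selected through maxfee. */
proc sql;
    create table work.with_fees as
    select p.*, blf.AnnualRate, blf.Amount
    from work.with_parties p
    left join maxfee m
        on m.LocPrimaryPkey = p.LocPrimaryPkey
    left join BalLocFee(keep=Pkey AnnualRate Amount) blf
        on blf.Pkey = m.pkey;
quit;
```

The PartGroup and PartParties joins would follow the same pattern in a further step, with the CASE expressions computed last. The MemExpenseCodes join is left out here only because no ExpCodes column appears in the SELECT list shown; keep it if it is meant to affect the rows returned. Checking the row count after each step also shows quickly whether any join multiplies rows unexpectedly.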