2014-06-10 64 views
2

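Eliminating duplicate rows by precedence

I am working on a stored procedure that combines historical records with projections. I have a column (PHP) that specifies whether the projection takes precedence over history. I also have a column that specifies whether the data comes from the history table or the projection table. The output of my stored procedure looks like this: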

CaseId  Year  Projection  PHP  Gas  Oil
     1  2004           0    1
     1  2005           0    1
     1  2005           1    1
     1  2006           1    1
     1  2007           1    1
     1  2008           1    1
     1  2009           1    1
     2  2003           0    0
     2  2004           0    0
     2  2005           0    0
     2  2005           1    0
     2  2006           1    0
     2  2007           1    0
     2  2008           1    0
     2  2006           1    0

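In this example I need to eliminate the second row: for CaseId 1 the projection has precedence, so the overlapping historical year should be removed. Likewise, the fourth row for CaseId 2 should be removed, because there the history has precedence. The result should be: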

CaseId  Year  Projection  PHP  Gas  Oil
     1  2004           0    1
     1  2005           1    1
     1  2006           1    1
     1  2007           1    1
     1  2008           1    1
     1  2009           1    1
     2  2003           0    0
     2  2004           0    0
     2  2005           0    0
     2  2006           1    0
     2  2007           1    0
     2  2008           1    0
     2  2006           1    0

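I need to flag the duplicate years within a CaseId, then compare the Projection and PHP columns and remove the rows where they do not match.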

SELECT  rcl.ReportRunCaseId AS CaseId, 
      year(rce.EcoDate) as Year, 
      1 as Projection, 
      cpq.ProjectionHasPrecedence as PHP, 
      rce.GrossOil as Oil,     
      rce.GrossGas as Gas  
    from phdreports.PhdRpt.ReportCaseList_28 rcl 
     inner join phdreports.PhdRpt.RptCaseEco_28 rce on 
      rce.ReportRunCaseId = rcl.ReportRunCaseId 
     inner join dbo.caseQualifier cq on 
      cq.CorpScenarioId = 1 and 
      cq.CaseCaseId = rcl.ReportRunCaseId and 
      cq.CorpQualifierTypeId = 1 
     inner join dbo.caseProjectionQualifier cpq on 
      cpq.CaseCaseId = rcl.ReportRunCaseId and 
      cpq.CorpQualifierId = cq.QualifierHasData 
where rcl.ReportRunCaseId <=2 
group by year(rce.EcoDate), rcl.ReportRunCaseId, cpq.ProjectionHasPrecedence, rce.GrossGas, rce.GrossOil 

union all 

select  rmp.ReportRunCaseId AS CaseId, 
      year(rmp.EcoDate) as Year, 
      0 as Projection, 
      cpq.ProjectionHasPrecedence as PHP, 
      rmp.GrossOil as Oil, 
      rmp.GrossGas as Gas    
from PhdReports.PhdRpt.RptMonthlyProduction_50 rmp 
     inner join dbo.caseQualifier cq on 
      cq.CorpScenarioId = 1 and 
      cq.CaseCaseId = rmp.ReportRunCaseId and 
      cq.CorpQualifierTypeId = 1 
     inner join dbo.caseProjectionQualifier cpq on 
      cpq.CaseCaseId = rmp.ReportRunCaseId and 
      cpq.CorpQualifierId = cq.QualifierHasData 
where rmp.ReportRunCaseId <= 2 
group by year(rmp.EcoDate), rmp.ReportRunCaseId, cpq.ProjectionHasPrecedence, rmp.GrossGas, rmp.GrossOil 

The query above is the one I am working with. How can I eliminate the duplicate years where Projection and PHP do not match?

Answers

2

The ROW_NUMBER() function should help you here:

WITH Data AS 
( SELECT  rcl.ReportRunCaseId AS CaseId, 
       year(rce.EcoDate) as Year, 
       1 as Projection, 
       cpq.ProjectionHasPrecedence as PHP, 
       rce.GrossOil as Oil,     
       rce.GrossGas as Gas  
     from phdreports.PhdRpt.ReportCaseList_28 rcl 
      inner join phdreports.PhdRpt.RptCaseEco_28 rce on 
       rce.ReportRunCaseId = rcl.ReportRunCaseId 
      inner join dbo.caseQualifier cq on 
       cq.CorpScenarioId = 1 and 
       cq.CaseCaseId = rcl.ReportRunCaseId and 
       cq.CorpQualifierTypeId = 1 
      inner join dbo.caseProjectionQualifier cpq on 
       cpq.CaseCaseId = rcl.ReportRunCaseId and 
       cpq.CorpQualifierId = cq.QualifierHasData 
    where rcl.ReportRunCaseId <=2 
    group by year(rce.EcoDate), rcl.ReportRunCaseId, cpq.ProjectionHasPrecedence, rce.GrossGas, rce.GrossOil 

    union all 

    select  rmp.ReportRunCaseId AS CaseId, 
       year(rmp.EcoDate) as Year, 
       0 as Projection, 
       cpq.ProjectionHasPrecedence as PHP, 
       rmp.GrossOil as Oil, 
       rmp.GrossGas as Gas    
    from PhdReports.PhdRpt.RptMonthlyProduction_50 rmp 
      inner join dbo.caseQualifier cq on 
       cq.CorpScenarioId = 1 and 
       cq.CaseCaseId = rmp.ReportRunCaseId and 
       cq.CorpQualifierTypeId = 1 
      inner join dbo.caseProjectionQualifier cpq on 
       cpq.CaseCaseId = rmp.ReportRunCaseId and 
       cpq.CorpQualifierId = cq.QualifierHasData 
    where rmp.ReportRunCaseId <= 2 
    group by year(rmp.EcoDate), rmp.ReportRunCaseId, cpq.ProjectionHasPrecedence, rmp.GrossGas, rmp.GrossOil 
), Data2 AS 
( SELECT *, 
      RowNum = ROW_NUMBER() OVER(PARTITION BY CaseId, Year 
             ORDER BY CASE WHEN PHP = Projection THEN 0 ELSE 1 END, PHP DESC, Projection DESC) 
    FROM Data 
) 
SELECT CaseId, Year, Projection, PHP, Oil, Gas 
FROM Data2 
WHERE RowNum = 1; 

Only the last part needs attention, since the first part is just your query wrapped in a common table expression:

RowNum = ROW_NUMBER() OVER(PARTITION BY CaseId, Year 
          ORDER BY CASE WHEN PHP = Projection THEN 0 ELSE 1 END, PHP DESC, Projection DESC) 

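Here we number the rows within each (CaseId, Year) tuple, ordered by whether PHP equals Projection. The final part then limits the result to the first row of each tuple: if a row exists where the two are equal, that row is used; if no such row exists, the row where they are not equal is used.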
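To make the ranking concrete, here is a minimal self-contained sketch that runs the same idea over a few of the overlapping rows from the question. The inline VALUES list stands in for the real tables and is an assumption for illustration only. Note that the CASE expression has to sort ascending (0 first) for the matching row to receive RowNum 1; sorting it DESC would rank the mismatched row first.

```sql
-- Sample rows copied from the question's output (Gas/Oil omitted).
WITH Data (CaseId, Year, Projection, PHP) AS
(   SELECT v.CaseId, v.Year, v.Projection, v.PHP
    FROM (VALUES
            (1, 2004, 0, 1),   -- history only: kept although flags differ
            (1, 2005, 0, 1),   -- history, loses to the projection row
            (1, 2005, 1, 1),   -- projection, PHP = 1: kept
            (2, 2005, 0, 0),   -- history, PHP = 0: kept
            (2, 2005, 1, 0),   -- projection, loses to the history row
            (2, 2007, 1, 0)    -- projection only: kept although flags differ
         ) AS v (CaseId, Year, Projection, PHP)
), Data2 AS
(   SELECT *,
        RowNum = ROW_NUMBER() OVER (PARTITION BY CaseId, Year
                     -- 0 when the row's source matches the precedence flag
                     ORDER BY CASE WHEN PHP = Projection THEN 0 ELSE 1 END)
    FROM Data
)
SELECT CaseId, Year, Projection, PHP
FROM Data2
WHERE RowNum = 1
ORDER BY CaseId, Year;
-- Result:
--   1  2004  0  1
--   1  2005  1  1
--   2  2005  0  0
--   2  2007  1  0
```

Years that appear only once are never dropped, because a single row in a partition always gets RowNum 1 regardless of the sort key.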

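You may need to add more criteria to the ORDER BY to make the result deterministic. That is, if you have two rows within the same CaseId/Year where PHP and Projection are both 1, make sure the same row is picked every time.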
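As a sketch of what such extra criteria might look like, the example below adds Oil and Gas as tie-breakers. Which columns make a sensible tie-breaker depends on your data; using Oil and Gas here is purely an assumption, and the numeric values are made up for illustration.

```sql
-- Two projection rows tie on the CASE key; Oil DESC, Gas DESC breaks the tie.
WITH Data (CaseId, Year, Projection, PHP, Oil, Gas) AS
(   SELECT v.CaseId, v.Year, v.Projection, v.PHP, v.Oil, v.Gas
    FROM (VALUES
            (1, 2005, 1, 1, 120.0, 300.0),   -- hypothetical values
            (1, 2005, 1, 1,  95.0, 310.0),
            (1, 2005, 0, 1, 110.0, 305.0)
         ) AS v (CaseId, Year, Projection, PHP, Oil, Gas)
)
SELECT CaseId, Year, Projection, PHP, Oil, Gas
FROM (  SELECT *,
            ROW_NUMBER() OVER (PARTITION BY CaseId, Year
                ORDER BY CASE WHEN PHP = Projection THEN 0 ELSE 1 END,
                         Oil DESC, Gas DESC) AS RowNum
        FROM Data
     ) d
WHERE RowNum = 1;
-- Result: 1  2005  1  1  120.0  300.0
```

Without the extra sort columns, either of the two projection rows could be returned, and the choice may differ between executions.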

+0

I tried this approach, and it returns the projection row when PHP is set to 0.

+0

Thanks. I never knew you could use a CASE expression in an ORDER BY.

1

I don't see how your query relates to the question. So let me assume you have a query that does:

select CaseId, Year, Projection, PHP, Gas, Oil 
from t 

With that, you can do what you want using row_number():

select CaseId, Year, Projection, PHP, Gas, Oil 
from (select CaseId, Year, Projection, PHP, Gas, Oil, 
      row_number() over (partition by CaseId, Year 
           order by Projection + PHP desc 
           ) as seqnum 
     from t 
    ) t 
where seqnum = 1; 

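This prioritizes rows based on the number of flags that are set. In your CaseId = 2 example, two rows contain the same values, so this returns one of them. If you want to choose between them, you need another column that specifies the priority.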
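To check how this sort key behaves on the overlapping 2005 rows from the question, the sketch below lists the key and the resulting sequence number for every row without filtering. The inline VALUES list is an assumed stand-in for the table t, and FlagSum is a name introduced here for illustration.

```sql
-- Show the sort key (Projection + PHP) and the rank it produces per row.
SELECT CaseId, Year, Projection, PHP,
       Projection + PHP AS FlagSum,
       row_number() over (partition by CaseId, Year
                          order by Projection + PHP desc) as seqnum
FROM (VALUES
        (1, 2005, 0, 1),
        (1, 2005, 1, 1),
        (2, 2005, 0, 0),
        (2, 2005, 1, 0)
     ) AS t (CaseId, Year, Projection, PHP)
ORDER BY CaseId, Year, seqnum;
-- Result (CaseId, Year, Projection, PHP, FlagSum, seqnum):
--   1  2005  1  1  2  1
--   1  2005  0  1  1  2
--   2  2005  1  0  1  1
--   2  2005  0  0  0  2
```

With these rows the projection row ranks first for both cases, so it is worth confirming against your own data that this matches the precedence you expect when PHP is 0.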

+0

PHP is the precedence flag I should use. It can be 1 or 0. If it is 1, I should use the projection row; otherwise I should use the history row.

+0

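@RolandP . . . From what I can tell, this is the same logic. If PHP is 1, a projection row has a sum of 2, so it comes first. If the values are (1,0) and (0,1), an arbitrary row is chosen, because the question does not specify which one. In all cases, only one row is chosen.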

+0

I solved this using a combination of your code and GarethD's code. I changed the ORDER BY from Projection + PHP to CASE WHEN Projection = PHP THEN 0 ELSE 1 END, and it works. Thanks for the example.