2017-04-07

Adding DATENAME() to a query produces duplicate rows despite DISTINCT: DATENAME seems to make DISTINCT be ignored

TREE - TreeId, CityId, DatePlanted 
WATER - WaterId, TreeId(fk), DateWatered 

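TREE has a one-to-many relationship with WATER.

Each row in the TREE table represents the planting of one tree. Each row in the WATER table is a single instance of watering a tree. A tree is watered many times a year. You get the idea.

I need to return a report showing the number of trees planted, by month, along with the number of times they were watered.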

SELECT t.CityId 
     , COUNT(distinct t.TreeId) as 'Trees Planted' 
     , COUNT(w.TreeId) as 'Trees Watered'   
FROM TREE t 
JOIN WATER w ON t.TreeId = w.TreeId 
WHERE w.DateWatered between @Start AND @End 
GROUP BY t.CityId 

This works fine. However, when I try to group by month, t.TreeId is no longer distinct, so the tree count comes out too high.

SELECT t.CityId 
    , DATENAME(month, w.DateWatered) 
     , COUNT(distinct t.TreeId) as 'Trees Planted' 
     , COUNT(w.TreeId) as 'Trees Watered'   
FROM TREE t 
JOIN WATER w ON t.TreeId = w.TreeId 
WHERE w.DateWatered between @Start AND @End 
GROUP BY t.CityId, DATENAME(month, w.DateWatered) 

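Edit: I found out why I am getting duplicates, but not how to fix it. If a tree is watered in April 2016 and again in May 2016, I get a count of 2 trees planted and 2 trees watered, when it should be 1 tree planted and 2 waterings. If I run the first query, which does not return the date, I get the correct numbers. So once the date is added, even if I group by year and then by month, a tree with two waterings shows up as planted twice. I am currently looking into using CTEs, possibly to separate each part of the query.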
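
A minimal sketch that reproduces the effect described in the edit. The table variables `@Tree` and `@Water` and their sample rows are hypothetical stand-ins for TREE and WATER, not data from the question. It shows that `COUNT(DISTINCT ...)` is evaluated separately within each group, so a tree watered in two different months is counted once in each month's row.

```sql
-- Hypothetical sample data: one tree, watered once in April and once in May 2016.
DECLARE @Tree  TABLE (TreeId INT, CityId INT, DatePlanted DATE);
DECLARE @Water TABLE (WaterId INT, TreeId INT, DateWatered DATE);

INSERT INTO @Tree  (TreeId, CityId, DatePlanted) VALUES (1, 10, '20160401');
INSERT INTO @Water (WaterId, TreeId, DateWatered) VALUES (1, 1, '20160410'), (2, 1, '20160512');

-- DISTINCT applies inside each (CityId, month) group, not across the whole result,
-- so TreeId 1 is counted once in the April group and once in the May group.
SELECT t.CityId
     , DATENAME(month, w.DateWatered) AS MonthWatered
     , COUNT(DISTINCT t.TreeId)       AS TreesInGroup
     , COUNT(w.TreeId)                AS Waterings
FROM @Tree t
JOIN @Water w ON t.TreeId = w.TreeId
GROUP BY t.CityId, DATENAME(month, w.DateWatered);
```

The query returns two rows, one per month, each with a distinct tree count of 1. Adding those rows up gives 2 trees, even though only one tree exists.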

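Do you have more than 12 months of data? Months sometimes repeat. – HABO

`Group by t.CityId, Datepart(month, w.DateWatered), Datepart(year, w.DateWatered)` instead of `DATENAME(month, w.DateWatered)` – TriV

@habo - Yes, there are many years of data. Is that why it is duplicating, because of the months? How do I fix it? – BattlFrog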
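
The grouping suggested in the comments above might look roughly like the sketch below. It assumes the TREE and WATER tables from the question; the date values assigned to `@Start` and `@End` and the column aliases are illustrative only. Grouping on both year and month keeps April 2016 apart from April 2017, but the distinct count still means "trees watered in that month", not "trees planted".

```sql
-- Illustrative date range; the question does not say what values are used.
DECLARE @Start DATE = '20160101', @End DATE = '20171231';

SELECT t.CityId
     , DATEPART(year,  w.DateWatered) AS YearWatered
     , DATEPART(month, w.DateWatered) AS MonthWatered
     , COUNT(DISTINCT t.TreeId)       AS TreesWatered  -- distinct trees watered in that month
     , COUNT(w.TreeId)                AS Waterings     -- individual watering events
FROM TREE t
JOIN WATER w ON t.TreeId = w.TreeId
WHERE w.DateWatered BETWEEN @Start AND @End
GROUP BY t.CityId, DATEPART(year, w.DateWatered), DATEPART(month, w.DateWatered)
ORDER BY t.CityId, YearWatered, MonthWatered;
```

Because DATEPART returns integers, the result also sorts in calendar order, which month names from DATENAME would not.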

Answers

1
SELECT t.CityId 
     , ISNULL(DATENAME(month, w.DateWatered), DATENAME(month, t.DatePlanted)) 
     , (SELECT COUNT(tDistinct.TreeId) 
        FROM TREE tDistinct 
        WHERE tDistinct.TreeId = t.TreeId 
          AND DATENAME(month, tDistinct.DatePlanted) = DATENAME(month, w.DateWatered) 
          AND t.DatePlanted between @Start AND @End) as 'Trees Planted' 
     , COUNT(w.TreeId) as 'Trees Watered'   
    FROM TREE t 
    JOIN WATER w ON t.TreeId = w.TreeId 
    WHERE w.DateWatered between @Start AND @End 
    GROUP BY t.CityId, DATENAME(month, w.DateWatered), DATENAME(month, t.DatePlanted) 

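The only drawback is the scenario where a tree is planted in a month in which no tree is watered: the watering date will be NULL, so I added a check for that. I don't know what your data looks like, so feel free to drop the ISNULL check in favor of your original grouping.

Edit: Based on your requirements, I don't think a CTE is necessary. Given the additional information you provided, I have changed the query slightly to fit your needs: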

SELECT DATENAME(MONTH, myConsolidatedTree.DateAction) as myDate
     , (SELECT COUNT(*)
        FROM TREE AS t
        WHERE DATENAME(MONTH, myConsolidatedTree.DateAction) = DATENAME(MONTH, t.DatePlanted)
       ) as myNumberOfPlanted
     , (SELECT COUNT(*)
        FROM WATER AS w
        WHERE DATENAME(MONTH, myConsolidatedTree.DateAction) = DATENAME(MONTH, w.DateWatered)
       ) as myNumberOfWatered
FROM (
      SELECT t.DatePlanted as DateAction
           , t.TreeId as IdAction
           , 'PLANTED' as TreeAction
      FROM TREE t

      UNION

      SELECT w.DateWatered as DateAction
           , w.TreeId as IdAction
           , 'WATERED' as TreeAction
      FROM WATER w
     ) as myConsolidatedTree
WHERE myConsolidatedTree.DateAction between @StartDate and @EndDate
GROUP BY DATENAME(MONTH, myConsolidatedTree.DateAction), DATEPART(MONTH, myConsolidatedTree.DateAction)
ORDER BY DATEPART(MONTH, myConsolidatedTree.DateAction)

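Although the consolidated subquery contains more information than this problem needs, I left in the extra TreeId and the derived TreeAction column in case you need them in the future.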

1

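This demonstrates how to break the problem into steps using common table expressions (CTEs). Note that you can replace the final select with one of the commented-out selects to see the intermediate results. This is a handy way to test, debug, or understand what is going on.

One of the problems you face is that you are trying to summarize the data based only on watering dates. If a tree is planted in a month in which it is not watered, it will not be counted. The code below summarizes the plantings and the waterings within the date range separately, then combines them into a single result set.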

-- Sample data. 
declare @Trees as Table (TreeId Int Identity, CityId Int, DatePlanted Date); 
declare @Waterings as Table (WateringId Int Identity, TreeId Int, DateWatered Date); 
insert into @Trees (CityId, DatePlanted) values 
    (1, '20160115'), (1, '20160118'), 
    (1, '20160308'), (1, '20160318'), (1, '20160118'), 
    (1, '20170105'), 
    (1, '20170205'), 
    (1, '20170401'), 
    (2, '20160113'), (2, '20160130'), 
    (2, '20170226'), (2, '20170227'), (2, '20170228'); 
insert into @Waterings (TreeId, DateWatered) values 
    (1, '20160122'), (1, '20160129'), (1, '20160210'), (1, '20160601'), 
    (5, '20160120'), (5, '20160127'), (5, '20160215'), (5, '20160301'), (5, '20160515'); 
select * from @Trees; 
select * from @Waterings; 

-- Combine the data. 
declare @StartDate as Date = '20100101', @EndDate as Date = '20200101'; 
with 
    -- Each tree with the year and month it was planted. 
    TreesPlanted as (
    select CityId, TreeId, 
     DatePart(year, DatePlanted) as YearPlanted, 
     DatePart(month, DatePlanted) as MonthPlanted 
     from @Trees 
     where @StartDate <= DatePlanted and DatePlanted <= @EndDate), 
    -- Tree plantings summarized by city, year and month. 
    TreesPlantedSummary as (
    select CityId, YearPlanted, MonthPlanted, Count(TreeId) as Trees 
     from TreesPlanted 
     group by CityId, YearPlanted, MonthPlanted), 
    -- Each watering and the year and month it occurred. 
    TreesWatered as (
    select CityId, W.TreeId, 
     DatePart(year, W.DateWatered) as YearWatered, 
     DatePart(month, W.DateWatered) as MonthWatered 
     from @Trees as T left outer join 
     @Waterings as W on W.TreeId = T.TreeId 
     where @StartDate <= W.DateWatered and W.DateWatered <= @EndDate), 
    -- Waterings summarized by city, year and month. 
    TreesWateredSummary as (
    select CityId, YearWatered, MonthWatered, 
     Count(distinct TreeId) as Trees, Count(TreeId) as Waterings 
     from TreesWatered 
     group by CityId, YearWatered, MonthWatered) 
    -- Combine the plantings and waterings for the specified period. 
    select Coalesce(TPS.CityId, TWS.CityId) as CityId, 
    Coalesce(TPS.YearPlanted, TWS.YearWatered) as Year, 
    Coalesce(TPS.MonthPlanted, TWS.MonthWatered) as Month, 
    Coalesce(TPS.Trees, 0) as TreesPlanted, 
    Coalesce(TWS.Trees, 0) as TreesWatered, 
    Coalesce(TWS.Waterings, 0) as Waterings 
    from TreesPlantedSummary as TPS full outer join 
     TreesWateredSummary as TWS on TWS.CityId = TPS.CityId and 
     TWS.YearWatered = TPS.YearPlanted and TWS.MonthWatered = TPS.MonthPlanted 
    order by CityId, Year, Month; 
-- Alternative queries for testing/debugging/understanding: 
-- select * from TreesPlantedSummary order by CityId, YearPlanted, MonthPlanted; 
-- select * from TreesWateredSummary order by CityId, YearWatered, MonthWatered; 

Now you'll want the missing months (the ones with no activity) included in the result, eh?
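
One possible way to include months with no activity is to generate a calendar of months and left join the summaries onto it. The sketch below is not part of the answer above. The CTE names `Months`, `Cities`, `Planted` and `Watered`, the sample rows, and the date range are all assumptions made for illustration, and `DateFromParts` requires SQL Server 2012 or later.

```sql
-- Hypothetical sample data; table and column names mirror the answer above.
declare @Trees as Table (TreeId Int Identity, CityId Int, DatePlanted Date);
declare @Waterings as Table (WateringId Int Identity, TreeId Int, DateWatered Date);
insert into @Trees (CityId, DatePlanted) values (1, '20160115'), (1, '20160308');
insert into @Waterings (TreeId, DateWatered) values (1, '20160122'), (1, '20160601');

declare @StartDate as Date = '20160101', @EndDate as Date = '20160630';
with
    -- One row per month in the range, keyed by the first day of the month.
    Months as (
    select DateFromParts(Year(@StartDate), Month(@StartDate), 1) as MonthStart
    union all
    select DateAdd(month, 1, MonthStart)
     from Months
     where DateAdd(month, 1, MonthStart) <= @EndDate),
    -- Every city that has at least one tree.
    Cities as (
    select distinct CityId from @Trees),
    -- Plantings per city and month.
    Planted as (
    select CityId,
     DateFromParts(Year(DatePlanted), Month(DatePlanted), 1) as MonthStart,
     Count(*) as Trees
     from @Trees
     where @StartDate <= DatePlanted and DatePlanted <= @EndDate
     group by CityId, DateFromParts(Year(DatePlanted), Month(DatePlanted), 1)),
    -- Waterings per city and month.
    Watered as (
    select T.CityId,
     DateFromParts(Year(W.DateWatered), Month(W.DateWatered), 1) as MonthStart,
     Count(distinct W.TreeId) as Trees, Count(*) as Waterings
     from @Waterings as W inner join
     @Trees as T on T.TreeId = W.TreeId
     where @StartDate <= W.DateWatered and W.DateWatered <= @EndDate
     group by T.CityId, DateFromParts(Year(W.DateWatered), Month(W.DateWatered), 1))
    -- Every city/month combination, with zeros where nothing happened.
    select C.CityId,
    Year(M.MonthStart) as ReportYear,
    Month(M.MonthStart) as ReportMonth,
    Coalesce(P.Trees, 0) as TreesPlanted,
    Coalesce(Wt.Trees, 0) as TreesWatered,
    Coalesce(Wt.Waterings, 0) as Waterings
    from Cities as C cross join
     Months as M left outer join
     Planted as P on P.CityId = C.CityId and P.MonthStart = M.MonthStart left outer join
     Watered as Wt on Wt.CityId = C.CityId and Wt.MonthStart = M.MonthStart
    order by C.CityId, M.MonthStart
    option (maxrecursion 0);
```

The `maxrecursion 0` hint lifts the default limit of 100 recursion levels, which only matters for ranges longer than about eight years. A permanent calendar or numbers table would likely serve the same purpose without recursion.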