2016-03-07 56 views
1

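MySQL inner join count across two tables

I have a products table and a changelog table. The products table has various categories (Cat1, Cat2, Cat3) and price levels (Level1, Level2, Level3), and I want to count the products per level, grouped by category. So I have: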

SELECT products.category, 
COUNT(CASE WHEN products.price_level='1' THEN products.category END) as 'Level1', 
COUNT(CASE WHEN products.price_level='2' THEN products.category END) as 'Level2', 
COUNT(CASE WHEN products.price_level='3' THEN products.category END) as 'Level3' 
FROM products 
GROUP BY products.category 
ORDER BY COUNT(products.category) DESC 

The result is:

Category Level1 Level2 Level3 
Cat1  33  14  6 
Cat2  19  29  10 
Cat3  5  17  15 

So far so good; this works fine.

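Now I want to bring in another table (changelog), which has a productId field that links to products.id. It also has a status field with the values Active and Inactive. So I want to add a column to the result that shows the active products, like this: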

Category Level1 Level2 Level3 Active 
Cat1  33  14  6 
Cat2  19  29  10 
Cat3  5  17  15 

So I tried this, which does not work:

SELECT products.category, 
COUNT(CASE WHEN products.price_level='1' THEN products.category END) as 'Level1', 
COUNT(CASE WHEN products.price_level='2' THEN products.category END) as 'Level2', 
COUNT(CASE WHEN products.price_level='3' THEN products.category END) as 'Level3', 
COUNT(CASE WHEN changelog.status='Active' THEN changelog.status END) as 'Active' 

FROM products 

LEFT JOIN changelog on products.id=changelog.productId 

GROUP BY products.category 
ORDER BY COUNT(products.category) DESC 

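The counts go out of control: it looks like the category counts are accumulated once for every matching entry in the changelog table. What is wrong with this query?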

+0

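A product relates to many changelog rows, and vice versa, so the Cartesian product between the tables artificially inflates the counts. You need to compute the counts before joining. – xQbert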
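To see the inflation concretely, here is a minimal sketch that uses Python's built-in sqlite3 module as a stand-in for MySQL (an assumption made only so the example runs anywhere). The table and column names follow the question; the sample rows are invented for illustration.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE products (id INTEGER PRIMARY KEY, category TEXT, price_level TEXT);
CREATE TABLE changelog (id INTEGER PRIMARY KEY, productId INTEGER, status TEXT);
-- Hypothetical data: two products in Cat1, three changelog rows for product 1.
INSERT INTO products VALUES (1, 'Cat1', '1'), (2, 'Cat1', '2');
INSERT INTO changelog (productId, status) VALUES
    (1, 'Active'), (1, 'Inactive'), (1, 'Active');
""")

# Level1 count on the products table alone.
plain = conn.execute("""
    SELECT COUNT(CASE WHEN price_level = '1' THEN category END)
    FROM products
    GROUP BY category
""").fetchone()[0]

# Same count after the LEFT JOIN: product 1 now appears once per changelog row.
joined = conn.execute("""
    SELECT COUNT(CASE WHEN products.price_level = '1' THEN products.category END)
    FROM products
    LEFT JOIN changelog ON products.id = changelog.productId
    GROUP BY products.category
""").fetchone()[0]

print(plain, joined)
```

Only one product is at level 1, but the joined query counts it once for each of its changelog rows.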

Answers

0

You have to compute the counts before the join, because the join matches more than one changelog row per product.

SELECT P.category, P.Level1, P.Level2, P.Level3, 
COALESCE(A.Active, 0) as 'Active' 
FROM (SELECT category, 
     COUNT(CASE WHEN price_level='1' THEN category END) as 'Level1', 
     COUNT(CASE WHEN price_level='2' THEN category END) as 'Level2', 
     COUNT(CASE WHEN price_level='3' THEN category END) as 'Level3', 
     COUNT(*) as 'Total' 
     FROM products 
     GROUP BY category) P 
LEFT JOIN (SELECT p2.category, COUNT(*) as 'Active' 
     FROM changelog 
     INNER JOIN products p2 on p2.id=changelog.productId 
     WHERE changelog.status='Active' 
     GROUP BY p2.category) A 
    on A.category=P.category 
ORDER BY P.Total DESC 

1

You can use a correlated subquery for this:

SELECT t.category, 
     COUNT(CASE WHEN t.price_level='1' THEN t.category END) as 'Level1', 
     COUNT(CASE WHEN t.price_level='2' THEN t.category END) as 'Level2', 
     COUNT(CASE WHEN t.price_level='3' THEN t.category END) as 'Level3', 
     (SELECT COUNT(CASE 
         WHEN c.status='Active' THEN c.status 
        END) 
     FROM changelog AS c 
     INNER JOIN products AS p ON p.id=c.productId 
     WHERE p.category = t.category) AS 'Active' 
FROM products AS t  
GROUP BY t.category 
ORDER BY COUNT(t.category) DESC 

The subquery returns the number of 'Active' changelog records associated with the current product category.

+0

That query hangs and pushes mysqld to 100% CPU. Maybe because the changelog table has 500k records? – lilbiscuit

+0

@lilbiscuit Are your tables properly indexed? –

+0

Tables properly indexed? Probably not! – lilbiscuit
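As a rough illustration of what "properly indexed" could mean here, the sketch below creates a composite index on changelog (productId, status) and asks the planner whether it is used. It runs on SQLite through Python's sqlite3 module, which is an assumption; the index name idx_changelog_product_status is hypothetical, and MySQL's optimizer and its EXPLAIN output differ, so treat this as a sketch rather than a tuning recipe.

```python
import sqlite3

# SQLite stand-in for MySQL (assumption); the index name is hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE changelog (id INTEGER PRIMARY KEY, productId INTEGER, status TEXT)"
)
conn.execute(
    "CREATE INDEX idx_changelog_product_status ON changelog (productId, status)"
)

# Ask the planner how it would look up the Active rows of one product.
plan = conn.execute(
    """
    EXPLAIN QUERY PLAN
    SELECT COUNT(*) FROM changelog WHERE productId = ? AND status = 'Active'
    """,
    (1,),
).fetchall()

# The last column of each plan row holds the human-readable detail.
plan_text = " ".join(row[3] for row in plan)
uses_index = "idx_changelog_product_status" in plan_text
print(plan_text)
```

The same CREATE INDEX statement is also valid MySQL syntax; running MySQL's own EXPLAIN on the slow query would show whether it actually helps there.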

0

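Because the changelog table can have multiple records per product, the join multiplies the counts you already have.

One way to solve this is to count the active records from the changelog table in a subquery, which you can then join to the rest of the query: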

SELECT p.category, 
      SUM(p.price_level='1') as 'Level1', 
      SUM(p.price_level='2') as 'Level2', 
      SUM(p.price_level='3') as 'Level3', 
      COALESCE(SUM(c.cnt), 0) as 'Active' 
FROM  products AS p 
LEFT JOIN (
      SELECT productId, 
        COUNT(*) as cnt 
      FROM  changelog 
      WHERE status = 'Active' 
      GROUP BY productId 
     ) AS c 
     ON c.productId = p.id 
GROUP BY p.category 
ORDER BY COUNT(p.id) DESC 

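I also made a few other changes:

  • SUM(...) instead of COUNT(CASE WHEN ... END): this relies on the fact that in MySQL a boolean expression evaluates to 0 or 1; I think it is clearer, and shorter too.
  • ORDER BY COUNT(id) instead of ORDER BY COUNT(category): applying an aggregate to the field you group by is odd. It works in MySQL, but as far as I know it would not be allowed in standard SQL. It is also unnecessary; I find counting id occurrences more readable, even though it gives the same result.
  • I did not use a CASE WHEN clause to filter the active changelog records, because filtering them with a WHERE clause is more efficient.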
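The pre-aggregation idea can be checked with a small runnable sketch. It uses Python's sqlite3 module in place of MySQL (an assumption) and invented sample rows. Note that this sketch sums the per-product counts with SUM(c.cnt), so that every product in a category contributes to the Active total.

```python
import sqlite3

# SQLite stand-in for MySQL (assumption); sample rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE products (id INTEGER PRIMARY KEY, category TEXT, price_level TEXT);
CREATE TABLE changelog (id INTEGER PRIMARY KEY, productId INTEGER, status TEXT);
INSERT INTO products VALUES (1, 'Cat1', '1'), (2, 'Cat1', '2'), (3, 'Cat2', '1');
INSERT INTO changelog (productId, status) VALUES
    (1, 'Active'), (1, 'Active'), (1, 'Inactive'), (2, 'Active');
""")

rows = conn.execute("""
    SELECT p.category,
           SUM(p.price_level = '1') AS Level1,
           SUM(p.price_level = '2') AS Level2,
           SUM(p.price_level = '3') AS Level3,
           COALESCE(SUM(c.cnt), 0)  AS Active
    FROM products AS p
    LEFT JOIN (
        -- At most one row per product, so the join cannot multiply products.
        SELECT productId, COUNT(*) AS cnt
        FROM changelog
        WHERE status = 'Active'
        GROUP BY productId
    ) AS c ON c.productId = p.id
    GROUP BY p.category
    ORDER BY COUNT(p.id) DESC
""").fetchall()

for row in rows:
    print(row)
```

Because the subquery yields at most one row per product, the level counts stay at one per product, and Cat2, which has no changelog rows, reports 0 active entries.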