2014-03-05 83 views
1

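Group by 2 columns

I have a query that shows how many messages were sent through my system over the last year, grouped by month. It works perfectly.

The result looks like this: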

+------+-------+--------+--------+--------+
| Year | Month | Type 1 | Type 2 | Type 3 |
+------+-------+--------+--------+--------+
| 2013 |    10 |      0 |      2 |      3 |
| 2013 |    11 |      4 |     21 |     56 |
| 2013 |    12 |      1 |     10 |     16 |
| 2014 |     1 |      2 |     10 |     52 |
| 2014 |     2 |      1 |     62 |    118 |
+------+-------+--------+--------+--------+

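(Types 1, 2 and 3 are simply different types of users; ignore this.)

However, I want to prevent the same receiver (msg_receiver) from being counted more than once per month in the result set.

So if users 44 and 39 both send a message to user 70 in December, user_id 70 should be counted only once for December. Currently he is counted twice.

Here is my query: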

SELECT 
    Year(m.msg_date) as year, 
    Month(m.msg_date) as month, 
    sum(u.type = '1') as type_1, 
    Sum(u.type = '2') as type_2, 
    sum(u.type = '7') as type_3 
FROM 
    messages m 
INNER JOIN 
    users u ON u.user_id = m.msg_sender 
WHERE 
    m.msg_date >= CURDATE() - INTERVAL 1 YEAR 
    AND month(msg_date) != month(curdate()) 
GROUP BY 
    Month(m.msg_date) -- , m.msg_receiver (this does not work, it will no longer group by each month/year). 
ORDER BY 
    msg_date 

The logical answer, in my opinion, is to group first by month and then by user_id (or vice versa). But if I do that, the result looks strange. See:

Using GROUP BY Month(m.msg_date), u.user_id

+------+-------+--------+--------+--------+
| Year | Month | Type 1 | Type 2 | Type 3 |
+------+-------+--------+--------+--------+
| 2013 |    10 |      0 |      1 |      0 |
| 2013 |    10 |      0 |      0 |      1 |
| 2013 |    10 |      0 |      0 |      1 |
| 2013 |    10 |      0 |      1 |      0 |
| 2013 |    10 |      0 |      0 |      1 |
| 2013 |    11 |      0 |      0 |     19 |
| 2013 |    11 |      0 |      1 |      0 |
| 2013 |    11 |      0 |      1 |      0 |
| 2013 |    11 |      0 |      1 |      0 |
| 2013 |    11 |      0 |      1 |      0 |
| 2013 |    11 |      2 |      0 |      0 |
| 2013 |    11 |      0 |      0 |     11 |
+------+-------+--------+--------+--------+

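It no longer groups by month the way it should.

Any ideas?

EDIT

Just to clarify what I am trying to achieve, since people have been a bit confused. Imagine this scenario: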

It is December 2013.

USER 1 has written 5 messages to USER 2 (this should count as 1 in December)
USER 4 has written 1 message to USER 4 (this should count as 1 in December)
USER 3 has written 2 messages to USER 4 and 2 (this should count as 2 in December).

The total for the month would then be 4, because there have been 4 conversations.

Does that make sense? I often find myself struggling to express myself properly and to make myself understood.

+0

Consider providing proper DDL (and/or an sqlfiddle) together with the desired result set – Strawberry

+1

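Before answering, one question: if a receiver is sent several messages and each comes from a sender of a different type, under which type do you want the receiver counted? – user158017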

+0

I see, sorry for the confusion. I have edited my post to explain – FooBar

Answers

3

You can use COUNT(DISTINCT ...) to count each msg_receiver only once per type:

SELECT 
    Year(m.msg_date) as year, 
    Month(m.msg_date) as month, 
    COUNT(DISTINCT CASE WHEN u.type = '1' THEN m.msg_receiver END) as type_1, 
    COUNT(DISTINCT CASE WHEN u.type = '2' THEN m.msg_receiver END) as type_2, 
    COUNT(DISTINCT CASE WHEN u.type = '7' THEN m.msg_receiver END) as type_3 -- '7' matches the type_3 filter in the question's query
FROM 
    messages m 
INNER JOIN 
    users u ON u.user_id = m.msg_sender 
WHERE 
    m.msg_date >= CURDATE() - INTERVAL 1 YEAR 
    AND month(msg_date) != month(curdate()) 
GROUP BY 
    Year(m.msg_date), Month(m.msg_date) 
ORDER BY 
    Year(m.msg_date), Month(m.msg_date) -- order by the grouped expressions, not the raw column

Note: I have added Year(m.msg_date) to your GROUP BY to ensure the results are deterministic.
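To see the effect on a tiny data set, here is a minimal sketch that runs the same idea against an in-memory SQLite database from Python. Everything about the setup is an assumption made for illustration: the table layout is guessed from the column names in the query, the three sample messages are made up, and SQLite's strftime stands in for MySQL's Year() and Month().

```python
import sqlite3

# Hypothetical schema, inferred from the column names used in the query.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (user_id INTEGER PRIMARY KEY, type TEXT);
CREATE TABLE messages (msg_sender INTEGER, msg_receiver INTEGER, msg_date TEXT);
INSERT INTO users VALUES (39, '1'), (44, '2'), (70, '1');
-- Made-up sample: users 44 and 39 both write to user 70 in December.
INSERT INTO messages VALUES
    (44, 70, '2013-12-03'),
    (39, 70, '2013-12-10'),
    (39, 70, '2013-12-11');
""")

rows = conn.execute("""
SELECT strftime('%Y', m.msg_date) AS year,
       strftime('%m', m.msg_date) AS month,
       -- plain SUM counts every message
       SUM(u.type = '1') AS sent_type_1,
       SUM(u.type = '2') AS sent_type_2,
       -- COUNT(DISTINCT ...) counts each receiver once per sender type
       COUNT(DISTINCT CASE WHEN u.type = '1' THEN m.msg_receiver END) AS recv_type_1,
       COUNT(DISTINCT CASE WHEN u.type = '2' THEN m.msg_receiver END) AS recv_type_2
FROM messages m
INNER JOIN users u ON u.user_id = m.msg_sender
GROUP BY strftime('%Y', m.msg_date), strftime('%m', m.msg_date)
ORDER BY 1, 2
""").fetchall()

print(rows)  # [('2013', '12', 2, 1, 1, 1)]
```

The two messages from user 39 collapse into a single count for type 1, while receiver 70 still appears once under each sender type.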

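If the same user receives messages from two different users of two different types, they will be counted under both types. If that is not the desired result, you will need to come up with some logic to determine which type they should be counted under (minimum, maximum, mode, median, etc.).

For example, if you want the minimum user type, you could use: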

SELECT 
    m.year, 
    m.month, 
    sum(m.type = '1') as type_1, 
    Sum(m.type = '2') as type_2, 
    sum(m.type = '7') as type_3 
FROM ( 
     SELECT 
      Year(m.msg_date) as year, 
      Month(m.msg_date) as month, 
      m.msg_receiver, 
      MIN(u.type) AS type 
     FROM 
      messages m 
     INNER JOIN 
      users u ON u.user_id = m.msg_sender 
     WHERE 
      m.msg_date >= CURDATE() - INTERVAL 1 YEAR 
      AND month(msg_date) != month(curdate()) 
     GROUP BY 
      Year(m.msg_date), Month(m.msg_date), m.msg_receiver 
    ) m 
GROUP BY 
    m.Year, m.Month 
ORDER BY 
    m.year, m.month; 
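As a rough check of the minimum-type approach, the sketch below runs the same two-level grouping against an in-memory SQLite database from Python. The schema, the sample rows and the use of strftime in place of Year() and Month() are all assumptions for illustration, and only two types are shown to keep it short.

```python
import sqlite3

# Hypothetical schema and data, for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (user_id INTEGER PRIMARY KEY, type TEXT);
CREATE TABLE messages (msg_sender INTEGER, msg_receiver INTEGER, msg_date TEXT);
INSERT INTO users VALUES (39, '1'), (44, '2'), (70, '1');
INSERT INTO messages VALUES
    (44, 70, '2013-12-03'),  -- type-2 sender writes to user 70
    (39, 70, '2013-12-10'),  -- type-1 sender writes to user 70
    (39, 70, '2013-12-11'),
    (44, 39, '2013-12-05');  -- type-2 sender writes to user 39
""")

rows = conn.execute("""
SELECT r.year, r.month,
       SUM(r.type = '1') AS type_1,
       SUM(r.type = '2') AS type_2
FROM (
    -- one row per (year, month, receiver), tagged with the smallest sender type
    SELECT strftime('%Y', m.msg_date) AS year,
           strftime('%m', m.msg_date) AS month,
           m.msg_receiver,
           MIN(u.type) AS type
    FROM messages m
    INNER JOIN users u ON u.user_id = m.msg_sender
    GROUP BY strftime('%Y', m.msg_date), strftime('%m', m.msg_date), m.msg_receiver
) r
GROUP BY r.year, r.month
ORDER BY r.year, r.month
""").fetchall()

print(rows)  # [('2013', '12', 1, 1)]
```

Receiver 70 was contacted by senders of both types but is now counted only under type 1, and receiver 39 is counted under type 2, so each receiver contributes exactly once per month.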

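EDIT

In response to the updated question: in its current form, my first answer would count your example as only 3 conversations rather than 4, because there are only 3 unique recipients. What you really need is to count distinct sender/receiver pairs, i.e. count(distinct m.msg_sender, m.msg_receiver). Unfortunately that is not valid in standard SQL (MySQL does accept a list of expressions inside COUNT(DISTINCT ...) as an extension), but you can achieve essentially the same thing by concatenating the two fields, as long as they are separated by a character that cannot appear in either field. For example: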

SELECT 
    Year(m.msg_date) as year, 
    Month(m.msg_date) as month, 
    COUNT(DISTINCT CASE WHEN u.type = '1' THEN CONCAT(m.msg_sender, '|', m.msg_receiver) END) as type_1, 
    COUNT(DISTINCT CASE WHEN u.type = '2' THEN CONCAT(m.msg_sender, '|', m.msg_receiver) END) as type_2, 
    COUNT(DISTINCT CASE WHEN u.type = '7' THEN CONCAT(m.msg_sender, '|', m.msg_receiver) END) as type_3 -- '7' matches the type_3 filter in the question's query
FROM 
    messages m 
INNER JOIN 
    users u ON u.user_id = m.msg_sender 
WHERE 
    m.msg_date >= CURDATE() - INTERVAL 1 YEAR 
    AND month(msg_date) != month(curdate()) 
GROUP BY 
    Year(m.msg_date), Month(m.msg_date) 
ORDER BY 
    Year(m.msg_date), Month(m.msg_date) -- order by the grouped expressions, not the raw column
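
The difference between counting receivers and counting sender/receiver pairs can be checked with a small sketch that loads the scenario from the question into an in-memory SQLite database from Python. Several things here are assumptions: the schema, the user types (users 1 and 2 as type '1', users 3 and 4 as type '2'), the reading of "2 messages to USER 4 and 2" as one message to each, and the use of SQLite's || operator and strftime in place of CONCAT, Year() and Month().

```python
import sqlite3

# Hypothetical schema; user types are assumed for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (user_id INTEGER PRIMARY KEY, type TEXT);
CREATE TABLE messages (msg_sender INTEGER, msg_receiver INTEGER, msg_date TEXT);
INSERT INTO users VALUES (1, '1'), (2, '1'), (3, '2'), (4, '2');
""")

# User 1 writes 5 messages to user 2; user 4 writes to user 4;
# user 3 writes once to user 4 and once to user 2.
messages = [(1, 2, f"2013-12-0{day}") for day in range(1, 6)]
messages += [(4, 4, "2013-12-06"), (3, 4, "2013-12-07"), (3, 2, "2013-12-08")]
conn.executemany("INSERT INTO messages VALUES (?, ?, ?)", messages)

row = conn.execute("""
SELECT strftime('%Y', m.msg_date) AS year,
       strftime('%m', m.msg_date) AS month,
       -- distinct receivers per sender type
       COUNT(DISTINCT CASE WHEN u.type = '1' THEN m.msg_receiver END) AS recv_1,
       COUNT(DISTINCT CASE WHEN u.type = '2' THEN m.msg_receiver END) AS recv_2,
       -- distinct sender|receiver pairs per sender type
       COUNT(DISTINCT CASE WHEN u.type = '1'
                           THEN m.msg_sender || '|' || m.msg_receiver END) AS pair_1,
       COUNT(DISTINCT CASE WHEN u.type = '2'
                           THEN m.msg_sender || '|' || m.msg_receiver END) AS pair_2
FROM messages m
INNER JOIN users u ON u.user_id = m.msg_sender
GROUP BY strftime('%Y', m.msg_date), strftime('%m', m.msg_date)
""").fetchone()

year, month, recv_1, recv_2, pair_1, pair_2 = row
print(recv_1 + recv_2, pair_1 + pair_2)  # 3 4
```

Under these assumed types, counting receivers gives 3 for December while counting pairs gives the 4 conversations the question asks for.
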
+0

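This looks like exactly what I am trying to achieve. I will play with it for a while and see whether it works. Great answer. Also, I have updated my post to explain exactly what I want to count. – FooBar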

+0

You are my hero, thank you. – FooBar

0

You haven't posted the data structure, but it looks like you want to change the INNER JOIN to:

INNER JOIN 
    users u ON u.user_id = m.msg_receiver