2011-06-03 63 views
4

How to GROUP BY with SQL subqueries

I can't think clearly at the moment. I want to return counts by station_id. An example of the output would be:

Station 1 has 3 fb posts, 6 linkedin posts, 5 email posts
Station 2 has 3 fb posts, 6 linkedin posts, 5 email posts

So I need to group by the station id. My table structure is:

CREATE TABLE IF NOT EXISTS `posts` (
    `post_id` bigint(11) NOT NULL auto_increment, 
    `station_id` varchar(25) NOT NULL, 
    `user_id` varchar(25) NOT NULL, 
    `dated` datetime NOT NULL, 
    `type` enum('fb','linkedin','email') NOT NULL, 
    PRIMARY KEY (`post_id`) 
) ENGINE=MyISAM DEFAULT CHARSET=latin1 AUTO_INCREMENT=x ; 

The query I have so far is:

SELECT
    Station_id,
    (select count(*) FROM posts WHERE type = 'linkedin') AS linkedin_count,
    (select count(*) FROM posts WHERE type = 'fb') AS fb_count,
    (select count(*) FROM posts WHERE type = 'email') AS email_count
FROM `posts`
GROUP BY station_id;

Answers

14

Or, the fastest way, which avoids joins and subqueries and gives you exactly the format you want:

SELECT 
    station_id, 
    SUM(CASE WHEN type = 'linkedin' THEN 1 ELSE 0 END) AS 'linkedin', 
    SUM(CASE WHEN type = 'fb'  THEN 1 ELSE 0 END) AS 'fb', 
    SUM(CASE WHEN type = 'email' THEN 1 ELSE 0 END) AS 'email' 
FROM posts 
GROUP BY station_id; 

Output:

+------------+----------+------+-------+
| station_id | linkedin | fb   | email |
+------------+----------+------+-------+
| 1          |        3 |    2 |     5 |
| 2          |        2 |    0 |     1 |
+------------+----------+------+-------+
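To try the conditional-aggregation query without a MySQL server, here is a minimal sketch using Python's built-in sqlite3 module. The assumptions are mine, not the answer's: SQLite stands in for MySQL, the schema is cut down to the columns the query touches (SQLite has no ENUM, so `type` is plain TEXT), and the sample rows are made up to reproduce the output table above.

```python
import sqlite3

# Hypothetical sample data: (station_id, type, number of posts),
# chosen to match the output table in the answer.
SAMPLE = [("1", "linkedin", 3), ("1", "fb", 2), ("1", "email", 5),
          ("2", "linkedin", 2), ("2", "email", 1)]

conn = sqlite3.connect(":memory:")
# Simplified schema: only the columns the query needs.
conn.execute(
    "CREATE TABLE posts ("
    " post_id INTEGER PRIMARY KEY AUTOINCREMENT,"
    " station_id TEXT NOT NULL,"
    " type TEXT NOT NULL)"
)
for station_id, post_type, n in SAMPLE:
    conn.executemany(
        "INSERT INTO posts (station_id, type) VALUES (?, ?)",
        [(station_id, post_type)] * n,
    )

# One pass over the table: each SUM counts only the rows of one type,
# and ELSE 0 makes a missing type show up as 0 instead of vanishing.
PIVOT_SQL = """
SELECT
    station_id,
    SUM(CASE WHEN type = 'linkedin' THEN 1 ELSE 0 END) AS linkedin,
    SUM(CASE WHEN type = 'fb'       THEN 1 ELSE 0 END) AS fb,
    SUM(CASE WHEN type = 'email'    THEN 1 ELSE 0 END) AS email
FROM posts
GROUP BY station_id
ORDER BY station_id
"""

pivot_rows = conn.execute(PIVOT_SQL).fetchall()
for row in pivot_rows:
    print(row)
```

With this sample data, station 2 still gets a row with an fb count of 0, because the CASE expression contributes 0 for every non-fb row rather than filtering rows out.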

You may also want to put an index on there to speed it up:

ALTER TABLE posts ADD INDEX (station_id, type); 

EXPLAIN output:

+----+-------------+-------+-------+---------------+------------+---------+------+------+-------------+
| id | select_type | table | type  | possible_keys | key        | key_len | ref  | rows | Extra       |
+----+-------------+-------+-------+---------------+------------+---------+------+------+-------------+
|  1 | SIMPLE      | posts | index | NULL          | station_id | 28      | NULL |   13 | Using index |
+----+-------------+-------+-------+---------------+------------+---------+------+------+-------------+
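If you want to experiment with the composite index locally, a rough sketch with Python's sqlite3 module follows. Everything SQLite-specific here is a stand-in and an assumption on my part: `CREATE INDEX` replaces the MySQL `ALTER TABLE ... ADD INDEX` statement, `EXPLAIN QUERY PLAN` replaces MySQL's `EXPLAIN`, the index name `idx_station_type` is made up, and the 13 sample rows are hypothetical. The plan text differs between engines and versions, so no expected output is shown.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Simplified schema (assumption): only the columns the query touches.
conn.execute(
    "CREATE TABLE posts ("
    " post_id INTEGER PRIMARY KEY AUTOINCREMENT,"
    " station_id TEXT NOT NULL,"
    " type TEXT NOT NULL)"
)
# Hypothetical sample data: (station_id, type, number of posts).
sample = [("1", "linkedin", 3), ("1", "fb", 2), ("1", "email", 5),
          ("2", "linkedin", 2), ("2", "email", 1)]
for station_id, post_type, n in sample:
    conn.executemany(
        "INSERT INTO posts (station_id, type) VALUES (?, ?)",
        [(station_id, post_type)] * n,
    )

# SQLite spelling of: ALTER TABLE posts ADD INDEX (station_id, type);
# The index holds every column the query reads, so an engine can answer
# the query from the index alone ("Using index" in MySQL's EXPLAIN).
conn.execute("CREATE INDEX idx_station_type ON posts (station_id, type)")

query = """
SELECT
    station_id,
    SUM(CASE WHEN type = 'linkedin' THEN 1 ELSE 0 END) AS linkedin,
    SUM(CASE WHEN type = 'fb'       THEN 1 ELSE 0 END) AS fb,
    SUM(CASE WHEN type = 'email'    THEN 1 ELSE 0 END) AS email
FROM posts
GROUP BY station_id
ORDER BY station_id
"""

# The last column of each EXPLAIN QUERY PLAN row is the readable detail.
plan = [row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + query)]
for line in plan:
    print(line)

# The index changes how rows are found, not what the query returns.
result = conn.execute(query).fetchall()
index_names = [r[0] for r in conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'index'")]
```

The point to check in any engine's plan is whether the `(station_id, type)` index is chosen and whether the table itself still has to be read.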

This version of the query will also show a zero count for fb for station_id 2, which your previous one did not show. It's also elegant :) – cairnz 2011-06-03 10:15:56

0

Try this:

SELECT p.Station_id, 
(select count(*) FROM posts WHERE type = 'linkedin' and station_id=p.station_id) AS linkedin_count, 
(select count(*) FROM posts WHERE type = 'fb' and station_id=p.station_id) AS fb_count, 
(select count(*) FROM posts WHERE type = 'email' and station_id=p.station_id) AS email_count 
FROM `posts` p GROUP BY station_id 
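To see why the `station_id=p.station_id` condition matters, here is a small sketch that runs the subquery form with and without it. It uses Python's sqlite3 module as a stand-in for MySQL; the helper `counts_sql`, the trimmed schema, and the sample rows are all hypothetical and only there for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Simplified schema (assumption): SQLite has no ENUM, so type is TEXT.
conn.execute(
    "CREATE TABLE posts ("
    " post_id INTEGER PRIMARY KEY AUTOINCREMENT,"
    " station_id TEXT NOT NULL,"
    " type TEXT NOT NULL)"
)
# Hypothetical sample data: (station_id, type, number of posts).
sample = [("1", "linkedin", 3), ("1", "fb", 2), ("1", "email", 5),
          ("2", "linkedin", 2), ("2", "email", 1)]
for station_id, post_type, n in sample:
    conn.executemany(
        "INSERT INTO posts (station_id, type) VALUES (?, ?)",
        [(station_id, post_type)] * n,
    )

def counts_sql(extra_filter):
    """Build the three-subquery SELECT; extra_filter is appended to each
    subquery's WHERE clause (hypothetical helper for this sketch)."""
    cols = ", ".join(
        f"(SELECT COUNT(*) FROM posts"
        f" WHERE type = '{t}'{extra_filter}) AS {t}_count"
        for t in ("linkedin", "fb", "email")
    )
    return (f"SELECT p.station_id, {cols} FROM posts p "
            "GROUP BY p.station_id ORDER BY p.station_id")

# Without the correlation, each subquery counts the whole table,
# so every station reports the same grand totals.
uncorrelated = conn.execute(counts_sql("")).fetchall()

# With it, each subquery is restricted to the outer row's station.
correlated = conn.execute(
    counts_sql(" AND station_id = p.station_id")).fetchall()

print("uncorrelated:", uncorrelated)
print("correlated:  ", correlated)
```

On this sample data the uncorrelated form gives both stations the totals 5, 2, 6, while the correlated form gives per-station counts, at the cost of running three extra subqueries per group.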

PERFECTO, cheers – 2011-06-03 10:17:27


This query is very slow, because it is actually running 4 queries – Geoffrey 2011-06-03 10:21:21


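With your suggested INDEX (station_id, type), this shouldn't be too slow. The execution plan will probably scan that index and, for each station_id, "know" that the answers to the correlated subqueries can be found in the part of the index being scanned. Without the index, though, this could be bad for large data sets. – MatBailie 2011-06-03 10:25:13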

1

Give this a go:

SELECT station_id, type, count(*) FROM posts GROUP BY station_id, type 

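The output format will be a little different from what you are attempting, but it should provide the statistics you are trying to retrieve. Also, because it is a single query, it is faster.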

- Edit, added example result set

+------------+----------+----------+
| station_id | type     | count(*) |
+------------+----------+----------+
| 1          | fb       |        2 |
| 1          | linkedin |        3 |
| 1          | email    |        5 |
| 2          | linkedin |        2 |
| 2          | email    |        1 |
+------------+----------+----------+
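Note that the result set above has no row for station 2 / fb, because GROUP BY only produces groups that actually occur. If you go this route and still need the zeros, one option is to fill them in on the client side. The sketch below does that with Python's sqlite3 module; the client-side pivot is my own addition rather than part of the answer, SQLite stands in for MySQL, and the sample rows and trimmed schema are hypothetical.

```python
import sqlite3

TYPES = ("linkedin", "fb", "email")  # the enum values from the question

conn = sqlite3.connect(":memory:")
# Simplified schema (assumption): only the columns the query touches.
conn.execute(
    "CREATE TABLE posts ("
    " post_id INTEGER PRIMARY KEY AUTOINCREMENT,"
    " station_id TEXT NOT NULL,"
    " type TEXT NOT NULL)"
)
# Hypothetical sample data: (station_id, type, number of posts).
sample = [("1", "linkedin", 3), ("1", "fb", 2), ("1", "email", 5),
          ("2", "linkedin", 2), ("2", "email", 1)]
for station_id, post_type, n in sample:
    conn.executemany(
        "INSERT INTO posts (station_id, type) VALUES (?, ?)",
        [(station_id, post_type)] * n,
    )

# One row per (station_id, type) pair that exists in the table.
long_rows = conn.execute(
    "SELECT station_id, type, COUNT(*) FROM posts "
    "GROUP BY station_id, type ORDER BY station_id, type"
).fetchall()

# Pivot in client code, starting every station at 0 for every type
# so that missing combinations show up as explicit zeros.
pivot = {}
for station_id, post_type, n in long_rows:
    pivot.setdefault(station_id, dict.fromkeys(TYPES, 0))[post_type] = n

for station_id, counts in pivot.items():
    print(station_id, counts)
```

A station with no posts at all would still be absent from `pivot`, since it never appears in the grouped rows; covering that case needs a separate list of stations.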
2

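As implied by gnif's answer, having three correlated sub_queries carries a performance overhead. Depending on the DBMS you're using, it could perform similarly to self-joining three times.

gnif's approach ensures that the table is only parsed once, without the need for joins, correlated sub_queries, etc.

The immediate downside of gnif's answer is that you never get records of 0. If there are no fb types, you just won't get a record. If that isn't a problem, I'd go with his answer. If it is a problem, however, here is a version that uses a similar approach to gnif's, but with your output format...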

SELECT 
    station_id, 
    SUM(CASE WHEN type = 'linkedin' THEN 1 ELSE 0 END) AS linkedin_count, 
    SUM(CASE WHEN type = 'fb'  THEN 1 ELSE 0 END) AS fb_count, 
    SUM(CASE WHEN type = 'email' THEN 1 ELSE 0 END) AS email_count 
FROM 
    posts 
GROUP BY 
    station_id 

Looks like I beat you to this solution :), I've posted a second answer. – Geoffrey 2011-06-03 10:19:17


Doh, you type fast :) I'll +1 your answer :) – MatBailie 2011-06-03 10:21:06


Thanks :), I +1'd yours too :) – Geoffrey 2011-06-03 10:22:06