2012-08-03 101 views
4

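MySQL query optimization with a join

I built report generation for a radio station. It keeps a log of online listeners, with the IP, date, time, total number of users listening, and so on.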

listeners table

client_ip  date  time  date_time   listeners 
--------------- ---------- -------- ------------------- ----------- 
166.147.81.179 2012-04-30 00:19:46 2012-04-30 00:19:46   1 
64.12.243.203 2012-04-30 04:38:37 2012-04-30 04:38:37   1 
198.228.211.195 2012-04-30 05:36:33 2012-04-30 05:36:33   1 
198.228.211.195 2012-04-30 05:36:34 2012-04-30 05:36:34   2 
198.228.211.195 2012-04-30 05:36:35 2012-04-30 05:36:35   2 
198.228.211.195 2012-04-30 05:36:35 2012-04-30 05:36:35   3 
166.147.81.179 2012-04-30 05:47:13 2012-04-30 05:47:13   2 
76.170.251.97 2012-04-30 06:01:37 2012-04-30 06:01:37   2 
76.170.251.97 2012-04-30 06:01:39 2012-04-30 06:01:39   2 
76.170.251.97 2012-04-30 06:01:39 2012-04-30 06:01:39   2 

It also keeps the song details (title, artist, album, length, date, time), etc.

playlists table

title      artist       length_in_secs played_date played_time start_date_time  end_date_time   
-------------------------- ------------------------------- -------------- ----------- ----------- ------------------- --------------------- 
We Found Love    Rihanna          184 2012-04-30 00:00:21  2012-04-30 00:00:21 2012-04-30 00:03:25 
Photograph     Nickelback         216 2012-04-30 00:03:31  2012-04-30 00:03:31 2012-04-30 00:07:07 
Not Over You    Gavin DeGraw        214 2012-04-30 00:07:18  2012-04-30 00:07:18 2012-04-30 00:10:52 
Stereo Hearts    Gym Class Heroes Ft Adam Levine    210 2012-04-30 00:10:55  2012-04-30 00:10:55 2012-04-30 00:14:25 
I Gotta Feeling    Black Eyed Peas       243 2012-04-30 00:15:03  2012-04-30 00:15:03 2012-04-30 00:19:06 
One Thing Leads To Another Fixx          182 2012-04-30 00:19:14  2012-04-30 00:19:14 2012-04-30 00:22:16 
Raise Your Glass   Pink          202 2012-04-30 00:22:29  2012-04-30 00:22:29 2012-04-30 00:25:51 
Better In Time    Leona Lewis         216 2012-04-30 00:30:13  2012-04-30 00:30:13 2012-04-30 00:33:49 
Tainted Love    Soft Cell         153 2012-04-30 00:33:56  2012-04-30 00:33:56 2012-04-30 00:36:29 
Haven't Met You Yet   Michael Buble'        242 2012-04-30 00:37:14  2012-04-30 00:37:14 2012-04-30 00:41:16 
These song records are logged at the same time as the listener records.

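So the report requirement is "how many users listened to a song on a given date or within a date range", and I wrote the query below. It gives the correct output (as far as I know), but the execution time is out of proportion to the data size: anywhere from 5 seconds to 5-10 minutes, depending on the date range.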

SELECT DATE_FORMAT(p.played_date, "%m/%d/%Y") `played_date`, p.played_time, p.length_in_secs, p.title, p.artist, RTRIM(p.label) `label`, RTRIM(p.album) `album`, IFNULL((SELECT SUM(l.listeners) FROM listeners `l` WHERE l.date_time >= p.start_date_time AND l.date_time <= p.end_date_time LIMIT 1), 0) `listeners` FROM playlists `p` WHERE p.title <> "" AND (p.played_date >= '2012-04-30' AND p.played_date <= '2012-05-30') HAVING listeners > 0 ORDER BY p.title ASC; 
// formatted // 
SELECT 
    DATE_FORMAT(p.played_date, "%m/%d/%Y") `played_date`, 
    p.played_time, 
    p.length_in_secs, 
    p.title, 
    p.artist, 
    RTRIM(p.label) `label`, 
    RTRIM(p.album) `album`, 
    IFNULL(
     (SELECT 
      SUM(l.listeners) 
     FROM 
      listeners `l` 
     WHERE l.date_time >= p.start_date_time 
      AND l.date_time <= p.end_date_time 
     LIMIT 1), 
     0 
    ) `listeners` 
FROM 
    playlists `p` 
WHERE p.title <> "" 
    AND (
     p.played_date >= '2012-04-30' 
     AND p.played_date <= '2012-05-30' 
    ) 
HAVING listeners > 0 
ORDER BY p.title ASC 

Output:

played_date played_time length_in_secs title     artist     label    album    listeners 
----------- ----------- -------------- --------------------- ------------------------ ------------------ ------------------ ----------- 
04/30/2012 08:06:26    228 Brighter Than The Sun Colbie Caillat (Cal-Lay) Universal Republic All of You     9 

04/30/2012 08:44:16    248 Breakfast At Tiffanys Deep Blue Something               6 

04/30/2012 18:06:40    253 Bizarre Love Triangle New Order                 2 

04/30/2012 17:05:21    183 Animal     Neon Trees    Mercury    Habits      5 

04/30/2012 08:58:05    253 Always Be My Baby  Mariah Carey                2 

04/30/2012 07:25:52    264 Already Gone   Kelly Clarkson   RCA     All I Ever Wante    3 

04/30/2012 16:21:33    236 All The Right Moves One Republic    Interscope   Waking Up      7 

04/30/2012 11:58:26    199 All That She Wants  Ace Of Base                12 

04/30/2012 11:14:17    247 All I Wanna Do   Sheryl Crow                 2 

04/30/2012 16:12:59    235 A Thousand Miles  Vanessa Carlton                5 

Is there a way to optimize this query so it runs faster, or to write a new, faster one? Please suggest/help. Thanks!

Using EXPLAIN

EXPLAIN playlists; 


Field   Type    Null Key  Default   Extra       
--------------- ---------------- ------ ------ ----------------- ----------------------------- 
playlist_id  int(10) unsigned NO  PRI  (NULL)    auto_increment    
title   varchar(255)  YES    (NULL)           
artist   varchar(255)  YES    (NULL)           
label   varchar(255)  YES    (NULL)           
album   varchar(255)  YES    (NULL)           
length_in_secs int(11)   NO    (NULL)           
played_date  date    NO    (NULL)           
played_time  time    NO    (NULL)           
start_date_time datetime   NO    (NULL)           
end_date_time datetime   NO    (NULL)           
added_date  datetime   NO    (NULL)           
modified_date timestamp   NO    CURRENT_TIMESTAMP on update CURRENT_TIMESTAMP 


EXPLAIN listeners; 


Field   Type     Null Key  Default   Extra       
------------- ------------------- ------ ------ ----------------- ----------------------------- 
listener_id bigint(20) unsigned NO  PRI  (NULL)    auto_increment    
station_id  int(10) unsigned  NO    (NULL)           
client_ip  varchar(50)   NO    (NULL)           
time   time     NO    (NULL)           
date   date     NO    (NULL)           
date_time  datetime    YES    (NULL)           
timestamp  bigint(20) unsigned NO    (NULL)           
listeners  int(10) unsigned  NO    (NULL)           
processes  int(10) unsigned  NO    (NULL)           
uid   int(10) unsigned  NO    (NULL)           
user_agent  varchar(255)   YES    (NULL)           
added_date  datetime    NO    (NULL)           
modified_date timestamp   NO    CURRENT_TIMESTAMP on update CURRENT_TIMESTAMP 
+0

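Can you run 'EXPLAIN' on one of the queries that takes longer to execute? The problem may be that the query you are running has no suitable index, and creating a good index could fix the timing issue. It would also be very helpful if you could show the indexes currently on the tables. Thanks – drew010 2012-08-03 21:15:30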

+3

How do you identify when a user stops listening? – invertedSpear 2012-08-03 21:32:20

+0

@invertedSpear, why is stop/start needed? I just need the query above optimized, that's all. – 2012-08-04 03:06:10

Answers

1

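As discussed in the comments, your query doesn't actually do what you want it to do. Given the data you have, I would personally process it outside of SQL to build a table that stores the number of people listening to each song, which you could then query with SQL to get this information. However, if you absolutely want a SQL query to do it, it will need to be along the lines of this monster: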

SELECT 
DATE_FORMAT(p.played_date, "%m/%d/%Y") `played_date`, 
p.played_time, 
p.length_in_secs, 
p.title, 
p.artist, 
RTRIM(p.label) `label`, 
RTRIM(p.album) `album`, 
SUM(SMALLEST(prev_listeners, next_listeners, dur_listeners)) AS listeners 
FROM (
    SELECT 
    p.start_date_time, 
    SUBSTRING_INDEX(GROUP_CONCAT(l_before.listeners ORDER BY l_before.date_time DESC),',',1) AS prev_listeners, 
    SUBSTRING_INDEX(GROUP_CONCAT(l_after.listeners ORDER BY l_after.date_time ASC),',',1) AS next_listeners, 
    MIN(l_during.listeners) AS dur_listeners 
    FROM playlists p 
    JOIN listeners l_before ON l_before.date_time < p.start_date_time 
    LEFT JOIN listeners l_after ON l_before.client_ip = l_after.client_ip AND l_after.date_time > p.end_date_time 
    LEFT JOIN listeners l_during ON l_before.client_ip = l_during.client_ip AND l_during.date_time BETWEEN p.start_date_time AND p.end_date_time 
    WHERE p.title <> "" 
    AND p.played_date BETWEEN '2012-04-30' AND '2012-05-30' 
    GROUP BY p.start_date_time, l_before.client_ip 
) l 
JOIN playlists p USING (start_date_time) 
GROUP BY p.start_date_time 
ORDER BY p.start_date_time 

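where SMALLEST is a function that returns the smallest non-NULL argument. It is not built into MySQL, so you would have to define it yourself; the built-in LEAST() returns NULL as soon as any argument is NULL.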
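To make the intended behaviour of SMALLEST concrete, here is a minimal sketch in Python, with `None` standing in for SQL NULL. The name `smallest` is hypothetical and this only illustrates the semantics; it is not a MySQL function definition.

```python
def smallest(*values):
    """Return the smallest argument that is not None (None if all are None)."""
    present = [v for v in values if v is not None]
    return min(present) if present else None


# One IP with 3 listeners before the song, none logged after it, 2 during it.
print(smallest(3, None, 2))
```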

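This will take much longer than your current query, but it is the most efficient way I can think of to get the actual number of listeners for each song.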

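Oh, and this assumes the log records a row with zero listeners when everyone at an IP address stops listening; otherwise there's really no way to do this.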
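For the "process it outside of SQL" route, a rough sketch of the precomputation might look like the following. It assumes the rows have already been fetched as plain tuples, and it uses the simple window sum from the question's query as a placeholder counting rule, not the per-IP logic of the query above. The function name `listeners_per_song` and the tuple shapes are assumptions made for illustration. Sorting the samples once and using prefix sums keeps each song lookup at two binary searches instead of a scan of the whole log.

```python
from bisect import bisect_left, bisect_right
from datetime import datetime
from itertools import accumulate


def listeners_per_song(songs, samples):
    """songs: (title, start, end) tuples; samples: (date_time, listeners) tuples.

    Returns (title, start, total) rows, where total is the sum of `listeners`
    over samples with start <= date_time <= end (placeholder counting rule).
    """
    ordered = sorted(samples)
    times = [t for t, _ in ordered]
    # prefix[i] holds the sum of the first i listener values.
    prefix = [0] + list(accumulate(n for _, n in ordered))
    rows = []
    for title, start, end in songs:
        lo = bisect_left(times, start)   # first sample at or after start
        hi = bisect_right(times, end)    # one past the last sample at or before end
        rows.append((title, start, prefix[hi] - prefix[lo]))
    return rows


def ts(text):
    return datetime.strptime(text, "%Y-%m-%d %H:%M:%S")


songs = [
    ("We Found Love", ts("2012-04-30 00:00:21"), ts("2012-04-30 00:03:25")),
    ("One Thing Leads To Another", ts("2012-04-30 00:19:14"), ts("2012-04-30 00:22:16")),
]
samples = [
    (ts("2012-04-30 04:38:37"), 1),
    (ts("2012-04-30 00:19:46"), 1),
]
summary = listeners_per_song(songs, samples)
```

The resulting rows could then be written to a summary table and reported on with an ordinary indexed query.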

+0

Accepted, just for the logic, but not solved. Thanks! – 2012-08-09 14:46:22

4

Use an INNER JOIN instead of a correlated subquery:

SELECT DATE_FORMAT(p.played_date, "%m/%d/%Y") played_date, 
     p.played_time, 
     p.length_in_secs, 
     p.title, 
     p.artist, 
     RTRIM(p.label) label, 
     RTRIM(p.album) album, 
     SUM(l.listeners) listeners 
FROM playlists p 
    INNER JOIN listeners l 
     ON l.date_time BETWEEN p.start_date_time AND p.end_date_time 
WHERE p.title <> "" AND 
     p.played_date BETWEEN '2012-04-30' AND '2012-05-30' 
GROUP BY p.playlist_id 
ORDER BY p.title ASC; 

Consider adding the following indexes to the tables; they may help improve query performance. Check them with EXPLAIN:

playlists KEY (played_date, start_date_time, end_date_time, title); 

listeners KEY (date_time, listeners); 
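Whether the join returns the same totals as the correlated subquery can be checked on a tiny data set. The sketch below uses Python's built-in sqlite3 module purely as a stand-in for MySQL, so it says nothing about MySQL performance. The index name `idx_listeners_dt` is made up, and two of the listener rows are invented test data (marked in the comments). Note that the join form needs a GROUP BY on the playlist key so that SUM is computed per song, and that an inner join leaves out songs with no listener rows at all.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE playlists (
    playlist_id INTEGER PRIMARY KEY,
    title TEXT, played_date TEXT,
    start_date_time TEXT, end_date_time TEXT
);
CREATE TABLE listeners (
    listener_id INTEGER PRIMARY KEY,
    date_time TEXT, listeners INTEGER
);
-- Hypothetical index name; mirrors KEY (date_time, listeners).
CREATE INDEX idx_listeners_dt ON listeners (date_time, listeners);
INSERT INTO playlists (title, played_date, start_date_time, end_date_time) VALUES
    ('I Gotta Feeling', '2012-04-30', '2012-04-30 00:15:03', '2012-04-30 00:19:06'),
    ('One Thing Leads To Another', '2012-04-30', '2012-04-30 00:19:14', '2012-04-30 00:22:16');
INSERT INTO listeners (date_time, listeners) VALUES
    ('2012-04-30 00:16:00', 4),  -- invented test row
    ('2012-04-30 00:19:46', 1),  -- from the sample data
    ('2012-04-30 00:20:10', 2),  -- invented test row
    ('2012-04-30 04:38:37', 1);  -- from the sample data
""")

via_subquery = conn.execute("""
    SELECT p.title,
           IFNULL((SELECT SUM(l.listeners) FROM listeners l
                   WHERE l.date_time >= p.start_date_time
                     AND l.date_time <= p.end_date_time), 0) AS listeners
    FROM playlists p
    WHERE p.title <> ''
      AND p.played_date BETWEEN '2012-04-30' AND '2012-05-30'
    ORDER BY p.title
""").fetchall()

via_join = conn.execute("""
    SELECT p.title, SUM(l.listeners) AS listeners
    FROM playlists p
    INNER JOIN listeners l
            ON l.date_time BETWEEN p.start_date_time AND p.end_date_time
    WHERE p.title <> ''
      AND p.played_date BETWEEN '2012-04-30' AND '2012-05-30'
    GROUP BY p.playlist_id
    ORDER BY p.title
""").fetchall()

print(via_subquery == via_join)
```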
+0

Although the gap has been closing over time, MySQL is usually faster with a 'JOIN' than with a sub-'SELECT'. – staticsan 2012-08-06 07:09:17

+0

@Omesh I'm not getting the required result. I pasted above what the output should look like. – 2012-08-07 05:39:01

+1

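I have tested it, and it gives the same result as your query. Your output doesn't look right for the input data you provided. Can you set up an sqlfiddle for it? – Omesh 2012-08-07 05:49:45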