2014-01-31

Converting a NOT IN query for better performance

I am using MySQL 5.0 and need to tune this query. Can anyone tell me what changes I could make?

SELECT DISTINCT(alert_master_id) FROM alert_appln_header 
WHERE created_date < DATE_SUB(CURDATE(), INTERVAL (SELECT parameters FROM schedule_config WHERE schedule_name = "Purging_Config") DAY) 
AND alert_master_id NOT IN (
SELECT DISTINCT(alert_master_id) FROM alert_details 
WHERE end_date IS NULL AND created_date < DATE_SUB(CURDATE(), INTERVAL (SELECT parameters FROM schedule_config WHERE schedule_name = "Purging_Config") DAY) 
UNION 
SELECT DISTINCT(alert_master_id) FROM alert_sara_header 
WHERE sara_master_id IN 
(SELECT alert_sara_master_id FROM alert_sara_lines 
WHERE end_date IS NULL) AND created_date < DATE_SUB(CURDATE(), INTERVAL (SELECT parameters FROM schedule_config WHERE schedule_name = "Purging_Config") DAY) 
) LIMIT 5000; 

Sorry this is ugly, I don't know how to format it here. And it's urgent. –

Answers

4

The first thing I would do is rewrite the subqueries as joins:

SELECT h.alert_master_id
FROM alert_appln_header h
JOIN schedule_config c
  ON c.schedule_name = 'Purging_Config'
LEFT JOIN alert_details d
  ON d.alert_master_id = h.alert_master_id
 AND d.end_date IS NULL
 AND d.created_date < CURRENT_DATE - INTERVAL c.parameters DAY
LEFT JOIN (
       alert_sara_header s
  JOIN alert_sara_lines l
    ON l.alert_sara_master_id = s.sara_master_id
   AND l.end_date IS NULL  -- the question filters end_date on alert_sara_lines
     )
  ON s.alert_master_id = h.alert_master_id
 AND s.created_date < CURRENT_DATE - INTERVAL c.parameters DAY
WHERE h.created_date < CURRENT_DATE - INTERVAL c.parameters DAY
  AND d.alert_master_id IS NULL
  AND s.alert_master_id IS NULL
GROUP BY h.alert_master_id
LIMIT 5000
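
To check whether a join rewrite like this actually helps, the query can be prefixed with EXPLAIN. The following is only a sketch on one of the two anti-joins (the alert_details part), using the table and column names from the question; it is not a measured result.

```sql
-- Sketch: show the execution plan instead of running the query.
-- Only the alert_details anti-join is included to keep the plan short.
EXPLAIN
SELECT h.alert_master_id
FROM alert_appln_header h
JOIN schedule_config c
  ON c.schedule_name = 'Purging_Config'
LEFT JOIN alert_details d
  ON d.alert_master_id = h.alert_master_id
 AND d.end_date IS NULL
 AND d.created_date < CURRENT_DATE - INTERVAL c.parameters DAY
WHERE h.created_date < CURRENT_DATE - INTERVAL c.parameters DAY
  AND d.alert_master_id IS NULL   -- keep only headers with no matching open detail row
GROUP BY h.alert_master_id
LIMIT 5000;
```

In the output, the `type`, `key`, `rows` and `Extra` columns are the ones to look at: `type = ALL` indicates a full table scan, and a `select_type` of `DEPENDENT SUBQUERY` on the original NOT IN version indicates a subquery that is re-evaluated per outer row.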

If it is still slow after that, revisit your indexing strategy. I would suggest indexes on:

  • alert_appln_header(alert_master_id,created_date)
  • schedule_config(schedule_name)
  • alert_details(alert_master_id,end_date,created_date)
  • alert_sara_header(sara_master_id,alert_master_id,created_date)
  • alert_sara_lines(alert_sara_master_id,end_date)
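
A sketch of how indexes along these lines might be created. The index names (`idx_*`) are made up for illustration, and `end_date` is placed on alert_sara_lines because that is where the question's query filters it; adjust the column lists if your schema differs.

```sql
-- Hypothetical index names; column order follows the suggestions above.
CREATE INDEX idx_aah_master_created
    ON alert_appln_header (alert_master_id, created_date);

CREATE INDEX idx_sc_name
    ON schedule_config (schedule_name);

CREATE INDEX idx_ad_master_end_created
    ON alert_details (alert_master_id, end_date, created_date);

CREATE INDEX idx_ash_sara_master_created
    ON alert_sara_header (sara_master_id, alert_master_id, created_date);

CREATE INDEX idx_asl_master_end
    ON alert_sara_lines (alert_sara_master_id, end_date);

-- List the indexes that already exist on a table before adding new ones.
SHOW INDEX FROM alert_appln_header;
```

Running the query under EXPLAIN afterwards shows in the `key` column whether the optimizer actually picks these indexes up.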

+1 for the joins and the nice reformatting :) – GameDroids

1

Well, this might just be a shot in the dark, but I don't think you need so many DISTINCTs here.

SELECT DISTINCT(alert_master_id) FROM alert_appln_header 
WHERE created_date < DATE_SUB(CURDATE(), INTERVAL (SELECT parameters FROM schedule_config WHERE schedule_name = "Purging_Config") DAY) 
AND alert_master_id NOT IN (
    -- removed distinct here -- 
    SELECT alert_master_id FROM alert_details 
    WHERE end_date IS NULL AND created_date < DATE_SUB(CURDATE(), INTERVAL (SELECT parameters FROM schedule_config WHERE schedule_name = "Purging_Config") DAY) 
    UNION 
    -- removed distinct here -- 
    SELECT alert_master_id FROM alert_sara_header 
    WHERE sara_master_id IN 
     (SELECT alert_sara_master_id FROM alert_sara_lines 
     WHERE end_date IS NULL) 
    AND created_date < DATE_SUB(CURDATE(), INTERVAL (SELECT parameters FROM schedule_config WHERE schedule_name = "Purging_Config") DAY) 
) LIMIT 5000; 

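DISTINCT can be expensive, so try to avoid it where it is not needed. In the WHERE clause you are only checking that the ids are NOT IN the subquery result, so it should not matter if some ids appear more than once in that result. In addition, UNION (without ALL) already removes duplicates from the combined result.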
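
A tiny illustration of why DISTINCT inside the UNION branches does not change the final set; it uses literal values only, so it can be run on any MySQL server without touching the tables from the question.

```sql
-- UNION removes duplicate rows from the combined result:
SELECT 1 AS id UNION SELECT 1;       -- returns a single row: 1

-- UNION ALL keeps them:
SELECT 1 AS id UNION ALL SELECT 1;   -- returns two rows: 1, 1
```

Whether dropping DISTINCT makes a measurable difference on the real tables depends on the data and the optimizer, so it is worth comparing both versions with EXPLAIN.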

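Thank you, sir. The first DISTINCT was my mistake, but I added the other two to reduce the size of the subquery result and make the IN operator faster. I'm not sure whether I'm right about that. –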