2009-11-09 41 views
1

MySQL NOT IN operator performance problem with large result sets?

I have the following two queries:

1)

select count(*) 
from  segmentation_cycle_recipients scr 
     , segmentation_instance si 
where si.access_code=scr.access_code 
     and si.segment_id is NOT NULL; 

This returns 13429 rows in 0.2 seconds.

2)

select count(*) 
from  segmentation_cycle_recipients scr 
     , segmentation_instance si, web_pat_info wpi 
where si.access_code=scr.access_code and scr.siebel_row_id=wpi.siebel_id 
     and si.segment_id is NOT NULL; 

This returns 4003 rows in 0.48 seconds.

Now I want 1) minus 2), so I wrote the following query:

select count(*) 
from  segmentation_cycle_recipients scr 
     , segmentation_instance si 
where si.access_code=scr.access_code 
     and si.segment_id is NOT NULL 
     and scr.siebel_row_id NOT IN (select scr.siebel_row_id 
from  segmentation_cycle_recipients scr 
     , segmentation_instance si 
     , web_pat_info wpi where si.access_code=scr.access_code 
     and scr.siebel_row_id=wpi.siebel_id and si.segment_id is NOT NULL); 

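I was expecting 13429 - 4003 = 9426 rows, but the query takes forever to execute (I had to kill it). It even increments the "Slow queries" counter shown by mysql> status ;)

In the development environment, where the result sets are much smaller, it returns in < 100 ms. So I believe the query itself is correct.

I believe NOT IN is a known performance problem in MySQL (Oracle has the MINUS operator). Any suggestions on how to improve the performance of this query?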

+1

Have you tried running 'EXPLAIN' on the query? –

+0

See the other answer that uses NOT EXISTS. I can't remember off the top of my head whether MySQL supports 'MINUS'. – Xailor

Answers

2
SELECT COUNT(*) 
FROM segmentation_cycle_recipients scr 
JOIN segmentation_instance si 
ON  si.access_code = scr.access_code 
LEFT JOIN 
     web_pat_info wpi 
ON  wpi.siebel_id = scr.siebel_row_id 
WHERE wpi.siebel_id IS NULL 
     AND si.segment_id is NOT NULL 

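Make sure that si.access_code and wpi.siebel_id are indexed, and that wpi.siebel_id is defined as NOT NULL.

If the latter condition does not hold, replace wpi.siebel_id IS NULL in the WHERE clause with a test on any other wpi column that is defined as NOT NULL.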
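
The checks above could be carried out roughly as follows. This is only a sketch in MySQL syntax: the index names idx_si_access_code and idx_wpi_siebel_id are made up for illustration, and the tables may well already have suitable indexes, so look at the SHOW INDEX output before creating anything.

```sql
-- List the indexes that already exist on the two joined tables.
SHOW INDEX FROM segmentation_instance;
SHOW INDEX FROM web_pat_info;

-- Hypothetical index names; create only the ones that are missing.
CREATE INDEX idx_si_access_code ON segmentation_instance (access_code);
CREATE INDEX idx_wpi_siebel_id  ON web_pat_info (siebel_id);

-- If this returns anything other than 0, wpi.siebel_id holds NULLs
-- and cannot be declared NOT NULL as it stands.
SELECT COUNT(*) FROM web_pat_info WHERE siebel_id IS NULL;

-- Inspect the execution plan of the rewritten query.
EXPLAIN
SELECT COUNT(*)
FROM segmentation_cycle_recipients scr
JOIN segmentation_instance si
  ON si.access_code = scr.access_code
LEFT JOIN web_pat_info wpi
  ON wpi.siebel_id = scr.siebel_row_id
WHERE wpi.siebel_id IS NULL
  AND si.segment_id IS NOT NULL;
```

In the EXPLAIN output, the row for wpi should list the siebel_id index under "key". When MySQL recognizes the LEFT JOIN ... IS NULL pattern, the "Extra" column for that table may also show "Not exists", which means it stops scanning wpi as soon as one matching row is found.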

1

A NOT EXISTS clause may serve you better here.

select count(*) 
from  segmentation_cycle_recipients scr 
     , segmentation_instance si 
where si.access_code=scr.access_code 
     and si.segment_id is NOT NULL 
     and NOT EXISTS (select scr2.siebel_row_id 
from  segmentation_cycle_recipients scr2 
     , segmentation_instance si2 
     , web_pat_info wpi2 where si2.access_code=scr2.access_code 
     and scr2.siebel_row_id=wpi2.siebel_id and si2.segment_id is NOT NULL 
     and scr.siebel_row_id=scr2.siebel_row_id); 
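
As a possible simplification, not part of the query above: the outer query has already joined scr to si and filtered on segment_id, so the subquery should only need to check whether a matching web_pat_info row exists. A sketch under that assumption:

```sql
SELECT COUNT(*)
FROM segmentation_cycle_recipients scr
   , segmentation_instance si
WHERE si.access_code = scr.access_code
  AND si.segment_id IS NOT NULL
  -- Correlated subquery: true only when no web_pat_info row
  -- carries this recipient's siebel_row_id.
  AND NOT EXISTS (SELECT 1
                  FROM web_pat_info wpi
                  WHERE wpi.siebel_id = scr.siebel_row_id);
```

This form probes web_pat_info once per outer row, so it relies on an index on web_pat_info.siebel_id to stay fast.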
+0

NOT EXISTS has the same performance problem as NOT IN. – Pritam