2012-09-25

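MySQL query optimization: multiple queries or one large query

I have a query that contains a few subqueries (inner selects), and I'm trying to work out which performs better: one larger query or many smaller queries. Timings on my server change all the time, so I'm finding it hard to measure the difference.

I use the query below to return 10 results at a time, which I display on my website using pagination (offset and limit).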

SELECT adverts.*, breed.breed, breed.type, sellers.profile_name, sellers.logo, users.user_level , 
round(sqrt((((adverts.latitude - '51.558430') * (adverts.latitude - '51.558430')) * 69.1 * 69.1) + ((adverts.longitude - '-0.0069345') * (adverts.longitude - '-0.0069345') * 53 * 53)), 1) as distance, 
(SELECT advert_images.image_name FROM advert_images WHERE advert_images.advert_id = adverts.advert_id AND advert_images.main = 1 LIMIT 1) as imagename, 
(SELECT count(advert_images.advert_id) from advert_images WHERE advert_images.advert_id = adverts.advert_id) AS num_photos 
FROM adverts 
LEFT JOIN breed ON adverts.breed_id = breed.breed_id 
LEFT JOIN sellers ON (adverts.user_id = sellers.user_id) 
LEFT JOIN users ON (adverts.user_id = users.user_id) 
WHERE (adverts.status = 1) AND (adverts.approved = 1) 
AND (adverts.latitude BETWEEN 51.2692837281 AND 51.8475762719) AND (adverts.longitude BETWEEN -0.472015213613 AND 0.458146213613) 
having (distance <= '20') 
ORDER BY distance ASC 
LIMIT 0,10 

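Would it be better to remove the 2 inner selects below from the main query and then, in my PHP loop, call the 2 selects 10 times, once for each record in the loop?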

(SELECT advert_images.image_name FROM advert_images WHERE advert_images.advert_id = adverts.advert_id AND advert_images.main = 1 LIMIT 1) as imagename, 
(SELECT count(advert_images.advert_id) from advert_images WHERE advert_images.advert_id = adverts.advert_id) AS num_photos 

Answers

1

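Avoiding subqueries

As far as I understand your inner selects, they serve two purposes: finding the name of an associated image, and counting the number of associated images. You could probably achieve both with a left join instead of inner selects: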

SELECT …, 
     advert_images.image_name AS imagename, 
     COUNT(advert_images.advert_id) AS num_photos, 
     … 
FROM … 
    LEFT JOIN advert_images ON advert_images.advert_id = adverts.advert_id 
… 
GROUP BY adverts.advert_id 
… 
LIMIT 0,10 

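I haven't tried this, but perhaps the MySQL engine is smart enough to perform that part of the query only for the rows you actually return.

Note that there is no guarantee whatsoever as to which image name this query returns for a given set of images. If you want reproducible results, you should use an aggregate function there, e.g. MIN(advert_images.image_name) to select the lexicographically first image.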
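For illustration, here is a minimal sketch of the LEFT JOIN plus GROUP BY idea. It runs against an in-memory SQLite database from Python only so that it is self-contained; that setup, the trimmed-down tables, and the sample rows are all assumptions, not the asker's schema. Instead of MIN(...), it uses a conditional aggregate, MAX(CASE WHEN main = 1 ...), which is a substitution (not something proposed above) that keeps the original preference for the image flagged main = 1.

```python
import sqlite3


def make_demo_db():
    """Build a tiny in-memory database with hypothetical sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE adverts (advert_id INTEGER PRIMARY KEY);
        CREATE TABLE advert_images (
            advert_id INTEGER, image_name TEXT, main INTEGER
        );
        INSERT INTO adverts VALUES (1), (2), (3);
        INSERT INTO advert_images VALUES
            (1, 'b.jpg', 0), (1, 'a.jpg', 1), (1, 'c.jpg', 0),
            (2, 'z.jpg', 0);
    """)
    return conn


def adverts_with_photo_info(conn):
    """Return (advert_id, imagename, num_photos) for every advert."""
    sql = """
        SELECT adverts.advert_id,
               -- Only rows with main = 1 contribute a name; others yield NULL,
               -- which MAX ignores, so the result is deterministic.
               MAX(CASE WHEN advert_images.main = 1
                        THEN advert_images.image_name END) AS imagename,
               -- COUNT skips the NULL produced by the LEFT JOIN for adverts
               -- without images, so those adverts get 0.
               COUNT(advert_images.advert_id) AS num_photos
        FROM adverts
        LEFT JOIN advert_images
               ON advert_images.advert_id = adverts.advert_id
        GROUP BY adverts.advert_id
        ORDER BY adverts.advert_id
    """
    return conn.execute(sql).fetchall()
```

With the sample rows, advert 1 reports its main image and three photos, advert 2 has one photo but no main image (so imagename is NULL), and advert 3 has no photos at all. Whether MySQL evaluates this faster than the inner selects still has to be measured on the real data.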

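Separate select, but no loop

If the above approach does not work, i.e. the query still examines all rows of advert_images to compute the result, then you are probably better off executing a second query. You can, however, try to avoid the for loop and instead fetch all these rows in a single query: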

SELECT advert_images.advert_id, -- needed to match each row back to its advert
     advert_images.image_name AS imagename, 
     COUNT(advert_images.advert_id) AS num_photos 
FROM advert_images 
WHERE advert_images.advert_id IN (?, ?, ?, ?, ?, ?, ?, ?, ?, ?) 
GROUP BY advert_images.advert_id 

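The ten parameters in this query correspond to the ten rows your current query produces. Note that adverts without any associated photos will not appear in that result. So make sure your code defaults num_photos to zero and imagename to NULL for them.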
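A rough sketch of that single follow-up query, including the defaults for adverts without photos. It uses Python with an in-memory SQLite database purely as a self-contained stand-in for PHP and MySQL; the function name photo_info_for, the sample rows, and the use of MIN(image_name) to make the chosen name reproducible are assumptions for the example.

```python
import sqlite3


def make_demo_db():
    """Build a tiny in-memory database with hypothetical sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE advert_images (advert_id INTEGER, image_name TEXT);
        INSERT INTO advert_images VALUES
            (1, 'b.jpg'), (1, 'a.jpg'), (1, 'c.jpg'),
            (2, 'z.jpg');
    """)
    return conn


def photo_info_for(conn, advert_ids):
    """Map each advert_id to (imagename, num_photos) using one query."""
    # Start with the defaults, so adverts without photos are still present.
    info = {advert_id: (None, 0) for advert_id in advert_ids}
    if not advert_ids:
        return info  # "IN ()" would be a syntax error, so skip the query
    # One "?" per id; the values themselves are bound, never interpolated.
    placeholders = ", ".join("?" for _ in advert_ids)
    sql = f"""
        SELECT advert_id,
               MIN(image_name) AS imagename,
               COUNT(advert_id) AS num_photos
        FROM advert_images
        WHERE advert_id IN ({placeholders})
        GROUP BY advert_id
    """
    for advert_id, imagename, num_photos in conn.execute(sql, list(advert_ids)):
        info[advert_id] = (imagename, num_photos)
    return info
```

Calling photo_info_for(make_demo_db(), [1, 2, 3]) gives an entry for all three ids, with advert 3 falling back to (None, 0). The same pattern applies in PHP: build the placeholder list from the ids of the current page, bind the ids, and merge the rows into an array that was pre-filled with the defaults.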

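Temporary table

Another way to achieve what you are trying to do is to use an explicit temporary table, ideally one held in memory: first you select the results you are interested in, then you retrieve all the associated information.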

CREATE TEMPORARY TABLE tmp 
SELECT adverts.advert_id, round(…) as distance 
FROM adverts 
WHERE (adverts.status = 1) AND (adverts.approved = 1) 
    AND (adverts.latitude BETWEEN 51.2692837281 AND 51.8475762719) 
    AND (adverts.longitude BETWEEN -0.472015213613 AND 0.458146213613) 
HAVING (distance <= 20) 
ORDER BY distance ASC 
LIMIT 0,10; 

SELECT tmp.distance, adverts.*, … 
     advert_images.image_name AS imagename, 
     COUNT(advert_images.advert_id) AS num_photos, 
     … 
FROM tmp 
    INNER JOIN adverts ON tmp.advert_id = adverts.advert_id 
    LEFT JOIN breed ON adverts.breed_id = breed.breed_id 
    LEFT JOIN sellers ON adverts.user_id = sellers.user_id 
    LEFT JOIN users ON adverts.user_id = users.user_id 
    LEFT JOIN advert_images ON advert_images.advert_id = adverts.advert_id 
GROUP BY adverts.advert_id 
ORDER BY tmp.distance ASC; 

DROP TABLE tmp; 

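This ensures that all the other tables are queried only for the results you are currently processing. After all, there is little magic about the advert_images table, except that you may need multiple rows from it.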

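Subquery as a join factor

Building on the previous approach, you can even avoid managing the temporary table yourself and use a subquery in its place: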

SELECT sub.distance, adverts.*, … 
     advert_images.image_name AS imagename, 
     COUNT(advert_images.advert_id) AS num_photos, 
     … 
FROM (SELECT adverts.advert_id, round(…) as distance 
     FROM adverts 
     WHERE (adverts.status = 1) AND (adverts.approved = 1) 
      AND (adverts.latitude BETWEEN 51.2692837281 AND 51.8475762719) 
      AND (adverts.longitude BETWEEN -0.472015213613 AND 0.458146213613) 
     HAVING (distance <= 20) 
     ORDER BY distance ASC 
     LIMIT 0,10 
    ) AS sub 
    INNER JOIN adverts ON sub.advert_id = adverts.advert_id 
    LEFT JOIN breed ON adverts.breed_id = breed.breed_id 
    LEFT JOIN sellers ON (adverts.user_id = sellers.user_id) 
    LEFT JOIN users ON (adverts.user_id = users.user_id) 
    LEFT JOIN advert_images ON advert_images.advert_id = adverts.advert_id 
GROUP BY adverts.advert_id 
ORDER BY sub.distance ASC 

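Again, you determine the relevant rows using only data from the adverts table, and join only the required rows from the other tables. Most likely, that intermediate result will be stored in a temporary table internally, but that is up to the SQL server to decide.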
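To make the "limit first, join afterwards" shape concrete, here is a small runnable sketch. It again uses Python with an in-memory SQLite database as a stand-in for MySQL, and it assumes a precomputed distance column in place of the round(…) expression so the example stays short; the function name nearest_page and the sample rows are likewise made up.

```python
import sqlite3


def make_demo_db():
    """Build a tiny in-memory database with hypothetical sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE adverts (advert_id INTEGER PRIMARY KEY, distance REAL);
        CREATE TABLE advert_images (advert_id INTEGER, image_name TEXT);
        INSERT INTO adverts VALUES (1, 5.0), (2, 1.5), (3, 30.0), (4, 12.0);
        INSERT INTO advert_images VALUES
            (1, 'b.jpg'), (1, 'a.jpg'), (2, 'z.jpg');
    """)
    return conn


def nearest_page(conn, max_distance, limit, offset=0):
    """Pick one page of adverts in a subquery, then join the image data."""
    sql = """
        SELECT sub.advert_id, sub.distance,
               MIN(advert_images.image_name) AS imagename,
               COUNT(advert_images.advert_id) AS num_photos
        FROM (SELECT advert_id, distance
              FROM adverts
              WHERE distance <= ?
              ORDER BY distance ASC
              LIMIT ? OFFSET ?) AS sub  -- at most `limit` rows leave here
        LEFT JOIN advert_images
               ON advert_images.advert_id = sub.advert_id
        GROUP BY sub.advert_id, sub.distance
        ORDER BY sub.distance ASC
    """
    return conn.execute(sql, (max_distance, limit, offset)).fetchall()
```

The inner query applies the filter, ordering, and limit to adverts alone, so the join and the aggregation only ever see the rows of the requested page. With the sample rows, nearest_page(make_demo_db(), 20, 2) returns adverts 2 and 1, and passing offset=2 returns advert 4 with no photos.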

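Hi, thanks for your detailed answer. I tried the first method you mentioned and removed the inner selects, but the query was slower than the original. The temporary table sounds good, but will it work OK when the query runs about 10 times every second on the server, since the site is very busy? – user1052096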

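@user1052096, as long as the two queries of the temporary table approach run close enough together, there should be little impact. Temporary tables are local to the connection, so there won't be any name clashes. The memory consumption of the "tmp" table should be small compared to the many columns in the result of the second query, so the combined solution might use less memory than the original query. But I just had another idea, which I'll edit into my answer in a moment. – MvG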

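Hi MvG, thanks for the update using a subquery, which works, but I'm not sure it's any faster. If I run just the subquery, it runs in 0.02 seconds, but if I run the subquery without selecting advert_id, it runs twice as fast, in 0.01 seconds. – user1052096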

0

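I think MySQL uses a filesort plus a temporary table to execute your query. That is why, on a large table, your suggestion would give better results. In general, you are better off executing smaller queries than 1 big one.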

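Hi, since I'm ordering by distance, which is a calculated field, it does use a filesort, which slows things down when the table is large. So will the 2 inner selects in the main query run for every record in the table? If so, I would think that running the 2 inner selects only on the result set would be faster. – user1052096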

Yes, the inner selects will be executed on each row.