2011-10-27 24 views
2

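Optimizing a MySQL query that uses temporary and filesort

I have this query (shown below), which currently uses temporary and filesort to produce a grouped, ordered set of results. If possible, I would like to get rid of both. I have looked at the underlying indexes used by this query and I just can't see what is missing.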

SELECT 
    b.institutionid AS b__institutionid, 
    b.name AS b__name, 
    COUNT(DISTINCT f2.facebook_id) AS f2__0 
FROM education_institutions b 
LEFT JOIN facebook_education_matches f ON b.institutionid = f.institutionid 
LEFT JOIN facebook_education f2 ON f.school_uid = f2.school_uid 
WHERE 
    (
    b.approved = '1' 
    AND f2.facebook_id IN ([lots of facebook ids here ]) 
) 
GROUP BY b__institutionid 
ORDER BY f2__0 DESC 
LIMIT 10 

Here is the EXPLAIN EXTENDED output:

+----+-------------+-------+--------+--------------------------------+----------------+---------+-----------------+------+----------+----------------------------------------------+
| id | select_type | table | type   | possible_keys                  | key            | key_len | ref             | rows | filtered | Extra                                        |
+----+-------------+-------+--------+--------------------------------+----------------+---------+-----------------+------+----------+----------------------------------------------+
|  1 | SIMPLE      | f     | index  | PRIMARY,institutionId          | institutionId  | 4       | NULL            |  308 |   100.00 | Using index; Using temporary; Using filesort |
|  1 | SIMPLE      | f2    | ref    | facebook_id_idx,school_uid_idx | school_uid_idx | 9       | f.school_uid    |    1 |   100.00 | Using where                                  |
|  1 | SIMPLE      | b     | eq_ref | PRIMARY                        | PRIMARY        | 4       | f.institutionId |    1 |   100.00 | Using where                                  |
+----+-------------+-------+--------+--------------------------------+----------------+---------+-----------------+------+----------+----------------------------------------------+

The CREATE TABLE statements for each table are shown below to give you an idea of the schema.

CREATE TABLE facebook_education (
    education_id int(11) NOT NULL AUTO_INCREMENT, 
    name varchar(255) DEFAULT NULL, 
    school_uid bigint(20) DEFAULT NULL, 
    school_type varchar(255) DEFAULT NULL, 
    year smallint(6) DEFAULT NULL, 
    facebook_id bigint(20) DEFAULT NULL, 
    degree varchar(255) DEFAULT NULL, 
    PRIMARY KEY (education_id), 
    KEY facebook_id_idx (facebook_id), 
    KEY school_uid_idx (school_uid), 
    CONSTRAINT facebook_education_facebook_id_facebook_user_facebook_id FOREIGN KEY (facebook_id) REFERENCES facebook_user (facebook_id) 
) ENGINE=InnoDB AUTO_INCREMENT=484 DEFAULT CHARSET=utf8; 

CREATE TABLE facebook_education_matches (
    school_uid bigint(20) NOT NULL, 
    institutionId int(11) NOT NULL, 
    created_at timestamp NULL DEFAULT NULL, 
    updated_at timestamp NULL DEFAULT NULL ON UPDATE CURRENT_TIMESTAMP, 
    PRIMARY KEY (school_uid), 
    KEY institutionId (institutionId), 
    CONSTRAINT fk_facebook_education FOREIGN KEY (school_uid) REFERENCES facebook_education (school_uid) ON DELETE CASCADE ON UPDATE CASCADE, 
    CONSTRAINT fk_education_institutions FOREIGN KEY (institutionId) REFERENCES education_institutions (institutionId) ON DELETE CASCADE ON UPDATE CASCADE 
) ENGINE=InnoDB DEFAULT; 

CREATE TABLE education_institutions (
    institutionId int(11) NOT NULL AUTO_INCREMENT, 
    name varchar(100) NOT NULL, 
    type enum('School','Degree') DEFAULT NULL, 
    approved tinyint(1) NOT NULL DEFAULT '0', 
    deleted tinyint(1) NOT NULL DEFAULT '0', 
    normalisedName varchar(100) NOT NULL, 
    created_at timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP, 
    PRIMARY KEY (institutionId) 
) ENGINE=InnoDB AUTO_INCREMENT=101327 DEFAULT CHARSET=utf8; 

Any guidance would be greatly appreciated.

Answers

3

The filesort is most likely there because you have no suitable index for the ORDER BY.

This is covered in the MySQL "ORDER BY Optimization" documentation.

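What you can do is load a temporary table and then select from it. When you load the temporary table, use ORDER BY NULL. When you select from the temporary table, use ORDER BY .. LIMIT.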

The issue is that GROUP BY adds an implicit ORDER BY <group by clause> ASC unless you disable that behavior by adding ORDER BY NULL.
This is one of those MySQL-specific gotchas.
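
The two-step approach described above could look roughly like the sketch below. The table name tmp_institution_counts, the alias match_count, and the three ids are hypothetical placeholders, not part of the original query. Whether this actually removes "Using temporary" depends on the MySQL version and the plan the optimizer picks, so check it with EXPLAIN. Note also that the implicit GROUP BY sort applies to MySQL 5.7 and earlier; it was removed in MySQL 8.0.

```sql
-- Step 1: materialize the grouped counts without the implicit GROUP BY sort.
-- tmp_institution_counts and match_count are hypothetical names.
CREATE TEMPORARY TABLE tmp_institution_counts AS
SELECT
    b.institutionId,
    b.name,
    COUNT(DISTINCT f2.facebook_id) AS match_count
FROM education_institutions b
LEFT JOIN facebook_education_matches f ON b.institutionId = f.institutionId
LEFT JOIN facebook_education f2 ON f.school_uid = f2.school_uid
WHERE b.approved = 1
  AND f2.facebook_id IN (1001, 1002, 1003)  -- placeholder ids
GROUP BY b.institutionId
ORDER BY NULL;  -- suppress the implicit sort on the GROUP BY column

-- Step 2: sort only the aggregated rows, which are far fewer than the joined rows.
SELECT institutionId, name, match_count
FROM tmp_institution_counts
ORDER BY match_count DESC
LIMIT 10;

-- Clean up once the result has been read.
DROP TEMPORARY TABLE tmp_institution_counts;
```

The final sort still happens, but it runs over one row per institution instead of over the full join result.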

+0

It's when I remove the GROUP BY clause, though, that 'Using filesort' disappears! – GordyD

+0

@GordyD: That's why I gave this answer :-) – gbn

+0

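Changing the ORDER BY clause to something with an index makes no difference to the use of filesort, though. It is only when I remove the GROUP BY completely that no filesort is used. How does your answer relate to that? – GordyD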

0

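I can see two possible optimizations:

  1. b.approved = '1' - you definitely need an index on the approved column for quick filtering.

  2. f2.facebook_id IN ([lots of facebook ids here]) - store the Facebook ids in a temporary table, create an index on that temporary table, and then join against it instead of using the IN clause.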
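
A minimal sketch of both suggestions follows. The names approved_idx and tmp_facebook_ids and the three ids are hypothetical. The joins are written as inner joins on the assumption that the original WHERE filter on f2.facebook_id already discards the NULL rows produced by the LEFT JOINs. Whether the optimizer uses either index is not guaranteed, so compare the EXPLAIN output before and after.

```sql
-- Suggestion 1: index the approved flag (approved_idx is a hypothetical name).
CREATE INDEX approved_idx ON education_institutions (approved);

-- Suggestion 2: keep the ids in an indexed temporary table.
CREATE TEMPORARY TABLE tmp_facebook_ids (
    facebook_id BIGINT NOT NULL,
    PRIMARY KEY (facebook_id)  -- the primary key serves as the join index
);

INSERT INTO tmp_facebook_ids (facebook_id)
VALUES (1001), (1002), (1003);  -- placeholder ids

-- Join against the temporary table instead of using a long IN (...) list.
SELECT
    b.institutionId AS b__institutionid,
    b.name AS b__name,
    COUNT(DISTINCT f2.facebook_id) AS f2__0
FROM tmp_facebook_ids t
JOIN facebook_education f2 ON f2.facebook_id = t.facebook_id
JOIN facebook_education_matches f ON f.school_uid = f2.school_uid
JOIN education_institutions b ON b.institutionId = f.institutionId
WHERE b.approved = 1
GROUP BY b.institutionId
ORDER BY f2__0 DESC
LIMIT 10;
```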

+0

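'1. b.approved = '1' - you definitely need an index on the approved column for quick filtering.' Because approved is a boolean field with low cardinality, MySQL will refuse to use the index. – Johan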