2016-04-09 55 views
2

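I have a very complex query that uses some subqueries inside a CASE statement. How can I prevent dependent subqueries in a CASE of the form x (subquery)?

The full query isn't needed for this question; it would only keep people from getting to the problem quickly.

So this post uses pseudocode. I can post the real query if wanted, but it is a monster and would not help with this question.

What I want is cacheable subqueries inside the CASE statement.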

SELECT * FROM posts posts 
INNER JOIN posts_shared_to shared_to 
     ON shared_to.post_id = posts.post_id 
INNER JOIN channels channels 
     ON channels.channel_id = shared_to.channel_id 
WHERE posts.parent_id IS NULL 
AND MATCH (posts.text) AGAINST (:keyword IN BOOLEAN MODE) 
AND CASE 
    WHEN channels.read_access IS NULL THEN 1 
    WHEN channels.read_access = 1 THEN 
    (
     SELECT count(*) FROM channel_users 
     WHERE user_id = XXX AND channel_id = channels.channel_id 
    ) 
    WHEN shared_to.read_type = 2 THEN 
    (
     /* another subquery with a join */ 
     /* check if user is in friendlist of post_author */ 
    ) 
    ELSE 0 
    END 
GROUP BY posts.post_id 
ORDER BY posts.post_id 
DESC LIMIT n,n 

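As stated above, this is only simplified pseudocode.

MySQL EXPLAIN reports every subquery used in the CASE as DEPENDENT, which means (if I am right) that they have to run every time and are not cached.

Any solution that helps speed up this query is welcome.

Edited part: the real query now looks like this: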

SELECT a.id, a.title, a.message AS post_text, a.type, a.date, a.author AS uid, 
b.a_name as name, b.avatar, 
shared_to.to_circle AS circle_id, shared_to.root_circle, 
c.circle_name, c.read_access, c.owner_uid, c.profile, 
MATCH(a.title,a.message) AGAINST (:keyword IN BOOLEAN MODE) AS score 

FROM posts a 

/** get userdetails for post_author **/ 
JOIN authors b ON b.id = a.author 

/** get circles posts was shared to **/ 
JOIN posts_shared_to shared_to ON shared_to.post_id = a.id AND shared_to.deleted IS NULL 

/** 
* get circle_details note: at the moment shared_to can contain NULL and 1 too and doesn't need to be a circle_id 
* if to_circle IS NULL post was shared public 
* if to_circle = 1 post was shared to private circles 
* since we use md5 keys as circle ids this can be a string instead of (int) ... ugly.. 
* 
**/ 
LEFT JOIN circles c ON c.circle_id = shared_to.to_circle 
    /*AND c.circle_name IS NOT NULL */ 
    AND (c.profile IS NULL OR c.profile = 6 OR c.profile = 1) 
    AND c.deleted IS NULL 

LEFT JOIN (
    /** if post is within a channel that requires membership we use this to check if requesting user is member **/ 
    SELECT COUNT(*) users_count, user_id, circle_id FROM circles_users 
    GROUP BY user_id, circle_id 
    ) counts ON counts.circle_id = shared_to.to_circle 
      AND counts.user_id = :me 

LEFT JOIN (
    /** if post is shared private we check if requesting users exists within post authors private circles **/ 
    SELECT count(*) in_circles_count, ci.owner_uid AS circle_owner, cu1.user_id AS user_me 
    FROM circles ci 
    INNER JOIN circles_users cu1 ON cu1.circle_id = ci.circle_id 
           AND cu1.deleted IS NULL 
    WHERE ci.profile IS NULL AND ci.deleted IS NULL 
    GROUP BY user_me, circle_owner 
) users_in_circles ON users_in_circles.user_me = :me 
        AND users_in_circles.circle_owner = a.author 

/** make sure post is a topic **/ 
WHERE a.parent_id IS NULL AND a.deleted IS NULL 

/** search title and post body **/ 
AND MATCH (a.title,a.message) AGAINST (:keyword IN BOOLEAN MODE) 

AND (
    /** own circle **/ 
    c.owner_uid = :me 
    /** site member read_access (this query is for members, for guests we use a different query) **/ 
    OR (c.read_access = 1 OR c.read_access = "1") 
    /** public read_access **/ 
    OR (shared_to.to_circle IS NULL OR (c.read_access IS NULL AND c.owner_uid IS NOT NULL)) 
    /** channel/circle member read_access**/ 
    OR ((c.read_access = 3 OR c.read_access = "3") AND counts.users_count > 0) 
    /** for users within post creators private circles **/ 
    OR ( 
    ( 
    /** use shared_to to determine if post is private **/ 
    shared_to.to_circle = "1" OR shared_to.to_circle = 1 
    /** use circle settings to determine global privacy **/ 
    OR (c.owner_uid IS NOT NULL AND (c.read_access = 2 OR c.read_access = "2")) 
    ) AND users_in_circles.circle_owner = a.author AND users_in_circles.user_me = :me 
    ) 
) 

GROUP BY a.id ORDER BY a.id DESC LIMIT n,n 

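Question: is this really the better approach? When I look at how many rows the derived tables can contain, I'm not sure.

Maybe someone can help me change the query in the way @Ollie-Jones mentioned: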

SELECT stuff, stuff, stuff 
    FROM (
     SELECT post.post_id 
      FROM your whole query 
      ORDER BY post_id DESC 
      LIMIT n,n 
     ) ids 
    JOIN whatever ON whatever.post_id = ids.post_id 
    JOIN whatelse ON whatelse 

Sorry if this sounds lazy, but I'm not really a MySQL guy, and I've had a headache for ages just from building this query. :D

Answers

2

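The best way to eliminate a dependent subquery is to refactor it into a virtual table (an independent subquery) and then JOIN or LEFT JOIN it to the rest of your tables.

In your case, you have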

 SELECT count(*) FROM channel_users 
     WHERE user_id = XXX AND channel_id = channels.channel_id 

So, the independent-subquery version of this is

    SELECT COUNT(*) users_count, 
          user_id, channel_id 
        FROM channel_users 
        GROUP BY user_id, channel_id 

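Do you see how this virtual table contains one row for each distinct combination of user_id and channel_id values? Each row has the users_count value you need. You can then join it to the rest of your query, like so. (Note that INNER JOIN === JOIN in MySQL, so I used JOIN to shorten it a bit.)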

SELECT * FROM posts posts 
    JOIN posts_shared_to shared_to ON shared_to.post_id = posts.post_id 
    JOIN channels channels ON channels.channel_id = shared_to.channel_id 
    LEFT JOIN (
        /* one row per (user_id, channel_id) pair */ 
        SELECT COUNT(*) users_count, 
          user_id, channel_id 
        FROM channel_users 
        GROUP BY user_id, channel_id 
     ) counts ON counts.channel_id = shared_to.channel_id 
       AND counts.user_id = XXX /* the requesting user, as in the question */ 
    LEFT JOIN ( /* your other refactored subquery */ 
      ) friendcounts ON whatever 
WHERE posts.parent_id IS NULL 
    AND MATCH (posts.text) AGAINST (:keyword IN BOOLEAN MODE) 
    AND (   channels.read_access IS NULL 
       OR (channels.read_access = 1 AND counts.users_count > 0) 
       OR (shared_to.read_type = 2 AND friendcounts.users_count > 0) 
     ) 
GROUP BY posts.post_id 
ORDER BY posts.post_id DESC 
LIMIT n,n 

MySQL's query planner is generally smart enough to generate just the appropriate subset of each independent subquery.
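To check that the refactoring keeps the same rows, here is a minimal sketch that runs both shapes side by side. It uses Python's built-in sqlite3 module as a stand-in for MySQL, so it says nothing about MySQL's planner or timings. The sample rows and the user id 7 are made up for illustration, and the full-text MATCH condition and the friend-list branch are left out.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE channels (channel_id INTEGER PRIMARY KEY, read_access INTEGER);
    CREATE TABLE channel_users (user_id INTEGER, channel_id INTEGER);
    -- channel 1 is public, channels 2 and 3 require membership
    INSERT INTO channels VALUES (1, NULL), (2, 1), (3, 1);
    INSERT INTO channel_users VALUES (7, 2), (8, 3);
""")

ME = 7  # hypothetical requesting user (the XXX placeholder in the thread)

# Original shape: the subquery references c.channel_id from the outer row,
# which is what makes it a dependent (correlated) subquery.
dependent = """
    SELECT c.channel_id FROM channels c
    WHERE c.read_access IS NULL
       OR (c.read_access = 1 AND (SELECT COUNT(*) FROM channel_users cu
                                  WHERE cu.user_id = :me
                                    AND cu.channel_id = c.channel_id) > 0)
    ORDER BY c.channel_id
"""

# Refactored shape: an independent grouped derived table, LEFT JOINed in.
# Grouping by (user_id, channel_id) means at most one match per channel.
derived = """
    SELECT c.channel_id FROM channels c
    LEFT JOIN (SELECT COUNT(*) AS users_count, user_id, channel_id
               FROM channel_users
               GROUP BY user_id, channel_id) counts
           ON counts.channel_id = c.channel_id AND counts.user_id = :me
    WHERE c.read_access IS NULL
       OR (c.read_access = 1 AND counts.users_count > 0)
    ORDER BY c.channel_id
"""

rows_dependent = [r[0] for r in conn.execute(dependent, {"me": ME})]
rows_derived = [r[0] for r in conn.execute(derived, {"me": ME})]
print(rows_dependent, rows_derived)  # both list channels 1 and 2
```

Channel 3 drops out in both versions: the LEFT JOIN finds no counts row for user 7 there, so users_count is NULL and the comparison with 0 is not true.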

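Pro tip: SELECT lots of columns ... ORDER BY something LIMIT n is generally considered a wasteful antipattern. It kills performance because it sorts a large number of data columns and then discards most of the result.

Pro tip: SELECT * in a JOIN query is also wasteful. You are much better off listing the columns you actually need in your result set.

So, you can refactor your query again to do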

SELECT stuff, stuff, stuff 
     FROM (
      SELECT post.post_id 
       FROM your whole query 
       ORDER BY post_id DESC 
       LIMIT n,n 
      ) ids 
     JOIN whatever ON whatever.post_id = ids.post_id 
     JOIN whatelse ON whatelse 

The idea is to sort only the post_id values, then use the LIMITed subset to pull the rest of the data you need.
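A small runnable sketch of that pattern, again using Python's sqlite3 module in place of MySQL. The posts and authors rows are invented for the example, and the :n and :skip parameters stand in for the thread's LIMIT n,n placeholder.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE posts (post_id INTEGER PRIMARY KEY, parent_id INTEGER,
                        author INTEGER, title TEXT);
    CREATE TABLE authors (id INTEGER PRIMARY KEY, name TEXT);
    INSERT INTO authors VALUES (1, 'ann'), (2, 'bob');
    INSERT INTO posts VALUES
        (10, NULL, 1, 'first'), (11, 10, 2, 'reply'),
        (12, NULL, 2, 'second'), (13, NULL, 1, 'third');
""")

# The inner query sorts and limits the narrow post_id column only; the outer
# joins then fetch the wide columns for just the ids that survived the LIMIT.
deferred = """
    SELECT p.post_id, p.title, a.name
    FROM (
        SELECT post_id FROM posts
        WHERE parent_id IS NULL
        ORDER BY post_id DESC
        LIMIT :n OFFSET :skip
    ) ids
    JOIN posts p ON p.post_id = ids.post_id
    JOIN authors a ON a.id = p.author
    ORDER BY p.post_id DESC
"""

page = conn.execute(deferred, {"n": 2, "skip": 0}).fetchall()
print(page)  # the two newest topics, with author names
```

The outer ORDER BY is repeated on purpose: the join does not promise to keep the order produced inside the subquery.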

+1

I wish I could claim that the queries in my answer were debugged and ready to use. But I'm no liar. –

+1

Smile. Looks genius. Thanks. That gives me plenty of new ideas. – user2429266

+0

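OK. I took a deep breath and worked through your input. (Rebuilding the original query this way was hard, next to impossible.) Is it really more efficient if the independent subqueries run over large tables? – user2429266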