2015-11-26 135 views

How to optimize an extremely slow MySQL query that uses COUNT DISTINCT

I have a very slow MySQL query that I would like to optimize.

The query takes 66.2070 seconds to return 5 results from tables containing about 200 rows.

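The database tables store users, experiments (A/B tests), goals (web page URLs), visits (page visits) and conversions (clicks through to a goal URL). The visit and conversion tables both have a combination column, which records whether version A or B of the page was visited, or whether the conversion came from version A or B. Combinations are stored in the database as 1 or 2.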

I am trying to get a list of a user's experiments, with the number of visits and conversions for each combination.

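For some relationships I use composite primary keys, which does make the joins more complicated. I doubt it, but could this be the cause of the problem?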

How can I rewrite this query so that it runs in a reasonable time, ideally in under one second?

Here is my database schema:

Database schema diagram

Here is my query:

SELECT e.id                  AS id, 
     e.name                AS name, 
     e.status              AS status, 
     e.created             AS created, 
     COUNT(DISTINCT v1.id) AS visits1, 
     COUNT(DISTINCT v2.id) AS visits2, 
     COUNT(DISTINCT c1.id) AS conversions1, 
     COUNT(DISTINCT c2.id) AS conversions2 
FROM experiment e 
     LEFT JOIN visit v1 
       ON (v1.experiment_id = e.id 
        AND v1.user_id = e.user_id 
        AND v1.combination = 1) 
     LEFT JOIN visit v2 
       ON (v2.experiment_id = e.id 
        AND v2.user_id = e.user_id 
        AND v2.combination = 2) 
     LEFT JOIN goal g 
       ON (g.experiment_id = e.id 
        AND g.user_id = e.user_id 
        AND g.principal = 1) 
     LEFT JOIN conversion c1 
       ON (c1.experiment_id = e.id 
        AND c1.user_id = e.user_id 
        AND c1.goal_id = g.id 
        AND c1.combination = 1) 
     LEFT JOIN conversion c2 
       ON (c2.experiment_id = e.id 
        AND c2.user_id = e.user_id 
        AND c2.goal_id = g.id 
        AND c2.combination = 2) 
WHERE e.user_id = 25 
GROUP BY e.id 
ORDER BY e.created DESC 
LIMIT 5 

The results table should look like this:

Results table

Answers


You should do the aggregation before doing the joins, to avoid producing large intermediate results. I think the logic is:

SELECT e.id, e.name, e.status, e.created, 
     v.visits1, v.visits2, g.conversions1, g.conversions2 
FROM experiment e LEFT JOIN 
    -- visits per experiment and user, aggregated before the join 
    (SELECT experiment_id, user_id, 
      SUM(combination = 1) as visits1, 
      SUM(combination = 2) as visits2 
     FROM visit 
     WHERE combination IN (1, 2) 
     GROUP BY experiment_id, user_id 
    ) v 
    ON v.experiment_id = e.id AND 
     v.user_id = e.user_id LEFT JOIN 
    -- conversions on the principal goal, aggregated before the join 
    (SELECT g.experiment_id, g.user_id, 
      SUM(c.combination = 1) as conversions1, 
      SUM(c.combination = 2) as conversions2 
     FROM goal g LEFT JOIN 
      conversion c 
      ON c.experiment_id = g.experiment_id AND 
       c.user_id = g.user_id AND 
       c.goal_id = g.id 
     WHERE g.principal = 1 
     GROUP BY g.experiment_id, g.user_id 
    ) g 
    ON g.experiment_id = e.id AND 
     g.user_id = e.user_id 
WHERE e.user_id = 25 
ORDER BY e.created DESC 
LIMIT 5 ; 
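
One rough way to see the large intermediate result mentioned above is to count the rows that the question's joins produce per experiment before any deduplication. This is only a diagnostic sketch that reuses the question's tables and join conditions; the alias joined_rows is made up for illustration, and the query is likely to be as slow as the original.

```sql
-- Diagnostic sketch: how many rows do the original joins generate per experiment?
-- joined_rows is a hypothetical alias, not part of the original query.
SELECT e.id,
       COUNT(*) AS joined_rows
FROM experiment e
     LEFT JOIN visit v1
       ON (v1.experiment_id = e.id
       AND v1.user_id = e.user_id
       AND v1.combination = 1)
     LEFT JOIN visit v2
       ON (v2.experiment_id = e.id
       AND v2.user_id = e.user_id
       AND v2.combination = 2)
     LEFT JOIN goal g
       ON (g.experiment_id = e.id
       AND g.user_id = e.user_id
       AND g.principal = 1)
     LEFT JOIN conversion c1
       ON (c1.experiment_id = e.id
       AND c1.user_id = e.user_id
       AND c1.goal_id = g.id
       AND c1.combination = 1)
     LEFT JOIN conversion c2
       ON (c2.experiment_id = e.id
       AND c2.user_id = e.user_id
       AND c2.goal_id = g.id
       AND c2.combination = 2)
WHERE e.user_id = 25
GROUP BY e.id;
```

Because the visit and conversion joins are independent of one another, each experiment yields roughly the product of its matching rows. With, say, 50 visits and 20 conversions per combination (hypothetical numbers), that is 50 × 50 × 20 × 20 = 1,000,000 joined rows for a single experiment, all of which COUNT(DISTINCT ...) then has to deduplicate. Aggregating each table in its own subquery keeps every join at one row per experiment.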

Further optimizations are possible. For example, an index on experiment(user_id, created, id).
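
The suggested index could be created as sketched below. The index names are invented for illustration, and the second and third indexes are assumptions that go beyond the answer: they may be redundant if the composite primary keys mentioned in the question already cover those columns.

```sql
-- Index suggested in the answer: filter on user_id, then read rows in created order.
CREATE INDEX idx_experiment_user_created
    ON experiment (user_id, created, id);

-- Assumption, not from the answer: support the grouped lookups in the subqueries.
CREATE INDEX idx_visit_exp_user_comb
    ON visit (experiment_id, user_id, combination);
CREATE INDEX idx_conversion_exp_user_goal_comb
    ON conversion (experiment_id, user_id, goal_id, combination);

-- Inspect the plan to check whether the experiment index is actually used.
EXPLAIN
SELECT id, name, status, created
FROM experiment
WHERE user_id = 25
ORDER BY created DESC
LIMIT 5;
```

Comparing the EXPLAIN output before and after creating each index is the safest way to decide which of them are worth keeping.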


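Thanks very much Gordon :) This works perfectly and lets me do other calculations, such as summing the visit counts and calculating conversion rates – mattvick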