Why is an UPDATE JOIN query so much slower than a SELECT JOIN query?

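This has been a 19-hour nightmare. Why is an UPDATE JOIN query so much slower than a SELECT JOIN query?

I have a very large query that essentially needs to join large data sets across several tables. After doing the joins, I need to update the original table with the data from the SELECT statement. The SELECT statement is super fast; the UPDATE statement is super slow.

Here is the SELECT statement.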

SELECT l.col1, 
     l.col2, 
     l.col3, 
     p.personid 
FROM table1 p 
LEFT JOIN table2 l ON (l.col1 = p.col1) 
LEFT JOIN 
    (SELECT name, 
      col AS 'col2' 
    FROM tbl3 f 
    WHERE f.col LIKE '%-F') pcf ON (pcf.col1 = p.col1) 
LEFT JOIN 
    (SELECT name, 
      col AS 'col3' 
    FROM tbl4 f 
    WHERE f.col LIKE '%-M') pcm ON (pcm.col1 = p.col1) 
WHERE p.requestid = '1928'

Now, if I put the EXACT SAME series of JOINs into an UPDATE context, the query takes forever.

UPDATE table1 p 
LEFT JOIN table2 l ON (l.col1 = p.col1) 
LEFT JOIN 
    (SELECT name, 
      col AS 'col2' 
    FROM tbl3 f 
    WHERE f.col LIKE '%-F') pcf ON (pcf.col1 = p.col1) 
LEFT JOIN 
    (SELECT name, 
      col AS 'col3' 
    FROM tbl4 f 
    WHERE f.col LIKE '%-M') pcm ON (pcm.col1 = p.col1) 
SET p.col1 = l.col1, 
    p.col2 = l.col2, 
    p.col3 = l.col3 
WHERE p.requestid = '1928'

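So... why does the UPDATE JOIN statement take so much longer than the SELECT JOIN statement? I have already tried a temporary table, and that did not work either.

FYI, I'm dealing with tables of 50k records or more.

In case you're curious about the EXPLAIN results, here is what I get when I explain the SELECT query (although apparently you can't use EXPLAIN with an UPDATE?)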

id select_type table type possible_keys key key_len ref rows Extra 
1 PRIMARY p ALL NULL NULL NULL NULL 613246 Using where 
1 PRIMARY l eq_ref PRIMARY,name_3,name,name_2 PRIMARY 257 drilldev_db.p.lastname 1 
1 PRIMARY <derived2> ALL NULL NULL NULL NULL 23435 
1 PRIMARY <derived3> ALL NULL NULL NULL NULL 13610 
1 PRIMARY <derived4> ALL NULL NULL NULL NULL 13053 
1 PRIMARY <derived5> ALL NULL NULL NULL NULL 8273  
1 PRIMARY <derived6> ALL NULL NULL NULL NULL 11481 
1 PRIMARY <derived7> ALL NULL NULL NULL NULL 6708  
1 PRIMARY <derived8> ALL NULL NULL NULL NULL 9588  
1 PRIMARY <derived9> ALL NULL NULL NULL NULL 5494  
1 PRIMARY <derived10> ALL NULL NULL NULL NULL 6981  
1 PRIMARY <derived11> ALL NULL NULL NULL NULL 4107  
1 PRIMARY <derived12> ALL NULL NULL NULL NULL 5973  
1 PRIMARY <derived13> ALL NULL NULL NULL NULL 3851  
1 PRIMARY <derived14> ALL NULL NULL NULL NULL 4935  
1 PRIMARY <derived15> ALL NULL NULL NULL NULL 3574  
1 PRIMARY <derived16> ALL NULL NULL NULL NULL 5793  
1 PRIMARY <derived17> ALL NULL NULL NULL NULL 4706  
17 DERIVED f ref year,gender gender 257  364263 Using where; Using temporary; Using filesort 
16 DERIVED f ref year,gender gender 257  397322 Using where; Using temporary; Using filesort 
15 DERIVED f range year,gender year 4 NULL 54092 Using where; Using temporary; Using filesort 
14 DERIVED f range year,gender year 4 NULL 54092 Using where; Using temporary; Using filesort 
13 DERIVED f range year,gender year 4 NULL 62494 Using where; Using temporary; Using filesort 
12 DERIVED f range year,gender year 4 NULL 62494 Using where; Using temporary; Using filesort 
11 DERIVED f range year,gender year 4 NULL 69317 Using where; Using temporary; Using filesort 
10 DERIVED f range year,gender year 4 NULL 69317 Using where; Using temporary; Using filesort 
9 DERIVED f ref year,gender gender 257  364263 Using where; Using temporary; Using filesort 
8 DERIVED f range year,gender year 4 NULL 94949 Using where; Using temporary; Using filesort 
7 DERIVED f ref year,gender gender 257  364263 Using where; Using temporary; Using filesort 
6 DERIVED f ref year,gender gender 257  397322 Using where; Using temporary; Using filesort 
5 DERIVED f ref year,gender gender 257  364263 Using where; Using temporary; Using filesort 
4 DERIVED f ref year,gender gender 257  397322 Using where; Using temporary; Using filesort 
3 DERIVED f ALL NULL NULL NULL NULL 37045 Using where 
2 DERIVED f ALL NULL NULL NULL NULL 37045 Using where

Thanks!

-b

2013-08-29 Brian Mayer

Can you run the `EXPLAIN` command and post the output? – jcho360
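
For readers on a newer server: a minimal sketch of getting a plan for the UPDATE itself. MySQL 5.6 and later accept EXPLAIN on UPDATE statements; on older versions the usual workaround is to explain the equivalent SELECT, as the asker did. The server version is not stated in the question, so that part is an assumption, and the statement is a shortened form of the one in the question using its placeholder names.

```sql
-- Assumes MySQL 5.6 or later; older versions reject EXPLAIN on UPDATE.
EXPLAIN
UPDATE table1 p
LEFT JOIN table2 l ON l.col1 = p.col1
SET p.col1 = l.col1,
    p.col2 = l.col2,
    p.col3 = l.col3
WHERE p.requestid = '1928';
```

EXPLAIN only reports the plan here; it does not modify any rows.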

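How are you measuring "super fast" for the SELECT? Time to the first result, or time until all results are returned? You can check how long the full result takes by adding `ORDER BY col1 LIMIT 1` (all results have to be generated for the ORDER BY). –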
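
A rough sketch of that timing check, using the placeholder names from the question and only the first join for brevity. Sorting on a column from the joined table normally means the whole join has to be produced before the first row can come back, so the elapsed time is closer to the real cost of the SELECT.

```sql
-- The ORDER BY makes the server build the full result before LIMIT applies
-- (unless an index happens to deliver rows already in this order).
SELECT l.col1,
       l.col2,
       l.col3,
       p.personid
FROM table1 p
LEFT JOIN table2 l ON l.col1 = p.col1
WHERE p.requestid = '1928'
ORDER BY l.col1
LIMIT 1;
```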

Why do you have all those JOINs? Since they are LEFT JOINs, they don't restrict which rows of the original table are matched. – Barmar

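Let's think about it. If you are selecting rows from a table (just grabbing them) versus updating every... single... row as you go through them, which takes longer? Reading n rows, or modifying (updating) n rows?

Compare reading 10 lines of a book with writing those same 10 lines on a sheet of paper. Which one takes longer?

I might add that the more rows you read versus update, the bigger the difference becomes, just as the gap between reading and writing lines from a book grows with the number of lines.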

2013-08-29 18:30:20 Tricky12

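I understand why it takes longer. I guess I'm more interested in finding out why it takes *so* much longer (if I'm doing something inefficient, I can fix it). Thanks. –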

If this is your actual statement, you don't need the second and third LEFT JOINs, because they don't change the result.
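
A sketch of what that reduction might look like, assuming the posted statement really is the one being run. The SET clause reads only from `l`, so the derived tables `pcf` and `pcm` contribute nothing to the values written and can be dropped:

```sql
-- Same UPDATE as in the question, minus the two derived-table joins.
UPDATE table1 p
LEFT JOIN table2 l ON l.col1 = p.col1
SET p.col1 = l.col1,
    p.col2 = l.col2,
    p.col3 = l.col3
WHERE p.requestid = '1928';
```

If the real query does use columns from the derived tables in its SET clause, this shortcut does not apply.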

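By the way, MySQL doesn't know how to handle "complex" queries efficiently :-) If you materialize the result of the SELECT in a temporary table and use that, it will be much faster.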
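
One way the temporary-table approach could be written, as a sketch rather than a tested fix. `tmp_update` and `idx_personid` are hypothetical names, and the example assumes `personid` uniquely identifies a row in `table1`, which the question does not confirm.

```sql
-- Step 1: materialize the joined rows once.
CREATE TEMPORARY TABLE tmp_update AS
SELECT p.personid,
       l.col1,
       l.col2,
       l.col3
FROM table1 p
LEFT JOIN table2 l ON l.col1 = p.col1
WHERE p.requestid = '1928';

-- Step 2: index the lookup column so the UPDATE join does not scan
-- the temporary table once per row of table1.
ALTER TABLE tmp_update ADD INDEX idx_personid (personid);

-- Step 3: update from the materialized rows.
UPDATE table1 p
JOIN tmp_update t ON t.personid = p.personid
SET p.col1 = t.col1,
    p.col2 = t.col2,
    p.col3 = t.col3;
```

Without the index in step 2, the temporary table is joined by full scan, which may be why a temp-table version can still be slow.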

2013-08-29 19:32:22 dnoeth

I do have a temp-table version that I thought would run faster, but it still falls short :(. The UPDATE query is just super intensive –

Then you should show the table DDLs; there must be something wrong. How many rows are returned by the SELECT? – dnoeth

What is a table DDL? The SELECT returns the same number of rows as the original table I'm trying to match (5k is where it starts to break) –
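
For context, "DDL" is the data definition language, meaning the CREATE TABLE statements that define each table along with its keys and indexes. In MySQL they can be retrieved as shown below; the table names are the placeholders used in the question.

```sql
-- Prints the full CREATE TABLE statement, including indexes, for each table.
SHOW CREATE TABLE table1;
SHOW CREATE TABLE table2;
SHOW CREATE TABLE tbl3;
SHOW CREATE TABLE tbl4;
```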
