2015-09-27 149 views
11

我已经在这里阅读了几篇文章,但仍不明白我该如何解决错误。另外我如何解码show engine innodb status输出。我怀疑它是一个死锁或者一个永远不会提交/回滚的事务,导致永远持有这个锁(这可能吗?)。如何解决和调试MySQL错误:ER_LOCK_WAIT_TIMEOUT:超出锁定等待超时;尝试重新启动交易

# show engine innodb status; 
===================================== 
2015-09-27 04:43:54 2b9cf470f700 INNODB MONITOR OUTPUT 
===================================== 
Per second averages calculated from the last 33 seconds 
----------------- 
BACKGROUND THREAD 
----------------- 
srv_master_thread loops: 16140 srv_active, 0 srv_shutdown, 3941554 srv_idle 
srv_master_thread log flush and writes: 3957687 
---------- 
SEMAPHORES 
---------- 
OS WAIT ARRAY INFO: reservation count 25915 
OS WAIT ARRAY INFO: signal count 25874 
Mutex spin waits 1675, rounds 50212, OS waits 1581 
RW-shared spins 24307, rounds 729240, OS waits 24278 
RW-excl spins 16, rounds 1680, OS waits 56 
Spin rounds per wait: 29.98 mutex, 30.00 RW-shared, 105.00 RW-excl 
------------ 
TRANSACTIONS 
------------ 
Trx id counter 712274 
Purge done for trx's n:o < 712178 undo n:o < 0 state: running but idle 
History list length 1593 
LIST OF TRANSACTIONS FOR EACH SESSION: 
---TRANSACTION 0, not started 
MySQL thread id 18286, OS thread handle 0x2b9cf470f700, query id 2307113 bb116-14-164-108.singnet.com.sg 116.14.164.108 meclub2359 init 
show engine innodb status 
---TRANSACTION 0, not started 
MySQL thread id 18285, OS thread handle 0x2b9cf4444700, query id 2306954 bb116-14-164-108.singnet.com.sg 116.14.164.108 meclub2359 cleaning up 
---TRANSACTION 712266, not started 
MySQL thread id 18270, OS thread handle 0x2b9cf43c2700, query id 2307088 172.31.8.174 meclub2359 cleaning up 
---TRANSACTION 712235, not started 
MySQL thread id 18268, OS thread handle 0x2b9cf3896700, query id 2306757 172.31.8.174 meclub2359 cleaning up 
---TRANSACTION 712268, not started 
MySQL thread id 18267, OS thread handle 0x2b9cf4381700, query id 2307091 172.31.8.174 meclub2359 cleaning up 
---TRANSACTION 712272, not started 
MySQL thread id 5214, OS thread handle 0x2b9cf31dd700, query id 2307112 localhost 127.0.0.1 rdsadmin cleaning up 
---TRANSACTION 712269, ACTIVE 31 sec starting index read 
mysql tables in use 1, locked 1 
LOCK WAIT 2 lock struct(s), heap size 360, 1 row lock(s) 
MySQL thread id 18288, OS thread handle 0x2b9cf44c6700, query id 2307092 172.31.8.174 meclub2359 updating 
UPDATE stardate SET winnersSelected=1 WHERE id='6' 
------- TRX HAS BEEN WAITING 31 SEC FOR THIS LOCK TO BE GRANTED: 
RECORD LOCKS space id 4067 page no 3 n bits 72 index `PRIMARY` of table `ebdb`.`stardate` trx id 712269 lock_mode X locks rec but not gap waiting 
Record lock, heap no 3 PHYSICAL RECORD: n_fields 16; compact format; info bits 0 
0: len 4; hex 00000006; asc  ;; 
1: len 6; hex 0000000addcd; asc  ;; 
2: len 7; hex 180000021d2d57; asc  -W;; 
3: len 30; hex 68747470733a2f2f73332d61702d736f757468656173742d312e616d617a; asc https://s3-ap-southeast-1.amaz; (total 62 bytes); 
4: len 30; hex 4d75736963616c204576656e696e67207769746820456c6c696520476f75; asc Musical Evening with Ellie Gou; (total 35 bytes); 
5: len 30; hex 4d75736963616c204576656e696e67207769746820456c6c696520476f75; asc Musical Evening with Ellie Gou; (total 35 bytes); 
6: len 30; hex 4d75736963616c204576656e696e67207769746820456c6c696520476f75; asc Musical Evening with Ellie Gou; (total 255 bytes); 
7: len 22; hex 53696e6761706f72650d0a4d6172696e612042617920; asc Singapore Marina Bay ;; 
8: len 20; hex 52443230313530394d434d4543444153434d5346; asc RD201509MCMECDASCMSF;; 
9: len 4; hex 80000002; asc  ;; 
10: len 5; hex 9997295000; asc )P ;; 
11: len 5; hex 999702a000; asc  ;; 
12: len 4; hex 8000000c; asc  ;; 
13: len 1; hex 81; asc ;; 
14: len 5; hex 999706e345; asc  E;; 
15: len 5; hex 99972ee820; asc . ;; 

------------------ 
---TRANSACTION 712213, ACTIVE 501 sec 
5 lock struct(s), heap size 1184, 1 row lock(s), undo log entries 3 
MySQL thread id 18284, OS thread handle 0x2b9cf3daa700, query id 2307007 172.31.8.174 meclub2359 cleaning up 
Trx read view will not see trx with id >= 712249, sees < 712141 
---TRANSACTION 712197, ACTIVE 574 sec 
5 lock struct(s), heap size 1184, 1 row lock(s), undo log entries 3 
MySQL thread id 18283, OS thread handle 0x2b9cf4403700, query id 2306959 172.31.8.174 meclub2359 cleaning up 
Trx read view will not see trx with id >= 712241, sees < 712141 
---TRANSACTION 712184, ACTIVE 595 sec 
5 lock struct(s), heap size 1184, 1 row lock(s), undo log entries 3 
MySQL thread id 18281, OS thread handle 0x2b9cf468d700, query id 2306856 172.31.8.174 meclub2359 cleaning up 
Trx read view will not see trx with id >= 712240, sees < 712141 
---TRANSACTION 712141, ACTIVE 935 sec 
7 lock struct(s), heap size 1184, 12 row lock(s), undo log entries 19 
MySQL thread id 18279, OS thread handle 0x2b9cf4485700, query id 2306383 172.31.8.174 meclub2359 cleaning up 
Trx read view will not see trx with id >= 712146, sees < 712146 
-------- 
FILE I/O 
-------- 
I/O thread 0 state: waiting for completed aio requests (insert buffer thread) 
I/O thread 1 state: waiting for completed aio requests (log thread) 
I/O thread 2 state: waiting for completed aio requests (read thread) 
I/O thread 3 state: waiting for completed aio requests (read thread) 
I/O thread 4 state: waiting for completed aio requests (read thread) 
I/O thread 5 state: waiting for completed aio requests (read thread) 
I/O thread 6 state: waiting for completed aio requests (write thread) 
I/O thread 7 state: waiting for completed aio requests (write thread) 
I/O thread 8 state: waiting for completed aio requests (write thread) 
I/O thread 9 state: waiting for completed aio requests (write thread) 
Pending normal aio reads: 0 [0, 0, 0, 0] , aio writes: 0 [0, 0, 0, 0] , 
ibuf aio reads: 0, log i/o's: 0, sync i/o's: 0 
Pending flushes (fsync) log: 0; buffer pool: 0 
562 OS file reads, 295536 OS file writes, 202242 OS fsyncs 
0.00 reads/s, 0 avg bytes/read, 0.24 writes/s, 0.18 fsyncs/s 
------------------------------------- 
INSERT BUFFER AND ADAPTIVE HASH INDEX 
------------------------------------- 
Ibuf: size 1, free list len 0, seg size 2, 0 merges 
merged operations: 
insert 0, delete mark 0, delete 0 
discarded operations: 
insert 0, delete mark 0, delete 0 
Hash table size 1241027, node heap has 2 buffer(s) 
0.06 hash searches/s, 0.27 non-hash searches/s 
--- 
LOG 
--- 
Log sequence number 87135734 
Log flushed up to 87135734 
Pages flushed up to 87135734 
Last checkpoint at 87135734 
0 pending log writes, 0 pending chkp writes 
133514 log i/o's done, 0.09 log i/o's/second 
---------------------- 
BUFFER POOL AND MEMORY 
---------------------- 
Total memory allocated 641744896; in additional pool allocated 0 
Dictionary memory allocated 264646 
Buffer pool size 38271 
Free buffers  20250 
Database pages  18019 
Old database pages 6631 
Modified db pages 0 
Pending reads 0 
Pending writes: LRU 0, flush list 0, single page 0 
Pages made young 423, not young 82 
0.00 youngs/s, 0.00 non-youngs/s 
Pages read 521, created 17502, written 131006 
0.00 reads/s, 0.00 creates/s, 0.12 writes/s 
Buffer pool hit rate 1000/1000, young-making rate 0/1000 not 0/1000 
Pages read ahead 0.00/s, evicted without access 0.00/s, Random read ahead 0.00/s 
LRU len: 18019, unzip_LRU len: 0 
I/O sum[4]:cur[0], unzip sum[0]:cur[0] 
-------------- 
ROW OPERATIONS 
-------------- 
0 queries inside InnoDB, 0 queries in queue 
4 read views open inside InnoDB 
Main thread process no. 14465, id 47953069967104, state: sleeping 
Number of rows inserted 55739, updated 14635, deleted 309, read 1463461 
0.00 inserts/s, 0.03 updates/s, 0.00 deletes/s, 35.00 reads/s 
---------------------------- 
END OF INNODB MONITOR OUTPUT 
============================ 

它看起来好像有一些疑问挂?的show processlist

enter image description here

如果让我怎么调试更加凡在我的应用程序会导致这个输出?我正在使用AWS RDS,Elastic Beanstalk,NodeJS/SailsJS。

UPDATE

我注意到这个问题只发生在一排。当我尝试UPDATE stardate SET winnersSelected=1 WHERE id='6'

UPDATE

我发现,在代码的改变关闭连接工程。但仍然好奇,我怎么从上面的输出中看出原因是什么?我认为这会很有用。我不确定这是否是原因。

例如,我看到

---TRANSACTION 0, not started 
MySQL thread id 18285, OS thread handle 0x2b9cf4444700, query id 2306954 bb116-14-164-108.singnet.com.sg 116.14.164.108 meclub2359 cleaning up 

---TRANSACTION 712235, not started 
MySQL thread id 18268, OS thread handle 0x2b9cf3896700, query id 2306757 172.31.8.174 meclub2359 cleaning up 

为什么它似乎像清理走这么长时间?我应该担心吗?

---TRANSACTION 712269, ACTIVE 31 sec starting index read 
mysql tables in use 1, locked 1 
LOCK WAIT 2 lock struct(s), heap size 360, 1 row lock(s) 
MySQL thread id 18288, OS thread handle 0x2b9cf44c6700, query id 2307092 172.31.8.174 meclub2359 updating 
UPDATE stardate SET winnersSelected=1 WHERE id='6' 
------- TRX HAS BEEN WAITING 31 SEC FOR THIS LOCK TO BE GRANTED: 
RECORD LOCKS space id 4067 page no 3 n bits 72 index `PRIMARY` of table `ebdb`.`stardate` trx id 712269 lock_mode X locks rec but not gap waiting 
Record lock, heap no 3 PHYSICAL RECORD: n_fields 16; compact format; info bits 0 
0: len 4; hex 00000006; asc  ;; 
1: len 6; hex 0000000addcd; asc  ;; 
2: len 7; hex 180000021d2d57; asc  -W;; 
3: len 30; hex 68747470733a2f2f73332d61702d736f757468656173742d312e616d617a; asc https://s3-ap-southeast-1.amaz; (total 62 bytes); 
4: len 30; hex 4d75736963616c204576656e696e67207769746820456c6c696520476f75; asc Musical Evening with Ellie Gou; (total 35 bytes); 
5: len 30; hex 4d75736963616c204576656e696e67207769746820456c6c696520476f75; asc Musical Evening with Ellie Gou; (total 35 bytes); 
6: len 30; hex 4d75736963616c204576656e696e67207769746820456c6c696520476f75; asc Musical Evening with Ellie Gou; (total 255 bytes); 
7: len 22; hex 53696e6761706f72650d0a4d6172696e612042617920; asc Singapore Marina Bay ;; 
8: len 20; hex 52443230313530394d434d4543444153434d5346; asc RD201509MCMECDASCMSF;; 
9: len 4; hex 80000002; asc  ;; 
10: len 5; hex 9997295000; asc )P ;; 
11: len 5; hex 999702a000; asc  ;; 
12: len 4; hex 8000000c; asc  ;; 
13: len 1; hex 81; asc ;; 
14: len 5; hex 999706e345; asc  E;; 
15: len 5; hex 99972ee820; asc . ;; 

我可以从上面的输出说,查询UPDATE stardate SET winnersSelected=1 WHERE id='6'被阻止?但是我怎么能说出来?

+0

您可以尝试增加锁定等待超时时间。 –

+0

@代码僧侣,我真的不认为超时有帮助,它似乎更像是其他进程睡觉的锁吗?但我不确定,因为我真的不明白如何解读这些信息。 –

+1

您可以设置较低的'wait_timeout'以便关闭睡眠过程连接。这个变量的默认值是8个小时。 –

回答

1

当处理困难的MySQL诊断问题时,通常你必须求助于gdb。你可以做这样的事情:

gdb -ex "set pagination 0" -ex "thread apply all bt" \ 
    --batch -p $(pidof mysqld) 

然后你挖通过堆栈跟踪的列表,并与源的知识一点点你能弄清楚到底是怎么回事。

长时间“清理”的可能原因是某处被另一个线程占用的某个关键锁,并且可能有多种原因。如果你粘贴堆栈跟踪,我将能够让你知道发生了什么。

InnoDB中的死锁实际上是一种合法的情况。它的设计方式是,在某些情况下,锁定算法发现自己​​死锁,但很少见。该问题通过锁定等待超时进行缓解 - 达到时,会中止事务以避免死锁。理论上讲,应该编写理想的InnoDB客户端代码,以在所有事务由于死锁而失败时重试所有事务。在实践中,重构一个庞大的代码库以符合这个原则可能非常困难,但我仍然建议选择一些关键的战斗并重构最麻烦的代码以符合。

为避免死锁,请经常提交您的事务,并避免以不可预知的顺序更新记录。当然,这将包括适当的连接清理 - 正如你在案例中所观察到的那样。如果不关闭连接,你可以有一个看似天真的情景是这样的:

  • BEGIN
  • 选择
  • 没有活动时间长

的问题是,该交易使用SELECT锁定已读取的行,并且不会解锁它们,直到COMMIT/ROLLBACK。

如果您有很多难以清理的混乱代码,您可能需要考虑放弃使用连接池。 MySQL具有非常低的会话启动开销,并且可以很好地处理频繁的重新连接。这种非活动连接可能会在应用程序请求自动终止时关闭(我假定有些代码在大量应用程序中常见),并且您将不会与正在返回到在该状态下的连接池。

相关问题