2016-11-25

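Docker commands no longer respond

Most docker commands never finish; I have to interrupt them manually with CTRL + C. Even simple commands such as docker ps and docker info do not respond.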

However, docker help and docker version still work.
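
One way to check which subcommands still answer, without pressing CTRL + C each time, is to run them under a time limit. The following is a minimal sketch that assumes GNU coreutils `timeout` is available; `probe` and `PROBE_LIMIT` are hypothetical names introduced for illustration, not part of Docker.

```shell
# probe CMD [ARGS...]: run a command with a time limit and report the outcome.
# PROBE_LIMIT (seconds, default 5) is a made-up knob for this sketch.
probe() {
    status=0
    timeout "${PROBE_LIMIT:-5}" "$@" >/dev/null 2>&1 || status=$?
    case "$status" in
        0)   echo "ok: $*" ;;
        124) echo "hung: $*" ;;            # GNU timeout exits with 124 on expiry
        *)   echo "failed ($status): $*" ;;
    esac
}

# Probe the subcommands mentioned above, but only if docker is installed.
if command -v docker >/dev/null 2>&1; then
    for sub in version info ps; do
        probe docker "$sub"
    done
fi
```

A command reported as "hung" was killed by the time limit, while "failed" means it returned on its own with a non-zero exit status.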

I suspect there is something like a deadlock tied to a specific container, so commands that touch containers cannot complete.

How should I handle this situation?


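My Docker version is 1.12.3. I am not using Swarm mode. The docker logs command does not work. With dmesg I can see a lot of I/O errors, but I don't know whether they are related to my problem: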

[12898.121287] loop: Write error at byte offset 8882749440, length 4096. 
[12898.122837] loop: Write error at byte offset 8883666944, length 4096. 
[12898.124685] loop: Write error at byte offset 8882814976, length 4096. 
[12898.126459] loop: Write error at byte offset 8883404800, length 4096. 
[12898.128201] loop: Write error at byte offset 8883470336, length 4096. 
[12898.129921] loop: Write error at byte offset 8883535872, length 4096. 
[12898.131774] loop: Write error at byte offset 8883601408, length 4096. 
[12898.133594] loop: Write error at byte offset 8883732480, length 4096. 
[12917.269786] loop: Write error at byte offset 8883798016, length 4096. 
[12917.270331] quiet_error: 632 callbacks suppressed 
[12917.270334] Buffer I/O error on device dm-6, logical block 1313320 
[12917.270540] lost page write due to I/O error on dm-6 
[12917.270543] Buffer I/O error on device dm-6, logical block 1313321 
[12917.270740] lost page write due to I/O error on dm-6 
[12917.270742] Buffer I/O error on device dm-6, logical block 1313322 
[12917.270957] lost page write due to I/O error on dm-6 
[12917.270959] Buffer I/O error on device dm-6, logical block 1313323 
[12917.271177] lost page write due to I/O error on dm-6 
[12917.271179] Buffer I/O error on device dm-6, logical block 1313324 
[12917.271377] lost page write due to I/O error on dm-6 
[12917.271379] Buffer I/O error on device dm-6, logical block 1313325 
[12917.271573] lost page write due to I/O error on dm-6 
[12917.301759] loop: Write error at byte offset 8883863552, length 4096. 
[12917.312038] loop: Write error at byte offset 8883929088, length 4096. 
[12917.312396] Buffer I/O error on device dm-6, logical block 1313328 
[12917.312635] lost page write due to I/O error on dm-6 
[12917.312638] Buffer I/O error on device dm-6, logical block 1313329 
[12917.312867] lost page write due to I/O error on dm-6 
[12917.312869] Buffer I/O error on device dm-6, logical block 1313330 
[12917.313121] lost page write due to I/O error on dm-6 
[12917.313123] Buffer I/O error on device dm-6, logical block 1313331 
[12917.313346] lost page write due to I/O error on dm-6 
[13090.853726] INFO: task kworker/u8:0:17212 blocked for more than 120 seconds. 
[13090.854055] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. 
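
To see at a glance which devices are affected, log lines like the ones above can be tallied per device. This is a rough sketch using awk; `count_io_errors` is a hypothetical helper name, and it only matches the "Buffer I/O error on device" message format shown in this output.

```shell
# count_io_errors: read kernel log lines on stdin and print "<device> <count>"
# for every "Buffer I/O error on device <name>," line.
count_io_errors() {
    awk '/Buffer I\/O error on device/ {
             for (i = 1; i < NF; i++) {
                 if ($i == "device") {
                     dev = $(i + 1)
                     sub(/,$/, "", dev)   # strip the trailing comma
                     n[dev]++
                 }
             }
         }
         END { for (d in n) print d, n[d] }'
}

# Demo on three lines copied from the dmesg output above.
count_io_errors <<'EOF'
[12917.270334] Buffer I/O error on device dm-6, logical block 1313320
[12917.270540] lost page write due to I/O error on dm-6
[12917.270543] Buffer I/O error on device dm-6, logical block 1313321
EOF
```

On a live system the input would come from a pipe such as `dmesg | count_io_errors`, which may require root privileges depending on the kernel configuration.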

The command sudo systemctl status -l docker prints the following messages, but I can't tell whether they are relevant:

dockerd[1344]: time="2016-11-24T17:49:01.184874648+01:00" level=warning msg="libcontainerd: container c9f35af1836bf856001ca6156663f713c1217a697e8d2451927c67797fb5a770 restart canceled" 
dockerd[1344]: time="2016-11-24T17:49:02.627116016+01:00" level=info msg="No non-localhost DNS nameservers are left in resolv.conf. Using default external servers : [nameserver 8.8.8.8 nameserver 8.8.4.4]" 
dockerd[1344]: time="2016-11-24T17:49:02.627152661+01:00" level=info msg="IPv6 enabled; Adding default IPv6 external servers : [nameserver 2001:4860:4860::8888 nameserver 2001:4860:4860::8844]" 
dockerd[1344]: time="2016-11-24T18:19:51.472701647+01:00" level=warning msg="libcontainerd: container c9f35af1836bf856001ca6156663f713c1217a697e8d2451927c67797fb5a770 restart canceled" 
dockerd[1344]: time="2016-11-24T18:19:56.712126199+01:00" level=info msg="No non-localhost DNS nameservers are left in resolv.conf. Using default external servers : [nameserver 8.8.8.8 nameserver 8.8.4.4]" 
dockerd[1344]: time="2016-11-24T18:19:56.712159759+01:00" level=info msg="IPv6 enabled; Adding default IPv6 external servers : [nameserver 2001:4860:4860::8888 nameserver 2001:4860:4860::8844]" 
dockerd[1344]: time="2016-11-24T18:34:24.301786606+01:00" level=info msg="No non-localhost DNS nameservers are left in resolv.conf. Using default external servers : [nameserver 8.8.8.8 nameserver 8.8.4.4]" 
dockerd[1344]: time="2016-11-24T18:34:24.302208751+01:00" level=info msg="IPv6 enabled; Adding default IPv6 external servers : [nameserver 2001:4860:4860::8888 nameserver 2001:4860:4860::8844]" 

We don't know anything about your containers, so we can't help you. – 2016-11-25 11:01:25

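Can you provide more details about how the docker daemon is set up? For example, are you running Swarm mode with 1.12.3? How many managers are you running? If there is only a single local one, what do the logs say? etc. – abronan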

@abronan I edited the question to add more information. I hope this helps. – RotS

Answers

For me, this hang of the Docker commands started after I removed a container.

The dockerd daemon was in an abnormal state: once stopped (service docker stop), it could not be started again (sudo service docker start).

# sudo service docker start 
Redirecting to /bin/systemctl start docker.service 
Job for docker.service failed because the control process exited with error code. See "systemctl status docker.service" and "journalctl -xe" for details. 

# journalctl -xe 
kernel: device-mapper: ioctl: unable to remove open device docker-253:0-19468577-d6f74dd67f106d6bfa483df4ee534dd9545dc8ca 
... 
systemd[1]: docker.service: main process exited, code=exited, status=1/FAILURE 
systemd[1]: Failed to start Docker Application Container Engine. 
systemd[1]: Unit docker.service entered failed state. 
systemd[1]: docker.service failed. 
polkitd[896]: Unregistered Authentication Agent for unix-process:22551:34177094 (system bus name :1.290, object path /org 
ESCESC 
kernel: dev_remove: 41 callbacks suppressed 
kernel: device-mapper: ioctl: unable to remove open device docker-253:0-19468577-fc63401af903e22d05a4518e02504527f0d7883f9d997d7d97fdfe72ba789863 
... 
dockerd[22566]: time="2016-11-28T10:18:09.840268573+01:00" level=fatal msg="Error starting daemon: timeout" 
systemd[1]: docker.service: main process exited, code=exited, status=1/FAILURE 
systemd[1]: Failed to start Docker Application Container Engine. 
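
The journal output above names the device-mapper devices that the kernel could not remove. A small filter can pull out the distinct names for closer inspection. This is only a sketch: `stuck_devices` is a hypothetical name, and it relies on the exact "unable to remove open device" wording shown above.

```shell
# stuck_devices: read log lines on stdin and print each device name that
# appears in an "unable to remove open device <name>" message, once.
stuck_devices() {
    awk '/unable to remove open device/ { print $NF }' | sort -u
}

# Demo on lines shaped like the journal output above (device names shortened
# here for readability; they are placeholders, not real devices).
stuck_devices <<'EOF'
kernel: device-mapper: ioctl: unable to remove open device docker-253:0-1-aaa
systemd[1]: docker.service failed.
kernel: device-mapper: ioctl: unable to remove open device docker-253:0-1-bbb
kernel: device-mapper: ioctl: unable to remove open device docker-253:0-1-aaa
EOF
```

On a live system, kernel messages could be fed in with something like `journalctl -k --no-pager | stuck_devices`.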

In addition, many zombie Docker processes, for example docker-proxy, could be observed with ps -eax | grep docker (they show a "Z" in the "STAT" column).
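
The zombie check can be made less dependent on reading columns by eye. The sketch below swaps `ps -eax` for `ps -eo pid,stat,comm` so that the column positions are fixed; `zombies_matching` is a hypothetical helper name, and the pattern argument is treated as an awk regular expression.

```shell
# zombies_matching PATTERN: read "PID STAT COMMAND" rows on stdin (header on
# the first line) and print "<pid> <command>" for zombie processes whose
# command name matches PATTERN.
zombies_matching() {
    awk -v pat="$1" 'NR > 1 && $2 ~ /^Z/ && $3 ~ pat { print $1, $3 }'
}

# Run against the live process table when ps is available.
if command -v ps >/dev/null 2>&1; then
    ps -eo pid,stat,comm | zombies_matching docker
fi
```

An empty result means no zombie process with "docker" in its command name was found at that moment.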

After rebooting the server and starting Docker again, the zombie processes were gone and Docker commands worked again.