Troubleshooting a Docker Service That Fails to Start

The Docker service failing to start is a common problem after upgrading Docker:
# systemctl restart docker
 Job for docker.service failed because the control process exited with error code.
 See "systemctl status docker.service" and "journalctl -xe" for details.

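First, start the daemon directly from the command line to inspect the startup process:

sudo /usr/bin/dockerd --containerd=/run/containerd/containerd.sock

sudo dockerd --debug

This prints Docker's entire startup sequence. Handle each failure according to the error shown in the log.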
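The triage step above can be sketched as a small Python helper that scans the captured daemon output and maps it to one of the two failures covered in this article. This is only an illustrative sketch: the names `KNOWN_FAILURES` and `classify_dockerd_error` are hypothetical, and the matched substrings are taken from the log excerpts quoted in this article, not from any official list of Docker errors.

```python
# Hypothetical triage helper; the markers come from the log excerpts in this article.
KNOWN_FAILURES = {
    "ZONE_CONFLICT": "firewall zone conflict",
    "containerd.sock": "containerd socket problem",
}


def classify_dockerd_error(log_text):
    """Return a short label for the first 'failed to start daemon' line found."""
    for line in log_text.splitlines():
        if "failed to start daemon" not in line:
            continue  # skip debug and info lines
        for marker, label in KNOWN_FAILURES.items():
            if marker in line:
                return label
        return "unknown failure"
    return "no startup failure found"


if __name__ == "__main__":
    sample = (
        "failed to start daemon: Error initializing network controller: "
        "Error creating default \"bridge\" network: Failed to program NAT chain: "
        "ZONE_CONFLICT: 'docker0' already bound to a zone"
    )
    print(classify_dockerd_error(sample))
```

Any line that does not match a known marker is reported as unknown and still has to be analyzed by hand.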

Firewall-related problems

Log:

failed to start daemon: Error initializing network controller: Error creating default "bridge" network: Failed to program NAT chain: ZONE_CONFLICT: 'docker0' already bound to a zone

Cause:

A firewalld zone conflict prevents Docker from creating its virtual network.

Solution:

The debug output shows the specific cause, here the ZONE_CONFLICT error in the log above (if you hit a different error, analyze it case by case).

Remove the zone binding of the docker0 virtual interface:

firewall-cmd --zone=trusted --remove-interface=docker0 --permanent   # or --zone=public, whichever zone docker0 is bound to
firewall-cmd --reload
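Because the zone may be trusted or public depending on the host, it can help to look it up before removing the binding. The following is a minimal sketch, assuming firewalld is installed and that `firewall-cmd --permanent --get-zone-of-interface=docker0` reports the bound zone. The helper names `bound_zone` and `zone_unbind_commands` are hypothetical, and the script only prints the commands instead of running them.

```python
import subprocess
from typing import List, Optional


def bound_zone(interface: str = "docker0") -> Optional[str]:
    """Ask firewalld which permanent zone the interface is bound to, if any."""
    try:
        result = subprocess.run(
            ["firewall-cmd", "--permanent", f"--get-zone-of-interface={interface}"],
            capture_output=True,
            text=True,
        )
    except FileNotFoundError:
        return None  # firewall-cmd is not installed on this host
    if result.returncode != 0:
        return None  # firewalld reported no zone or could not be queried
    return result.stdout.strip() or None


def zone_unbind_commands(zone: str, interface: str = "docker0") -> List[List[str]]:
    """Build the two commands from this section for the given zone."""
    return [
        ["firewall-cmd", f"--zone={zone}", f"--remove-interface={interface}", "--permanent"],
        ["firewall-cmd", "--reload"],
    ]


if __name__ == "__main__":
    zone = bound_zone()
    if zone is None:
        print("docker0 is not bound to a permanent zone (or firewalld is unavailable)")
    else:
        for command in zone_unbind_commands(zone):
            print(" ".join(command))  # dry run: review, then run as root
```

Printing the commands first keeps the sketch safe to run on a production host; they still need to be executed as root to take effect.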

Socket file problems

Log:

DEBU[2019-10-10T14:35:29.262315604+08:00] Cleaning up old mountid : start.
failed to start daemon: failed to dial "/run/containerd/containerd.sock": all SubConns are in TransientFailure, latest connection error: connection error: desc = "transport: Error while dialing dial unix /run/containerd/containerd.sock: connect: connection refused": unavailabl

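Cause:

The two socket files /run/containerd/containerd.sock and /var/run/containerd/containerd.sock are missing.

Solution:

Docker cannot connect to /run/containerd/containerd.sock and does not create it automatically. A socket file can be created with mksocket, nc, or Python: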

  • sudo mksocket /run/containerd/containerd.sock
  • sudo nc -lU /run/containerd/containerd.sock
  • sudo python -c "import socket as s; sock = s.socket(s.AF_UNIX); sock.bind('/run/containerd/containerd.sock')"

Start Docker in debug mode again. If the same error appears, a server reboot may be needed to resolve it.
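Before retrying, it may be useful to check what state the socket path is actually in. The sketch below is an assumption-laden diagnostic, not part of Docker or containerd: the function name `unix_socket_state` and its return labels are hypothetical. It distinguishes a missing file from a socket file that exists but has no process accepting connections, which on Linux produces the same "connection refused" seen in the log.

```python
import os
import socket
import stat


def unix_socket_state(path: str) -> str:
    """Classify a Unix socket path.

    Returns 'missing', 'permission-denied', 'not-a-socket', 'refused', or 'listening'.
    """
    try:
        mode = os.stat(path).st_mode
    except FileNotFoundError:
        return "missing"
    except PermissionError:
        return "permission-denied"
    if not stat.S_ISSOCK(mode):
        return "not-a-socket"
    with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as client:
        try:
            client.connect(path)
        except ConnectionRefusedError:
            return "refused"  # the file exists but nothing is accepting connections
        except PermissionError:
            return "permission-denied"
    return "listening"


if __name__ == "__main__":
    # Path taken from the log in this section; run as root for a reliable answer.
    print(unix_socket_state("/run/containerd/containerd.sock"))
```

A result of "refused" suggests the file is present but no process is serving it, in which case creating the file again is unlikely to change the outcome.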
