The Docker service failing to start is a common problem after upgrading Docker:
# systemctl restart docker
Job for docker.service failed because the control process exited with error code. See "systemctl status docker.service" and "journalctl -xe" for details.
First, start the daemon from the command line to inspect the startup process:
sudo /usr/bin/dockerd --containerd=/run/containerd/containerd.sock
or
sudo dockerd --debug
This prints the entire Docker startup process. Handle each failure according to the error shown in the log.
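Matching a log against the known failures can be sketched in Python. This is a minimal illustration, not part of Docker: the function name and the category names are hypothetical, and the patterns are taken only from the two log excerpts in this article.

```python
import re

# Hypothetical category names; the patterns come from the two logs in this article.
KNOWN_FAILURES = [
    ("firewall_zone_conflict", re.compile(r"ZONE_CONFLICT")),
    ("containerd_socket", re.compile(r"failed to dial .*containerd\.sock")),
]


def classify_dockerd_error(log_text: str) -> str:
    """Return the name of the first known failure found in the log, else "unknown"."""
    for name, pattern in KNOWN_FAILURES:
        if pattern.search(log_text):
            return name
    return "unknown"
```

For example, the output of sudo dockerd --debug could be saved to a file and its contents passed to classify_dockerd_error to decide which of the sections below applies.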
Firewall-related issues
Log:
failed to start daemon: Error initializing network controller: Error creating default "bridge" network: Failed to program NAT chain: ZONE_CONFLICT: 'docker0' already bound to a zone
Cause:
A firewall zone conflict: docker0 is already bound to a zone, so Docker cannot program the NAT chain and fails to create the default bridge network.
Solution:
The log above states the specific cause (if you hit a different error, analyze that specific message instead).
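Remove the docker0 virtual interface from the firewall zone it is bound to. The command below uses the trusted zone; use --zone=public instead if docker0 is bound to public.
firewall-cmd --zone=trusted --remove-interface=docker0 --permanent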
firewall-cmd --reload
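Because the zone differs between hosts (trusted or public), the two commands above can be generated for whichever zone applies. The following is a small sketch that only builds the argument lists and does not run anything; the function name is hypothetical, and only the flags already shown above are used.

```python
from typing import List


def firewalld_unbind_commands(zone: str, interface: str = "docker0") -> List[List[str]]:
    """Build the two firewall-cmd invocations from this section for a given zone."""
    return [
        # Permanently remove the interface from the zone it is bound to.
        ["firewall-cmd", f"--zone={zone}", f"--remove-interface={interface}", "--permanent"],
        # Reload firewalld so the permanent change takes effect.
        ["firewall-cmd", "--reload"],
    ]


if __name__ == "__main__":
    # Print the commands for the public zone so they can be reviewed before running.
    for argv in firewalld_unbind_commands("public"):
        print(" ".join(argv))
```

Each list can be passed to subprocess.run with root privileges once the printed commands look right.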
Socket file issues
Log:
DEBU[2019-10-10T14:35:29.262315604+08:00] Cleaning up old mountid : start. failed to start daemon: failed to dial "/run/containerd/containerd.sock": all SubConns are in TransientFailure, latest connection error: connection error: desc = "transport: Error while dialing dial unix /run/containerd/containerd.sock: connect: connection refused": unavailabl
Cause:
The two socket files /run/containerd/containerd.sock and /var/run/containerd/containerd.sock are missing.
Solution:
Docker cannot connect to /run/containerd/containerd.sock and does not create it automatically. A socket file can be created with mksocket, nc, or Python:
sudo mksocket /run/containerd/containerd.sock
sudo nc -lU /run/containerd/containerd.sock
python -c "import socket as s; sock = s.socket(s.AF_UNIX); sock.bind('/run/containerd/containerd.sock')"
Then start Docker in debug mode again. If the same error appears, rebooting the server may be necessary.
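Before rebooting, it can help to check what state the socket path is actually in. For a Unix socket, "connection refused" generally means the file exists but no process is listening on it, while a missing file produces a "no such file" error instead. The sketch below distinguishes these cases; the function name and the returned labels are hypothetical, and other errors (for example, permission denied) are left to propagate.

```python
import socket


def probe_unix_socket(path: str) -> str:
    """Try to connect to a Unix socket and report "missing", "refused", or "listening"."""
    client = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    try:
        client.connect(path)
    except FileNotFoundError:
        # No file exists at this path.
        return "missing"
    except ConnectionRefusedError:
        # The socket file exists, but no process is accepting connections on it.
        return "refused"
    finally:
        client.close()
    return "listening"
```

Calling probe_unix_socket("/run/containerd/containerd.sock") as root shows which case applies. A result of "refused" suggests that creating the file by hand did not help, because nothing is serving the socket.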