Common Problems When Setting Up a k8s Cluster
一个想当厨子的码农 Lv2

Summary of Problems When Building a Kubernetes Cluster with kubeadm

Problem 1: kubeadm init fails with [ERROR CRI]: container runtime is not running

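  1. Kubernetes uses the crictl command to interact with the CRI runtime; check its configuration file, /etc/crictl.yaml. This file does not exist initially. Adding it is recommended, otherwise kubeadm init will report further errors.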
# Use only ONE of the two blocks below. With containerd as the runtime:
cat > /etc/crictl.yaml <<EOF
runtime-endpoint: unix:///var/run/containerd/containerd.sock
image-endpoint: unix:///var/run/containerd/containerd.sock
timeout: 0
debug: false
pull-image-on-create: false
EOF
# With cri-dockerd as the runtime:
cat > /etc/crictl.yaml <<EOF
runtime-endpoint: unix:///var/run/cri-dockerd.sock
image-endpoint: unix:///var/run/cri-dockerd.sock
timeout: 0
debug: false
pull-image-on-create: false
EOF

# Verify: crictl should now list images without errors
crictl images
  2. Check the configuration file /etc/containerd/config.toml:
Change disabled_plugins = ["cri"] to disabled_plugins = []

Restart containerd:

systemctl restart containerd
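
The edit in step 2 can also be scripted. The following is a minimal sketch, assuming GNU sed and a config file whose disabled_plugins entry sits on a single line. enable_cri_plugin is a hypothetical helper name, not part of containerd. Like the manual edit, it empties the whole list, so any other disabled plugin would be re-enabled as well.

```shell
# Hypothetical helper: re-enable the CRI plugin in a containerd config file.
# Argument 1 (optional): path to the config file.
enable_cri_plugin() {
  local config="${1:-/etc/containerd/config.toml}"
  if grep -Eq '^[[:space:]]*disabled_plugins[[:space:]]*=.*"cri"' "$config"; then
    # Keep a backup before touching the file.
    cp "$config" "$config.bak"
    # Replace the whole list with an empty one (assumption: nothing else must stay disabled).
    sed -i -E 's/^([[:space:]]*disabled_plugins[[:space:]]*=[[:space:]]*).*/\1[]/' "$config"
    echo "CRI plugin re-enabled in $config"
  else
    echo "CRI plugin is not disabled in $config"
  fi
}

# Example usage (run as root):
# enable_cri_plugin && systemctl restart containerd
```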

Problem 2: ERROR FileContent--proc-sys-net-bridge-bridge-nf-call-iptables

Cause: bridge netfilter and IP forwarding are not enabled.

Solution:

cat > /etc/sysctl.d/kubernetes.conf << EOF
net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1
net.ipv4.ip_forward = 1
EOF

# Then apply the settings
sysctl --system

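If the commands above report that /proc/sys/net/bridge/bridge-nf-call-iptables does not exist, the br_netfilter kernel module is not loaded. Load it with the command below, then run sysctl --system again.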

modprobe br_netfilter
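
modprobe only loads the module until the next reboot. One possible way to make it persistent is sketched below, assuming a systemd-based distribution that reads /etc/modules-load.d/ at boot. The file name k8s.conf and the helper name persist_br_netfilter are arbitrary choices for illustration.

```shell
# Hypothetical helper: ask the system to load br_netfilter on every boot.
# Argument 1 (optional): directory holding modules-load configuration files.
persist_br_netfilter() {
  local dir="${1:-/etc/modules-load.d}"
  mkdir -p "$dir"
  # One module name per line; the file name is an assumption.
  printf 'br_netfilter\n' > "$dir/k8s.conf"
}

# Example usage (run as root):
# persist_br_netfilter && modprobe br_netfilter && sysctl --system
```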

Problem 3: kubelet fails with failed to run Kubelet: running with swap on is not supported

swapoff -a
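
swapoff -a only disables swap until the next reboot. To keep it off, the swap entries in /etc/fstab can be commented out. The following is a rough sketch, assuming GNU sed and a conventional fstab in which the filesystem type column reads swap. comment_out_swap is a hypothetical helper name.

```shell
# Hypothetical helper: comment out active swap entries in an fstab file.
# Argument 1 (optional): path to the fstab file.
comment_out_swap() {
  local fstab="${1:-/etc/fstab}"
  # A backup is written to "$fstab.bak"; lines already commented out are left alone.
  sed -i.bak -E '/^[^#].*[[:space:]]swap[[:space:]]/ s/^/#/' "$fstab"
}

# Example usage (run as root):
# swapoff -a && comment_out_swap
```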

Problem 4: kubeadm init reports that some configuration files already exist

Solution:

kubeadm reset
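
kubeadm reset prints a notice that it does not clean up CNI configuration or kubeconfig files (nor iptables rules, which are not handled here). A minimal sketch of removing the first two by hand follows. cleanup_after_reset is a hypothetical helper, and the default paths are the ones named in that notice. It deletes files, so check the paths before running it.

```shell
# Hypothetical helper: remove leftovers that kubeadm reset does not clean.
# Argument 1 (optional): CNI configuration directory.
# Argument 2 (optional): kubeconfig file of the current user.
cleanup_after_reset() {
  local cni_dir="${1:-/etc/cni/net.d}"
  local kubeconfig="${2:-$HOME/.kube/config}"
  rm -rf "$cni_dir"
  rm -f "$kubeconfig"
}

# Example usage (run as root, after kubeadm reset):
# cleanup_after_reset
```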

Problem 5: during kubeadm init, kubelet reports that the crictl --runtime-endpoint setting is wrong

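The log shows that the crictl command is failing: unix:///var/run/containerd/containerd.sock does not exist. Running crictl by hand produces the same error. The cause is that crictl, when pulling images, uses its default endpoints [unix:///var/run/dockershim.sock unix:///run/containerd/containerd.sock unix:///run/crio/crio.sock unix:///var/run/cri-dockerd.sock]. Relying on these defaults is deprecated, so the containerd.sock endpoint has to be specified explicitly. The errors that follow are simply dockershim.sock not being found.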

Solution: edit the crictl configuration file

# Use only ONE of the two blocks below. With containerd as the runtime:
cat > /etc/crictl.yaml <<EOF
runtime-endpoint: unix:///var/run/containerd/containerd.sock
image-endpoint: unix:///var/run/containerd/containerd.sock
timeout: 0
debug: false
pull-image-on-create: false
EOF
# With cri-dockerd as the runtime:
cat > /etc/crictl.yaml <<EOF
runtime-endpoint: unix:///var/run/cri-dockerd.sock
image-endpoint: unix:///var/run/cri-dockerd.sock
timeout: 0
debug: false
pull-image-on-create: false
EOF

# Verify: crictl should now list images without errors
crictl images
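
Since the right endpoint depends on which runtime is actually installed, the choice between the two variants can be scripted. The sketch below picks the first candidate path that exists as a unix socket and writes /etc/crictl.yaml for it. detect_cri_socket and write_crictl_config are hypothetical helper names, and the candidate paths are the two used in this article.

```shell
# Hypothetical helper: print the first candidate path that is a unix socket.
detect_cri_socket() {
  for sock in "$@"; do
    if [ -S "$sock" ]; then
      printf '%s\n' "$sock"
      return 0
    fi
  done
  return 1
}

# Hypothetical helper: write a crictl config pointing both endpoints at a socket.
# Argument 1: socket path. Argument 2 (optional): config file to write.
write_crictl_config() {
  local sock="$1"
  local target="${2:-/etc/crictl.yaml}"
  cat > "$target" <<EOF
runtime-endpoint: unix://$sock
image-endpoint: unix://$sock
timeout: 0
debug: false
pull-image-on-create: false
EOF
}

# Example usage (run as root):
# sock=$(detect_cri_socket /var/run/containerd/containerd.sock /var/run/cri-dockerd.sock) \
#   && write_crictl_config "$sock" && crictl images
```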

Problem 6: warning Usage of CRI endpoints without URL scheme is deprecated and can cause kubelet errors in the future.

W0504 10:46:57.238606 6046 initconfiguration.go:120] Usage of CRI endpoints without URL scheme is deprecated and can cause kubelet errors in the future. Automatically prepending scheme "unix" to the "criSocket" with value "/var/run/cri-dockerd.sock". Please update your configuration!

If containerd is the runtime, run:

sudo crictl config \
--set runtime-endpoint=unix:///var/run/containerd/containerd.sock \
--set image-endpoint=unix:///var/run/containerd/containerd.sock
# or write the file directly:
cat > /etc/crictl.yaml <<EOF
runtime-endpoint: unix:///var/run/containerd/containerd.sock
image-endpoint: unix:///var/run/containerd/containerd.sock
timeout: 0
debug: false
pull-image-on-create: false
EOF

If cri-dockerd is the runtime, run:

sudo crictl config \
--set runtime-endpoint=unix:///var/run/cri-dockerd.sock \
--set image-endpoint=unix:///var/run/cri-dockerd.sock
# or write the file directly:
cat > /etc/crictl.yaml <<EOF
runtime-endpoint: unix:///var/run/cri-dockerd.sock
image-endpoint: unix:///var/run/cri-dockerd.sock
timeout: 0
debug: false
pull-image-on-create: false
EOF
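
The warning text refers to kubeadm's own criSocket value, so if it persists after fixing crictl it may also help to pass the socket to kubeadm with an explicit unix:// scheme. The sketch below mirrors the automatic prepending described in the warning. with_unix_scheme is a hypothetical helper, while --cri-socket is kubeadm's existing flag.

```shell
# Hypothetical helper: prepend unix:// to a socket path unless a scheme is already present.
with_unix_scheme() {
  case "$1" in
    *://*) printf '%s\n' "$1" ;;
    *)     printf 'unix://%s\n' "$1" ;;
  esac
}

# Example usage (run as root):
# kubeadm init --cri-socket "$(with_unix_scheme /var/run/cri-dockerd.sock)"
```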

Problem 7: the pause image cannot be pulled

Following the hint in the log, running crictl --runtime-endpoint unix:///var/run/containerd/containerd.sock ps -a shows that no containers are running. The containerd log contains the following errors:

journalctl -fu containerd
...
Oct 11 08:35:16 master.k8s containerd[1903]: time="2023-10-11T08:35:16.760026536+08:00" level=error msg="RunPodSandbox for &PodSandboxMetadata{Name:kube-apiserver-node,Uid:a5a7c15a42701ab6c9dca630e6523936,Namespace:kube-system,Attempt:0,} failed, error" error="failed to get sandbox image \"registry.k8s.io/pause:3.6\": failed to pull image \"registry.k8s.io/pause:3.6\": failed to pull and unpack image \"registry.k8s.io/pause:3.6\": failed to resolve reference \"registry.k8s.io/pause:3.6\": failed to do request: Head \"https://asia-east1-docker.pkg.dev/v2/k8s-artifacts-prod/images/pause/manifests/3.6\": dial tcp 108.177.125.82:443: connect: connection refused"
Oct 11 08:35:18 master.k8s containerd[1903]: time="2023-10-11T08:35:18.606581001+08:00" level=info msg="trying next host" error="failed to do request: Head \"https://asia-east1-docker.pkg.dev/v2/k8s-artifacts-prod/images/pause/manifests/3.6\": dial tcp 108.177.125.82:443: connect: connection refused" host=registry.k8s.io
...

The error shows that containerd failed to pull the image: error="failed to get sandbox image \"registry.k8s.io/pause:3.6\""

Solution: modify the containerd configuration

  • Run containerd config dump > /etc/containerd/config.toml to export the current configuration to a file, then change the sandbox_image setting.
## Edit /etc/containerd/config.toml and change the sandbox_image setting
[plugins]
[plugins."io.containerd.grpc.v1.cri"]
sandbox_image = "registry.aliyuncs.com/google_containers/pause:3.9"
  • Restart containerd
systemctl restart containerd

# Check the current containerd configuration to verify that the pause image setting took effect
containerd config dump | grep pause
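
The sandbox_image change can also be applied without opening an editor. This is a minimal sketch, assuming GNU sed and a config file that already contains a sandbox_image line (as a dumped config does). set_sandbox_image is a hypothetical helper name, and the image shown in the usage line is the mirror used in this article.

```shell
# Hypothetical helper: point sandbox_image at a different pause image.
# Argument 1: image reference. Argument 2 (optional): containerd config file.
set_sandbox_image() {
  local image="$1"
  local config="${2:-/etc/containerd/config.toml}"
  # Keep the original indentation and key, replace only the value.
  sed -i -E "s|^([[:space:]]*sandbox_image[[:space:]]*=[[:space:]]*).*|\1\"$image\"|" "$config"
}

# Example usage (run as root):
# set_sandbox_image registry.aliyuncs.com/google_containers/pause:3.9 \
#   && systemctl restart containerd
```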