kubelet keeps restarting after startup, and some CSR requests fail #630

Open
koktlzz opened this issue Jun 23, 2021 · 1 comment

Comments

koktlzz commented Jun 23, 2021

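Document version

  • Kubernetes v1.16.6
  • Containerd v1.3.3, without Docker

Symptoms

All steps before 06-4 have been completed, and the author's verification checks showed no problems. However, kubelet exits and restarts after it starts, with the following error: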

failed to create kubelet: unknown service runtime.v1alpha2.RuntimeService

In addition, when checking the nodes' CSRs, I found that the requests from the system:bootstrappers group have not been processed; only requests from system:nodes appear:

[root@gwr-k8s-01 work]# kubectl get csr
NAME        AGE   REQUESTOR                CONDITION
csr-49c2n   10s   system:node:gwr-k8s-01   Pending
csr-5ht6l   9s    system:node:gwr-k8s-02   Pending
csr-jtx5l   8s    system:node:gwr-k8s-03   Pending
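
To act on a listing like the one above, the Pending entries can be pulled out by name. The following is a minimal sketch, assuming the tabular `kubectl get csr` output shown here, with CONDITION as the last column; `pending_csrs` is a hypothetical helper name, not part of kubectl. Whether these requests should be approved by hand or left to the auto-approve bindings depends on how the cluster is set up, so treat the approve step as an assumption.

```shell
# Hypothetical helper: read `kubectl get csr` output on stdin and print
# the NAME of every request whose last column is exactly "Pending".
pending_csrs() {
    # NR > 1 skips the header row; $NF is the CONDITION column.
    awk 'NR > 1 && $NF == "Pending" { print $1 }'
}

# Example usage (assumes kubectl is configured for this cluster):
#   kubectl get csr | pending_csrs
#   kubectl get csr | pending_csrs | xargs -r kubectl certificate approve
```

Matching on the last column rather than a fixed position keeps the helper working if kubectl prints additional columns before CONDITION.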

Attempted solutions

When I installed Containerd and crictl following the author's steps, the crictl command also reported a similar error:

FATA[0000] listing images failed: rpc error: code = Unimplemented desc = unknown service runtime.v1alpha2.ImageService
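
An `Unimplemented ... unknown service` reply means the socket answered but no such gRPC service is registered behind it. One possible cause, stated here as an assumption rather than a diagnosis of this cluster, is that the containerd config lists `cri` under `disabled_plugins`. A minimal sketch of that check follows; `cri_plugin_disabled` is a hypothetical helper, and the config path is passed in because it varies between installs.

```shell
# Hypothetical helper: succeed (exit 0) if the given containerd config file
# lists "cri" in a disabled_plugins array written on a single line.
cri_plugin_disabled() {
    grep -Eq '^[[:space:]]*disabled_plugins[[:space:]]*=.*"cri"' "$1"
}

# Example usage (the path is an assumption; use the file containerd is started with):
#   if cri_plugin_disabled /etc/containerd/config.toml; then
#       echo "cri plugin is disabled in the config"
#   fi
#
# The loaded plugins can also be listed; the cri entry should show status "ok":
#   ctr plugins ls | grep cri
```

If the plugin is loaded and crictl works, it may also be worth confirming that kubelet and crictl point at the same containerd socket.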

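I found related solutions online, such as changing the snapshotter in the config file and resetting the config file, and these did resolve the crictl error. However, kubelet still exits and restarts with the same error. I also confirmed that the corresponding rolebindings were created before startup: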

[root@gwr-k8s-01 work]# kubectl get clusterrolebinding | head
NAME                                                   AGE
auto-approve-csrs-for-group                            7d
cluster-admin                                          7d
kube-apiserver:kubelet-apis                            7d
kubelet-bootstrap                                      7d
node-client-cert-renewal                               7d
node-server-cert-renewal                               7d
system:basic-user                                      7d
system:controller:attachdetach-controller              7d
system:controller:certificate-controller               7d

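I also checked the Containerd logs and found that the CNI configuration is missing. I believe this is expected, though, since the CNI plugin has not been installed yet: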

level=error msg="Failed to load cni configuration" error="cni co...i config"

@lujf0910

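1. Check whether the kubelet executable exists on your current node (the worker node). Following the author's article, check whether kubelet is present in the /opt/k8s/bin/ directory.
2. Check whether the kubelet-bootstrap user or the system:bootstrappers group has been bound to a cluster role.
3. Check whether the apiserver's user has been bound to a role.
For steps 2 and 3, just follow the author's example strictly; of course, there are some parameters you can change yourself.
In my case, I forgot to copy kubelet in step 1, so I could not track down the problem no matter what I tried. If I had not started it through systemd, I probably would have seen the problem immediately.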
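
The first check above can be scripted, and the other two map onto read-only kubectl queries. This is a minimal sketch, assuming the `/opt/k8s/bin/` layout mentioned above and the `kubelet-bootstrap` and `kube-apiserver:kubelet-apis` binding names that appear in the listing earlier in this issue; `has_kubelet_binary` is a hypothetical helper.

```shell
# Hypothetical helper: succeed if DIR contains an executable regular file named kubelet.
has_kubelet_binary() {
    [ -f "$1/kubelet" ] && [ -x "$1/kubelet" ]
}

# Example usage on a worker node:
#   has_kubelet_binary /opt/k8s/bin || echo "kubelet is missing or not executable"
#
# Checks 2 and 3 (run where kubectl is configured):
#   kubectl describe clusterrolebinding kubelet-bootstrap
#   kubectl describe clusterrolebinding kube-apiserver:kubelet-apis
#
# To read the startup failure directly rather than inferring it from restarts
# (assumes the systemd unit is named kubelet):
#   journalctl -u kubelet --no-pager | tail -n 50
```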
