1. What kops version are you running? The command kops version will display
this information.
1.30.1
2. What Kubernetes version are you running? kubectl version will print the
version if a cluster is running or provide the Kubernetes version specified as
a kops flag.
v1.29.3
3. What cloud provider are you using?
AWS
4. What commands did you run? What is the simplest way to reproduce this issue?
kops upgrade cluster --name XXX --kubernetes-version 1.29.9 --yes
kops --name XXX update cluster --yes --admin
kops --name XXX rolling-update cluster --yes
5. What happened after the commands executed?
Cluster did not pass validation at the very beginning of the upgrade procedure:
$ kops rolling-update cluster --yes --name XXX
Detected single-control-plane cluster; won't detach before draining
NAME STATUS NEEDUPDATE READY MIN TARGET MAX NODES
control-plane-us-west-2c NeedsUpdate 1 0 1 1 1 1
nodes-us-west-2c NeedsUpdate 4 0 4 4 4 4
I1002 15:03:05.336312 37988 instancegroups.go:507] Validating the cluster.
I1002 15:03:29.806323 37988 instancegroups.go:566] Cluster did not pass validation, will retry in "30s": system-cluster-critical pod "aws-node-termination-handler-577f866468-mmlx7" is pending.
I1002 15:04:22.511826 37988 instancegroups.go:566] Cluster did not pass validation, will retry in "30s": system-cluster-critical pod "aws-node-termination-handler-577f866468-mmlx7" is pending.
[...]
I1002 15:18:58.830547 37988 instancegroups.go:563] Cluster did not pass validation within deadline: system-cluster-critical pod "aws-node-termination-handler-577f866468-mmlx7" is pending.
E1002 15:18:58.830585 37988 instancegroups.go:512] Cluster did not validate within 15m0s
Error: control-plane node not healthy after update, stopping rolling-update: "error validating cluster: cluster did not validate within a duration of \"15m0s\""
When I looked up why the pod was pending, I found the following in the output of "kubectl describe pod aws-node-termination-handler-577f866468-mmlx7":
0/5 nodes are available: 1 node(s) didn't have free ports for the requested pod ports, 4 node(s) didn't match Pod's node affinity/selector. preemption: 0/5 nodes are available: 1 node(s) didn't have free ports for the requested pod ports, 4 Preemption is not helpful for scheduling.
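The "didn't have free ports" message suggests the new pod requests a hostPort that the old pod is still holding on the single control-plane node. A way to confirm this (untested sketch; assumes the addon is deployed as a Deployment named aws-node-termination-handler in kube-system, consistent with the pod names above):

```shell
# Show which host ports the handler's pod template requests;
# a non-empty result would explain the scheduling conflict with the old pod.
kubectl -n kube-system get deployment aws-node-termination-handler \
  -o jsonpath='{.spec.template.spec.containers[*].ports[*].hostPort}'
```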
There is another aws-node-termination-handler pod still running at the moment (the old one):
$ kubectl -n kube-system get pods -l k8s-app=aws-node-termination-handler
NAME READY STATUS RESTARTS AGE
aws-node-termination-handler-577f866468-mmlx7 0/1 Pending 0 69m
aws-node-termination-handler-6c9c8d7948-fxsrl 1/1 Running 1338 (4h1m ago) 133d
6. What did you expect to happen?
I expected the cluster to be upgraded to Kubernetes 1.29.9.
7. Please provide your cluster manifest. Execute kops get --name my.example.com -o yaml to display your cluster manifest.
You may want to remove your cluster name and other sensitive information.
8. Please run the commands with most verbose logging by adding the -v 10 flag.
Paste the logs into this report, or in a gist and provide the gist link here.
Please see the validation log above.
9. Anything else do we need to know?
Now I would like to know how to recover from this situation and how to get rid of the aws-node-termination-handler-577f866468-mmlx7 pod, which is now stuck in the Pending state.
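For reference, one recovery path I am considering (untested; assumes the root cause is a hostPort held by the old pod, and uses the pod names from the listing above):

```shell
# Delete the old handler pod so the Pending replacement can bind the host
# port and schedule, then resume the interrupted rolling update.
kubectl -n kube-system delete pod aws-node-termination-handler-6c9c8d7948-fxsrl

# Verify the new pod leaves Pending before continuing.
kubectl -n kube-system get pods -l k8s-app=aws-node-termination-handler

# Re-run the rolling update once validation can pass.
kops --name XXX rolling-update cluster --yes
```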
/kind bug