I am trying to use the NodeRepair feature gate. It is correctly enabled on my controller pod (FEATURE_GATES: SpotToSpotConsolidation=false,NodeRepair=true), but it does not work as I expected.
Is enabling the feature gate all that is needed for it to work? I am fairly sure a node has to be unready for 30 minutes before it is considered unhealthy, but I cannot find any documentation on this point.
Also, does it look at the node object or at nodeclaims.karpenter.sh? I ask because my node is Ready but my NodeClaim is not:
╰─➤ k get node ip-10-34-6-212.eu-west-3.compute.internal
NAME STATUS ROLES AGE VERSION
ip-10-34-6-212.eu-west-3.compute.internal Ready <none> 129m v1.30.6-eks-94953ac
╰─➤ k get nodeclaims.karpenter.sh std-linux-cpu-h4g4s
NAME TYPE CAPACITY ZONE NODE READY AGE
std-linux-cpu-h4g4s m7i.8xlarge spot eu-west-3a ip-10-34-6-212.eu-west-3.compute.internal Unknown 129m
Regards,
Please vote on this issue by adding a 👍 reaction to the original issue to help the community and maintainers prioritize this request
Please do not leave "+1" or "me too" comments, they generate extra noise for issue followers and do not help prioritize the request
If you are interested in working on this issue or have submitted a pull request, please leave a comment
The node repair feature works by looking at node readiness. Karpenter only acts on nodes that have not been ready for 30 minutes. The NodeClaim showing Unknown is a known bug that we are planning to fix. The issue there is how Karpenter marks the status, rather than any problem with the NodeClaim itself: #7494
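To make the rule above concrete, here is a minimal Python sketch of the check as described in this reply: look at the node's Ready condition and see whether it has been something other than True for at least 30 minutes. It is only an illustration, not Karpenter's actual implementation. The function names and the `UNHEALTHY_AFTER` constant are made up for this example, and the 30-minute value is taken from the reply rather than from the source code.

```python
from datetime import datetime, timedelta, timezone

# Assumption: 30 minutes, as stated in the reply above (not read from Karpenter's code).
UNHEALTHY_AFTER = timedelta(minutes=30)


def parse_k8s_time(value: str) -> datetime:
    """Parse a Kubernetes timestamp such as 2024-12-10T09:15:00Z into an aware datetime."""
    return datetime.strptime(value, "%Y-%m-%dT%H:%M:%SZ").replace(tzinfo=timezone.utc)


def is_repair_candidate(node: dict, now: datetime,
                        threshold: timedelta = UNHEALTHY_AFTER) -> bool:
    """Return True if the node's Ready condition has been non-True for at least `threshold`.

    `node` is the parsed JSON of a Node object (hypothetical helper, for illustration only).
    """
    conditions = node.get("status", {}).get("conditions", [])
    ready = next((c for c in conditions if c.get("type") == "Ready"), None)
    if ready is None:
        # No Ready condition reported at all; this sketch does not treat that as unhealthy.
        return False
    if ready.get("status") == "True":
        # A Ready node is never a candidate, whatever its NodeClaim shows.
        return False
    not_ready_since = parse_k8s_time(ready["lastTransitionTime"])
    return now - not_ready_since >= threshold


if __name__ == "__main__":
    example_node = {
        "status": {
            "conditions": [
                {"type": "Ready", "status": "False",
                 "lastTransitionTime": "2024-12-10T09:00:00Z"},
            ]
        }
    }
    now = datetime(2024, 12, 10, 9, 45, tzinfo=timezone.utc)
    print(is_repair_candidate(example_node, now))
```

To try it against a real node, the output of `kubectl get node <name> -o json` can be loaded with `json.load` and passed in as `node`. Note that the check reads only the Node object, which matches the statement that the feature looks at node readiness and not at the NodeClaim's READY column.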
With 1.1.1, the NodeClaim still shows Unknown while my node still has its startupTaint:
╰─➤ k get nodeclaims.karpenter.sh std-linux-core-dpxmc
NAME TYPE CAPACITY ZONE NODE READY AGE
std-linux-core-dpxmc t3a.medium spot eu-west-3b ip-10-157-24-133.eu-west-3.compute.internal Unknown 51m
╰─➤ k get node ip-10-157-24-133.eu-west-3.compute.internal
NAME STATUS ROLES AGE VERSION
ip-10-157-24-133.eu-west-3.compute.internal Ready <none> 50m v1.30.7-eks-59bf375
╰─➤ k get nodes ip-10-157-24-133.eu-west-3.compute.internal -o json | jq '.spec.taints | length'
1
The node repair feature works by looking at the node readiness
So there is no way to automatically destroy the NodeClaim if its Ready status is not True?