feat: RFC Implementation Supporting ODCR #6198

tvonhacht-apple · 2024-05-14T05:57:11Z

This PR is a collaboration on implementing RFC #5716, Supporting ODCRs.

Progress

EC2NodeClass
- Capacity Reservation
  - Selector Terms
  - Status update
  - Provisioning
    - with available Capacity Reservation
    - fallback to On-Demand without available Capacity Reservation
  - Consolidation
    - feat: RFC Implementation Supporting AWS On-Demand Capacity Reservations kubernetes-sigs/karpenter#1263 (outdated)

TODOs

- [ ] Agree on instanceMatchCriteria vs type
- [ ] Remove all log statements

Description

Support associating ODCRs with an EC2NodeClass

Add a new field capacityReservationSelectorTerms to EC2NodeClass

apiVersion: karpenter.k8s.aws/v1beta1
kind: EC2NodeClass
metadata:
  name: example-node-class
spec:
  capacityReservationSelectorTerms:
    - # The Availability Zone of the Capacity Reservation
      availabilityZone: String | None
      # The ID of the Capacity Reservation
      id: String | None
      # The instance type for which the Capacity Reservation reserves capacity
      instanceType: String | None
      # The ID of the Amazon Web Services account that owns the Capacity Reservation
      ownerId: String | None
      # Tags is a map of key/value tags used to select Capacity Reservations
      # Specifying '*' for a value selects all values for a given tag key.
      tags: Map | None
      # Indicates the type of instance launches that the Capacity Reservation accepts. The options include:
      #    - open:
      #       The Capacity Reservation accepts all instances that have
      #       matching attributes (instance type, platform, and Availability
      #       Zone). Instances that have matching attributes launch into the
      #       Capacity Reservation automatically without specifying any
      #       additional parameters.
      #    - targeted:
      #       The Capacity Reservation only accepts instances that
      #       have matching attributes (instance type, platform, and
      #       Availability Zone), and explicitly target the Capacity
      #       Reservation. This ensures that only permitted instances can use
      #       the reserved capacity.
      type: String | None
status:
  capacityReservations:
    - # AvailabilityZone of the Capacity Reservation
      availabilityZone: String
      # Available Instance Count of the Capacity Reservation
      availableInstanceCount: Integer
      # The date and time at which the Capacity Reservation expires. When a Capacity
      # Reservation expires, the reserved capacity is released and you can no longer
      # launch instances into it. The Capacity Reservation's state changes to expired
      # when it reaches its end date and time.
      endDate: String | None
      # Indicates the way in which the Capacity Reservation ends. A Capacity Reservation
      # can have one of the following end types:
      #   * unlimited - The Capacity Reservation remains active until you explicitly
      #   cancel it.
      #   * limited - The Capacity Reservation expires automatically at a specified
      #   date and time.
      endDateType: String
      # ID of the Capacity Reservation
      id: String
      # Indicates the type of instance launches that the Capacity Reservation accepts. The options include:
      #   - open:
      #       The Capacity Reservation accepts all instances that have
      #       matching attributes (instance type, platform, and Availability
      #       Zone). Instances that have matching attributes launch into the
      #       Capacity Reservation automatically without specifying any
      #       additional parameters.
      #   - targeted:
      #       The Capacity Reservation only accepts instances that
      #       have matching attributes (instance type, platform, and
      #       Availability Zone), and explicitly target the Capacity
      #       Reservation. This ensures that only permitted instances can use
      #       the reserved capacity.
      instanceMatchCriteria: String
      # Instance Platform of the Capacity Reservation
      instancePlatform: String
      # Instance Type of the Capacity Reservation
      instanceType: String
      # Owner Id of the Capacity Reservation
      ownerId: String
      # The date and time at which the Capacity Reservation was started.
      startDate: String
      # Total Instance Count of the Capacity Reservation
      totalInstanceCount: Integer

Add a new label `karpenter.k8s.aws/capacity-reservation-id` to NodeClaims and Nodes

In order for Karpenter (and admins) to understand whether a NodeClaim/Node is part of an ODCR, Karpenter adds the label karpenter.k8s.aws/capacity-reservation-id containing the Capacity Reservation ID (for example cr-12345678).

The selector terms closely follow the filters supported by EC2 DescribeCapacityReservations, though not all filter fields are implemented.

The only exception is that instanceMatchCriteria is called type.

Karpenter will perform validation against the spec to ensure there isn't any violation prior to creating the LaunchTemplates.

How was this change tested?

Does this change impact docs?

Yes, PR includes docs updates
Yes, issue opened: #
No

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

netlify · 2024-05-14T05:57:28Z

✅ Deploy Preview for karpenter-docs-prod canceled.

Name	Link
🔨 Latest commit	`28e98c0`
🔍 Latest deploy log	https://app.netlify.com/sites/karpenter-docs-prod/deploys/66fc43ee48ddc40008601ca1

tvonhacht-apple · 2024-05-20T21:27:06Z

pkg/providers/amifamily/resolver.go

@@ -154,12 +157,63 @@ func (r Resolver) Resolve(ctx context.Context, nodeClass *v1beta1.EC2NodeClass,
 				maxPods: int(instanceType.Capacity.Pods().Value()),
 			}
 		})
+
+		zones := scheduling.NewNodeSelectorRequirementsWithMinValues(nodeClaim.Spec.Requirements...).Get(v1.LabelTopologyZone)
+		capacityReservations := []v1beta1.CapacityReservation{}


I think we can handle this within the if below; we don't need to create this variable beforehand.

jonathan-innis · 2024-06-04T07:39:25Z

pkg/cloudprovider/cloudprovider.go

@@ -96,6 +96,9 @@ func (c *CloudProvider) Create(ctx context.Context, nodeClaim *corev1beta1.NodeC
 	}
 	instance, err := c.instanceProvider.Create(ctx, nodeClass, nodeClaim, instanceTypes)
 	if err != nil {
+		if cloudprovider.IsInsufficientCapacityError(err) {


If we already get an ICE error back, we shouldn't have to wrap the error again. When we do a check down the line, it should be able to identify that the error is an ICE error, so long as one of the wrapped errors is.

From my testing that wasn't working, as it becomes a normal error. See line 102, where it's just a string at that point.
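The disagreement above comes down to how the error is wrapped. The following is a minimal sketch, assuming a hypothetical `InsufficientCapacityError` type that stands in for the ICE error in Karpenter's cloudprovider package (the helper names are illustrative, not taken from this PR). It shows that wrapping with `%w` keeps the ICE detectable through `errors.As`, while formatting with `%v` flattens it to a string.

```go
package main

import (
	"errors"
	"fmt"
)

// InsufficientCapacityError is a hypothetical stand-in for the ICE error type;
// it wraps the underlying cause.
type InsufficientCapacityError struct{ err error }

func (e *InsufficientCapacityError) Error() string { return e.err.Error() }
func (e *InsufficientCapacityError) Unwrap() error { return e.err }

// newICE builds an ICE error carrying the given message.
func newICE(msg string) error {
	return &InsufficientCapacityError{err: errors.New(msg)}
}

// isICE reports whether any error in err's chain is an InsufficientCapacityError.
func isICE(err error) bool {
	var ice *InsufficientCapacityError
	return errors.As(err, &ice)
}

// wrapPreserving uses %w, so the original error stays in the chain.
func wrapPreserving(err error) error {
	return fmt.Errorf("creating instance, %w", err)
}

// wrapFlattening uses %v, so only the text survives and the type is lost.
func wrapFlattening(err error) error {
	return fmt.Errorf("creating instance, %v", err)
}

func main() {
	base := newICE("ReservationCapacityExceeded")
	fmt.Println(isICE(wrapPreserving(base))) // true
	fmt.Println(isICE(wrapFlattening(base))) // false
}
```

If the error at the referenced line is built with `%v` or `%s`, re-wrapping would be needed, as reported; switching that call to `%w` would likely make the extra wrap unnecessary.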

jonathan-innis · 2024-06-04T07:44:15Z

pkg/providers/amifamily/resolver.go

+
+		zones := scheduling.NewNodeSelectorRequirementsWithMinValues(nodeClaim.Spec.Requirements...).Get(v1.LabelTopologyZone)
+		capacityReservations := []v1beta1.CapacityReservation{}
+		if capacityType == "capacity-reservation" {


If we select a capacity reservation NodeClaim, should we just best-effort the NodeClaim launch here and then have it get deleted with an ICE error from Fleet if there isn't any available capacity? The next iteration of GetInstanceTypes should have the updated capacity reservation availability, so we shouldn't try to launch with the same offering again on the second attempt.

The benefit of this is that it becomes visible in NodeClaims that something is happening?


jonathan-innis · 2024-06-04T07:44:37Z

pkg/providers/amifamily/resolver.go

+		zones := scheduling.NewNodeSelectorRequirementsWithMinValues(nodeClaim.Spec.Requirements...).Get(v1.LabelTopologyZone)
+		capacityReservations := []v1beta1.CapacityReservation{}
+		if capacityType == "capacity-reservation" {
+			for _, capacityReservation := range nodeClass.Status.CapacityReservations {


lo.Filter?
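For reference, `lo.Filter` from github.com/samber/lo takes a slice and a predicate of the form `func(item T, index int) bool`. The sketch below uses a local `filter` helper with the same shape so that it runs without the dependency. The trimmed-down `CapacityReservation` struct and the `availableInZones` function are illustrative assumptions, not the code in this PR.

```go
package main

import "fmt"

// CapacityReservation is a trimmed-down, hypothetical stand-in for the
// status type, keeping only the fields needed for the filter.
type CapacityReservation struct {
	ID                     string
	AvailabilityZone       string
	AvailableInstanceCount int
}

// filter mirrors the shape of lo.Filter: it keeps the items for which
// keep returns true.
func filter[T any](items []T, keep func(item T, index int) bool) []T {
	out := make([]T, 0, len(items))
	for i, item := range items {
		if keep(item, i) {
			out = append(out, item)
		}
	}
	return out
}

// availableInZones keeps reservations that still have free capacity and
// sit in one of the requested zones.
func availableInZones(crs []CapacityReservation, zones map[string]bool) []CapacityReservation {
	return filter(crs, func(cr CapacityReservation, _ int) bool {
		return cr.AvailableInstanceCount > 0 && zones[cr.AvailabilityZone]
	})
}

func main() {
	crs := []CapacityReservation{
		{ID: "cr-1", AvailabilityZone: "us-west-2a", AvailableInstanceCount: 2},
		{ID: "cr-2", AvailabilityZone: "us-west-2a", AvailableInstanceCount: 0},
		{ID: "cr-3", AvailabilityZone: "us-west-2b", AvailableInstanceCount: 1},
	}
	for _, cr := range availableInZones(crs, map[string]bool{"us-west-2a": true}) {
		fmt.Println(cr.ID)
	}
}
```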

jonathan-innis · 2024-06-04T07:48:25Z

pkg/providers/amifamily/resolver.go

+				return nil, cloudprovider.NewInsufficientCapacityError(fmt.Errorf("trying to resolve capacity-reservation but no available capacity reservations available"))
+			}
+		}
+
 		for params, instanceTypes := range paramsToInstanceTypes {


When we group paramsToInstanceTypes above should we also group these with the capacity reservations in mind? This would allow us to keep the same logic on L178 where we iterate through the params and instance types, and I think this may work better as well since there should only be specific instance types that are valid for a given capacity reservation

jonathan-innis · 2024-06-04T07:50:31Z

website/content/en/preview/getting-started/migrating-from-cas/scripts/step04-controller-iam.sh

@@ -71,6 +71,12 @@ cat << EOF > controller-policy.json
            "Resource": "arn:${AWS_PARTITION}:eks:${AWS_REGION}:${AWS_ACCOUNT_ID}:cluster/${CLUSTER_NAME}",
            "Sid": "EKSClusterEndpointLookup"
        },
+        {
+            "Effect": "Allow",
+            "Action": "eks:DescribeCapacityReservations",


Suggested change

"Action": "eks:DescribeCapacityReservations",

"Action": "ec2:DescribeCapacityReservations",

jonathan-innis · 2024-06-04T08:11:25Z

pkg/errors/errors.go

@@ -48,6 +48,7 @@ var (
 		"UnfulfillableCapacity",
 		"Unsupported",
 		"InsufficientFreeAddressesInSubnet",
+		"ReservationCapacityExceeded",


When we return this type of ICE error, should we short-circuit and update the capacity reservation that we launched with in-place, so we don't have to wait for another iteration of the capacity reservation polling to update the instance availability?

If we didn't want to directly update this to 0, we could also use this as a trigger to re-call DescribeCapacityReservations since we know that something has changed since we made the launch decision originally

I think setting it to zero at this point would be OK. It will eventually refetch anyway, but this would let other evaluations happening at the same time see the change immediately too.

Yep, I guess it all sort-of comes out in the wash. I'm a little skeptical that we can model this with the existing ICE cache on its own. Minimally, we need to update the ICE cache to reflect the fact that the ICE is only for that particular capacity reservation. Without that, we'd either not try the OD instance in that AZ at all OR we wouldn't try other capacity reservations assigned to that instance type in that zone. Either way, we aren't really reacting correctly to what the CreateFleet call is telling us here.
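One way to picture the concern is an unavailability cache whose key includes the capacity reservation ID. This is a simplified sketch under stated assumptions: `offeringKey` and `unavailableOfferings` are hypothetical names, there is no TTL or locking, and it is not Karpenter's existing ICE cache API.

```go
package main

import "fmt"

// offeringKey identifies one offering. reservationID is empty for a plain
// on-demand offering.
type offeringKey struct {
	instanceType  string
	zone          string
	capacityType  string
	reservationID string
}

// unavailableOfferings is a hypothetical, simplified stand-in for an ICE
// cache (no expiry, not safe for concurrent use).
type unavailableOfferings map[offeringKey]struct{}

// MarkUnavailable records that a launch against this exact offering failed.
func (u unavailableOfferings) MarkUnavailable(k offeringKey) { u[k] = struct{}{} }

// IsUnavailable reports whether this exact offering was marked unavailable.
func (u unavailableOfferings) IsUnavailable(k offeringKey) bool {
	_, ok := u[k]
	return ok
}

func main() {
	cache := unavailableOfferings{}
	exhausted := offeringKey{"m5.large", "us-west-2a", "capacity-reservation", "cr-12345678"}
	other := offeringKey{"m5.large", "us-west-2a", "capacity-reservation", "cr-87654321"}
	onDemand := offeringKey{"m5.large", "us-west-2a", "on-demand", ""}

	// A ReservationCapacityExceeded error only rules out the reservation
	// that was actually targeted.
	cache.MarkUnavailable(exhausted)

	fmt.Println(cache.IsUnavailable(exhausted), cache.IsUnavailable(other), cache.IsUnavailable(onDemand))
	// true false false
}
```

Because the reservation ID is part of the key, the on-demand offering and other reservations for the same instance type and zone remain eligible after the failure.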

jonathan-innis · 2024-06-20T01:02:01Z

pkg/providers/amifamily/resolver.go

+
+		zones := scheduling.NewNodeSelectorRequirementsWithMinValues(nodeClaim.Spec.Requirements...).Get(v1.LabelTopologyZone)
+		capacityReservations := []v1beta1.CapacityReservation{}
+		if capacityType == "capacity-reservation" {


the benefit of this is that it becomes visible in nodeclaims that something is happening

It's more that we made the decision to launch it, so we should probably follow it through. I get that you are trying to be a little smarter, since we know the launch most likely won't succeed, but in general I'd like to avoid introducing too much modality into the code around specific logic like this.

I'm also thinking about this more and realizing that if we don't find an available capacity reservation for this launch earlier in our Create call (there's a filterInstanceTypes that should look for available offerings), then I'd expect us to return an ICE there. If we have already gotten past that point in the code, it seems like we are safe to proceed with the launch. We may still need to see exactly which instance type offerings have availability with ODCR when we are creating the launch templates, but I would expect this to come through GetInstanceTypes() rather than through the EC2NodeClass status.

jonathan-innis · 2024-07-02T19:31:11Z

pkg/apis/v1/ec2nodeclass.go

+	//       Reservation. This ensures that only permitted instances can use
+	//       the reserved capacity.
+	// +optional
+	Type string `json:"type,omitempty"`


Should this be an enum, validated through kubebuilder OpenAPI so that it can only be one of the valid values? See https://github.com/kubernetes-sigs/karpenter/blob/main/pkg/apis/v1beta1/nodepool.go#L76 for an example of this.
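A minimal sketch of what the enum suggestion could look like, assuming the two values `open` and `targeted` described in the PR description. The struct is trimmed down for illustration, and `validType` is a hypothetical runtime mirror of the check that the generated CRD schema would perform; it is not part of this PR.

```go
package main

import "fmt"

// CapacityReservationSelectorTerm is a trimmed-down sketch of the selector term.
type CapacityReservationSelectorTerm struct {
	// Type restricts matches to open or targeted Capacity Reservations.
	// +kubebuilder:validation:Enum:={open,targeted}
	// +optional
	Type string `json:"type,omitempty"`
}

// validType mirrors the enum check at runtime. The empty string is accepted
// because the field is optional and omitted from JSON when unset.
func validType(t string) bool {
	switch t {
	case "", "open", "targeted":
		return true
	}
	return false
}

func main() {
	for _, term := range []CapacityReservationSelectorTerm{
		{Type: "open"}, {Type: "targeted"}, {}, {Type: "shared"},
	} {
		fmt.Printf("%q valid=%v\n", term.Type, validType(term.Type))
	}
}
```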

jonathan-innis · 2024-07-02T19:32:22Z

pkg/apis/v1/ec2nodeclass_status.go

+	//    * limited - The Capacity Reservation expires automatically at a specified
+	//    date and time.
+	// +required
+	EndDateType string `json:"endDateType"`


Can this just be implied from the presence or lack of the end date time?
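The question above could be answered with a small derived helper. This is a sketch only: `endDateType` and `mustParse` are hypothetical names, and the `limited`/`unlimited` values are taken from the field comments in the PR description.

```go
package main

import (
	"fmt"
	"time"
)

// endDateType derives the end type from the presence of an end date:
// no end date means the reservation stays active until cancelled.
func endDateType(endDate *time.Time) string {
	if endDate == nil {
		return "unlimited"
	}
	return "limited"
}

// mustParse is an illustrative helper that parses an RFC 3339 timestamp
// and panics on malformed input.
func mustParse(s string) *time.Time {
	t, err := time.Parse(time.RFC3339, s)
	if err != nil {
		panic(err)
	}
	return &t
}

func main() {
	fmt.Println(endDateType(nil))                               // unlimited
	fmt.Println(endDateType(mustParse("2024-12-31T00:00:00Z"))) // limited
}
```

With a helper like this, the status would only need to carry the optional end date.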

jonathan-innis · 2024-07-02T19:32:33Z

pkg/apis/v1/ec2nodeclass_status.go

+	// launch instances into it. The Capacity Reservation's state changes to expired
+	// when it reaches its end date and time.
+	// +optional
+	EndDate *string `json:"endDate,omitempty"`


Suggested change

EndDate *string `json:"endDate,omitempty"`

EndDate *time.Time `json:"endDate,omitempty"`

Can this map to the actual Go type?

jonathan-innis · 2024-07-08T23:39:34Z

pkg/providers/instancetype/instancetype.go

+			offering.Requirements.Add(scheduling.NewRequirement(v1beta1.LabelTopologyZoneID, v1.NodeSelectorOpIn, subnet.ZoneID))
+		}
+		// TODO: tvonhacht
+		offering.Requirements.Add(scheduling.NewRequirement(v1beta1.LabelCapactiyReservationID, v1.NodeSelectorOpIn, capacityReservation.ID))


How do we make sure that we don't create duplicate offerings for multiple ODCRs that have the same zone and instance type?

jonathan-innis · 2024-07-08T23:42:33Z

pkg/providers/instancetype/instancetype.go

+		offering.Requirements.Add(scheduling.NewRequirement(v1beta1.LabelCapactiyReservationID, v1.NodeSelectorOpIn, capacityReservation.ID))
+		log.FromContext(ctx).WithValues("instanceType", *instanceType.InstanceType, "capacityReservation", capacityReservation.ID, "offering", offering).V(0).Info("offering for capacity reservation and instanceType")
+		offerings = append(offerings, offering)
+		instanceTypeOfferingAvailable.With(prometheus.Labels{


I wonder if we should punt on capacity reservation metrics for instance type availability/price right now. This gets trickier because there is no longer a concept of "global" availability and price: availability is now dictated on a per-NodePool basis, since the EC2NodeClass dictates the offerings that are surfaced for each one. Maybe we need to add a NodePool dimension to make this more accurate, but that would increase the cardinality even more than we already have (and this is a pretty high-cardinality metric to begin with).

jonathan-innis · 2024-07-08T23:43:03Z

pkg/providers/instancetype/instancetype.go

+			capacityTypeLabel: capacityType,
+			zoneLabel:         zone,
+		}).Set(float64(lo.Ternary(available, 1, 0)))
+		instanceTypeOfferingPriceEstimate.With(prometheus.Labels{


Same comment here. Maybe worth punting on the metric since it's going to make things more challenging to reason about

jonathan-innis · 2024-07-08T23:43:35Z

pkg/providers/instancetype/metrics.go

@@ -64,6 +64,19 @@ var (
 			zoneLabel,
 		},
 	)
+	instanceTypeOfferingAvailableCapacityReservation = prometheus.NewGaugeVec(


I wonder if we punt on this too. Ideally, we can just model this through our standard offering availability.

jonathan-innis · 2024-07-08T23:45:16Z

pkg/providers/launchtemplate/launchtemplate.go

@@ -58,43 +58,58 @@ type Provider interface {
 	DeleteAll(context.Context, *v1beta1.EC2NodeClass) error
 	InvalidateCache(context.Context, string, string)
 	ResolveClusterCIDR(context.Context) error
+	GetCapacityReservationID(launchTemplateName string) *string


I wonder if, rather than relying on a launch template lookup, we can just know the map of instance type/AZ -> capacity reservation ID during launch, and then know that if a given instance type/AZ was chosen, it belongs to a particular capacity reservation ID. This might reduce cross-provider lookups and additional dependencies.
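The suggested map could be sketched as follows. All names here (`launchKey`, `reservationIndex`, the trimmed `CapacityReservation` struct) are hypothetical, and the rule of keeping the first reservation with free capacity for each instance type and zone is an assumption made for illustration; the PR does not settle how multiple reservations for the same pair should be handled.

```go
package main

import "fmt"

// CapacityReservation is a trimmed-down, hypothetical stand-in for the status type.
type CapacityReservation struct {
	ID                     string
	InstanceType           string
	AvailabilityZone       string
	AvailableInstanceCount int
}

// launchKey is the (instance type, zone) pair chosen at launch time.
type launchKey struct {
	instanceType string
	zone         string
}

// reservationIndex maps each (instance type, zone) pair to one reservation ID.
// Assumption: the first reservation with free capacity wins for a given pair.
func reservationIndex(crs []CapacityReservation) map[launchKey]string {
	idx := make(map[launchKey]string, len(crs))
	for _, cr := range crs {
		if cr.AvailableInstanceCount <= 0 {
			continue
		}
		k := launchKey{cr.InstanceType, cr.AvailabilityZone}
		if _, exists := idx[k]; !exists {
			idx[k] = cr.ID
		}
	}
	return idx
}

func main() {
	idx := reservationIndex([]CapacityReservation{
		{ID: "cr-1", InstanceType: "m5.large", AvailabilityZone: "us-west-2a", AvailableInstanceCount: 0},
		{ID: "cr-2", InstanceType: "m5.large", AvailabilityZone: "us-west-2a", AvailableInstanceCount: 3},
		{ID: "cr-3", InstanceType: "c5.xlarge", AvailabilityZone: "us-west-2b", AvailableInstanceCount: 1},
	})
	// After a launch picks an instance type and zone, the ID is a map lookup.
	id, ok := idx[launchKey{"m5.large", "us-west-2a"}]
	fmt.Println(id, ok)
}
```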

github-actions · 2024-07-23T12:05:19Z

This PR has been inactive for 14 days. StaleBot will close this stale PR after 14 more days of inactivity.

tvonhacht-apple requested a review from a team as a code owner May 14, 2024 05:57

tvonhacht-apple requested a review from jonathan-innis May 14, 2024 05:57

tvonhacht-apple force-pushed the feature/odcr branch 3 times, most recently from 5c02862 to 6867ca8 Compare May 16, 2024 16:50

tvonhacht-apple mentioned this pull request May 17, 2024

feat: RFC Implementation Supporting AWS On-Demand Capacity Reservations kubernetes-sigs/karpenter#1263

Closed

tvonhacht-apple force-pushed the feature/odcr branch 2 times, most recently from acd61c4 to 9cdb574 Compare May 20, 2024 20:55

tvonhacht-apple force-pushed the feature/odcr branch 2 times, most recently from 96951d1 to 5471bc4 Compare May 21, 2024 05:49

jonathan-innis reviewed Jun 4, 2024

jonathan-innis self-assigned this Jun 7, 2024

tvonhacht-apple force-pushed the feature/odcr branch 2 times, most recently from b7e2828 to 0f3ab47 Compare June 18, 2024 20:29

tvonhacht-apple force-pushed the feature/odcr branch 8 times, most recently from b8512cf to 1089159 Compare July 3, 2024 20:46

jonathan-innis reviewed Jul 8, 2024

github-actions bot added the lifecycle/stale label Jul 23, 2024

github-actions bot added the lifecycle/closed label Aug 6, 2024

github-actions bot closed this Aug 6, 2024

jonathan-innis reopened this Aug 9, 2024

jonathan-innis added needs-review PRs that are still going through the review process and removed lifecycle/stale lifecycle/closed labels Aug 9, 2024

tvonhacht-apple force-pushed the feature/odcr branch 4 times, most recently from 8a3d6bf to 8b2eb11 Compare October 1, 2024 17:22

RFC Implementation Supporting ODCR

28e98c0

tvonhacht-apple force-pushed the feature/odcr branch from 8b2eb11 to 28e98c0 Compare October 1, 2024 18:48

jonathan-innis assigned njtran and jmdeal and unassigned jonathan-innis Dec 13, 2024
