Adaptive schedule strategy for UnitedDeployment #1720
base: master
Conversation
[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing `/approve` in a comment.
… adapter
1. adapter.go: GetReplicaDetails returns the pods in the subset
2. xxx_adapter.go: return-pods implementation ⬆️
3. allocator.go: safeReplica handling
4. pod_condition_utils.go: extract the PodUnscheduledTimeout function from workloadspread
5. reschedule.go: PodUnscheduledTimeout function extracted
6. subset.go: add some fields to the Subset object to carry related information
7. subset_control.go: store subset pods in the Subset object
8. uniteddeployment_controller.go:
   1. add a requeue feature to check failed pods
   2. subset unschedulable status management
9. uniteddeployment_types.go: API change
10. uniteddeployment_update.go: sync unschedulable status to the CR

Signed-off-by: AiRanthem <[email protected]>
Signed-off-by: AiRanthem <[email protected]>
numSubset := len(ac.Spec.Topology.Subsets)
minReplicasMap := make(map[string]int32, numSubset)
maxReplicasMap := make(map[string]int32, numSubset)
notPendingReplicasMap := getSubsetRunningReplicas(nameToSubset)
Please change the local variable names and log keys, e.g. notPendingReplicasMap, notPendingReplicas, to runningReplicas...
if controllerutil.AddFinalizer(instance, UnitedDeploymentFinalizer) {
	klog.InfoS("adding UnitedDeploymentFinalizer")
Add the UnitedDeployment name to this log line.
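For example (illustrative; klog.KObj is one convenient way to attach the object reference):

```go
if controllerutil.AddFinalizer(instance, UnitedDeploymentFinalizer) {
	// Identify which UnitedDeployment the finalizer is being added to.
	klog.InfoS("adding UnitedDeploymentFinalizer", "unitedDeployment", klog.KObj(instance))
}
```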
// to avoid memory leak
klog.InfoS("cleaning up UnitedDeployment", "unitedDeployment", request)
ResourceVersionExpectation.Delete(instance)
if err = r.updateUnitedDeploymentInstance(instance); err != nil {
If the only clean-up work is deleting the expectation, it is not necessary to add a finalizer. We can just delete the expectation in the event handler; that saves the work of adding and removing the finalizer.
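A rough sketch of that alternative, assuming controller-runtime v0.15+ handler signatures; the handler wiring and enqueue call are illustrative, only ResourceVersionExpectation and the UnitedDeployment type are from this PR:

```go
import (
	"context"

	"k8s.io/apimachinery/pkg/types"
	"k8s.io/client-go/util/workqueue"
	"sigs.k8s.io/controller-runtime/pkg/event"
	"sigs.k8s.io/controller-runtime/pkg/handler"
	"sigs.k8s.io/controller-runtime/pkg/reconcile"

	appsv1alpha1 "github.com/openkruise/kruise/apis/apps/v1alpha1"
)

// udEventHandler drops the expectation when the UnitedDeployment is deleted,
// so no finalizer round-trip is needed just for this clean-up.
var udEventHandler = handler.Funcs{
	DeleteFunc: func(ctx context.Context, e event.DeleteEvent, q workqueue.RateLimitingInterface) {
		if ud, ok := e.Object.(*appsv1alpha1.UnitedDeployment); ok {
			ResourceVersionExpectation.Delete(ud)
		}
		q.Add(reconcile.Request{NamespacedName: types.NamespacedName{
			Namespace: e.Object.GetNamespace(),
			Name:      e.Object.GetName(),
		}})
	},
}
```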
}
// make sure latest version is observed
ResourceVersionExpectation.Observe(instance)
We can observe the instance in the update handler of the event handler instead.
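Sketch, continuing the handler.Funcs example above (shape illustrative):

```go
UpdateFunc: func(ctx context.Context, e event.UpdateEvent, q workqueue.RateLimitingInterface) {
	if ud, ok := e.ObjectNew.(*appsv1alpha1.UnitedDeployment); ok {
		// Record the latest observed resourceVersion here instead of at the top of Reconcile.
		ResourceVersionExpectation.Observe(ud)
	}
	// ... enqueue the request as before ...
},
```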
@@ -124,3 +125,95 @@ func TestReconcile(t *testing.T) {
	defer c.Delete(context.TODO(), instance)
	g.Eventually(requests, timeout).Should(gomega.Receive(gomega.Equal(expectedRequest)))
}
func TestUnschedulableStatusManagement(t *testing.T) { |
Please rewrite the unit test using subtests and without gomega.
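A possible shape for that, as a self-contained sketch; countPendingPods below is only a stand-in for the real status-management logic under test:

```go
package uniteddeployment

import (
	"testing"

	corev1 "k8s.io/api/core/v1"
)

// countPendingPods is a placeholder for the logic exercised by the real test.
func countPendingPods(pods []*corev1.Pod) int32 {
	var n int32
	for _, p := range pods {
		if p.Status.Phase == corev1.PodPending {
			n++
		}
	}
	return n
}

func TestUnschedulableStatusManagement(t *testing.T) {
	cases := []struct {
		name        string
		phases      []corev1.PodPhase
		wantPending int32
	}{
		{name: "no pending pods", phases: []corev1.PodPhase{corev1.PodRunning}, wantPending: 0},
		{name: "one pending pod", phases: []corev1.PodPhase{corev1.PodRunning, corev1.PodPending}, wantPending: 1},
	}
	for _, tc := range cases {
		t.Run(tc.name, func(t *testing.T) {
			var pods []*corev1.Pod
			for _, phase := range tc.phases {
				pods = append(pods, &corev1.Pod{Status: corev1.PodStatus{Phase: phase}})
			}
			if got := countPendingPods(pods); got != tc.wantPending {
				t.Errorf("pending pods = %d, want %d", got, tc.wantPending)
			}
		})
	}
}
```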
if err != nil {
	r.recorder.Event(instance.DeepCopy(), corev1.EventTypeWarning, fmt.Sprintf("Failed%s", eventTypeDupSubsetsDelete), err.Error())
	return nil, fmt.Errorf("fail to manage duplicate Subset of UnitedDeployment %s/%s: %s", instance.Namespace, instance.Name, err)
}
// If the Fixed scheduling strategy is used, the unschedulable state for all subsets remains false and
// the UnschedulableStatus of Subsets is not managed.
if instance.Spec.Topology.ScheduleStrategy.IsAdaptive() {
Consider moving this code block to Reconcile. getNameToSubset seems to be a simple function that returns the Subset struct for a given name; it should not contain complex subset management logic.
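For instance, something along these lines in Reconcile (argument lists are illustrative; only getNameToSubset, IsAdaptive and manageUnschedulableStatusForExistingSubset are names taken from this PR):

```go
// getNameToSubset stays a plain lookup; the adaptive bookkeeping moves up into Reconcile.
nameToSubset, err := r.getNameToSubset(instance, control, expectedRevision)
if err != nil {
	return reconcile.Result{}, err
}
if instance.Spec.Topology.ScheduleStrategy.IsAdaptive() {
	r.manageUnschedulableStatusForExistingSubset(instance, nameToSubset)
}
```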
	subset.Status.UnschedulableStatus.PendingPods++
}
if checkAfter > 0 {
	durationStore.Push(unitedDeploymentKey, checkAfter)
Would it be better to enqueue only the pod that will be the earliest to time out?
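A minimal sketch of that idea; the helper and the use of CreationTimestamp are hypothetical, only durationStore.Push and unitedDeploymentKey come from the diff. The call site would then be durationStore.Push(unitedDeploymentKey, earliestTimeout(pendingPods, timeout, time.Now())).

```go
import (
	"time"

	corev1 "k8s.io/api/core/v1"
)

// earliestTimeout returns the shortest remaining time before any still-pending pod
// exceeds the unschedulable timeout, or 0 if no pod is still inside the window.
func earliestTimeout(pods []*corev1.Pod, timeout time.Duration, now time.Time) time.Duration {
	var earliest time.Duration
	for _, pod := range pods {
		remaining := timeout - now.Sub(pod.CreationTimestamp.Time)
		if remaining <= 0 {
			continue // already timed out; handled in the current reconcile
		}
		if earliest == 0 || remaining < earliest {
			earliest = remaining
		}
	}
	return earliest
}
```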
@@ -79,6 +80,17 @@ func (r *ReconcileUnitedDeployment) manageSubsets(ud *appsv1alpha1.UnitedDeploym
	if updateErr == nil {
		SetUnitedDeploymentCondition(newStatus, NewUnitedDeploymentCondition(appsv1alpha1.SubsetUpdated, corev1.ConditionTrue, "", ""))
	} else {
// If using an Adaptive scheduling strategy, when the subset is scaled out leading to the creation of new Pods,
// future potential scheduling failures need to be checked for rescheduling.
var newPodCreated = false
Is it really necessary to trigger the requeue here, given that manageUnschedulableStatusForExistingSubset has already pushed a duration?
@@ -132,6 +144,11 @@ func (r *ReconcileUnitedDeployment) manageSubsetProvision(ud *appsv1alpha1.Unite
	return nil
})
if createdErr == nil {
	// When a new subset is created, regardless of whether it contains newly created Pods,
Is it really necessary to trigger the requeue here, given that manageUnschedulableStatusForExistingSubset has already pushed a duration?
@@ -348,10 +449,14 @@ func (r *ReconcileUnitedDeployment) classifySubsetBySubsetName(ud *appsv1alpha1.
	return mapping
}

-func (r *ReconcileUnitedDeployment) updateStatus(instance *appsv1alpha1.UnitedDeployment, newStatus, oldStatus *appsv1alpha1.UnitedDeploymentStatus, nameToSubset *map[string]*Subset, nextReplicas, nextPartition *map[string]int32, currentRevision, updatedRevision *appsv1.ControllerRevision, collisionCount int32, control ControlInterface) (reconcile.Result, error) {
+func (r *ReconcileUnitedDeployment) updateStatus(instance *appsv1alpha1.UnitedDeployment, newStatus, oldStatus *appsv1alpha1.UnitedDeploymentStatus, nameToSubset *map[string]*Subset, nextReplicas, nextPartition *map[string]int32, currentRevision, updatedRevision *appsv1.ControllerRevision, collisionCount int32, control ControlInterface) error {
There are too many parameters in updateStatus. Consider moving calculateStatus to Reconcile, and making updateStatus accept only the instance plus the new and old status.
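A sketch of that shape; calculateStatus's parameter list, the comparison via apiequality (k8s.io/apimachinery/pkg/api/equality), and the assumption that the reconciler embeds client.Client are illustrative, only the function names come from this code:

```go
// Inside Reconcile, after the subsets and revisions have been resolved:
newStatus := r.calculateStatus(instance, nameToSubset, nextReplicas, nextPartition,
	currentRevision, updatedRevision, collisionCount, control)
if err := r.updateStatus(instance, newStatus, oldStatus); err != nil {
	return reconcile.Result{}, err
}
```

```go
// updateStatus only persists the already-computed status.
func (r *ReconcileUnitedDeployment) updateStatus(instance *appsv1alpha1.UnitedDeployment, newStatus, oldStatus *appsv1alpha1.UnitedDeploymentStatus) error {
	if apiequality.Semantic.DeepEqual(newStatus, oldStatus) {
		return nil // nothing changed; skip the API call
	}
	instance.Status = *newStatus
	return r.Status().Update(context.TODO(), instance)
}
```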
Ⅰ. Describe what this PR does
Added an adaptive scheduling strategy to UnitedDeployment. During scale-up, if some Pods in a subset turn out to be unschedulable, those Pods are rescheduled to other subsets. During scale-down, if elastic allocation is used (i.e., the subset is configured with min/max replicas), each subset retains its ready Pods as far as possible without exceeding its maximum capacity, rather than strictly scaling down in reverse order of the subset list.
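To make the scale-down behavior concrete, here is a simplified sketch of the elastic allocation idea; this is not the PR's actual allocator.go logic, and the names and tie-breaking rules are illustrative:

```go
// allocate distributes total replicas over subsets in order: each subset first keeps
// its ready Pods (capped by its max), and only the leftover replicas follow subset order.
func allocate(total int32, order []string, ready, max map[string]int32) map[string]int32 {
	result := make(map[string]int32, len(order))
	remaining := total

	// First pass: retain ready Pods wherever they already are, up to each subset's max.
	for _, name := range order {
		keep := min32(min32(ready[name], max[name]), remaining)
		result[name] = keep
		remaining -= keep
	}
	// Second pass: assign whatever is left in subset order, still respecting max.
	for _, name := range order {
		extra := min32(max[name]-result[name], remaining)
		result[name] += extra
		remaining -= extra
	}
	return result
}

func min32(a, b int32) int32 {
	if a < b {
		return a
	}
	return b
}
```

In contrast, the fixed strategy would simply fill subsets in order and scale down in reverse order of the subset list, regardless of where the ready Pods are.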
Ⅱ. Does this pull request fix one issue?
fixes #1673
Ⅲ. Describe how to verify it
Use the yaml below to create a UD with subset-b unschedulable.
subset-c -> subset-b -> subset-a
Ⅳ. Special notes for reviews