Investigate Node Affinity Scheduling #179
Comments
/assign |
I have been researching this alongside my normal day-to-day work. I should have some ideas to discuss shortly. |
There are discussions in SIG Cloud Provider concerning affinity and anti-affinity scheduling as it pertains to the underlying infrastructure. Going to table this for now and see if we can provide input to that effort before taking on this feature/functionality. |
/lifecycle frozen |
Going to start taking a look at this again. We are revisiting since the consensus from SIG Cloud Provider was to do it outside of core k8s. Should still target a post-1.0 release. |
I have been working on this proposal since the beginning of the year (attached). I am working on putting together a patch for the same. |
@sujeet-banerjee I read through the doc, and it looks like it is written in relation to Cluster API. Maybe I am missing something... There definitely needs to be an understanding of VMs and which physical hosts they are on, but the issue is that the scheduler doesn't know about the backing infrastructure when it comes time to schedule pods on those worker nodes. The VMs themselves can move around within the cluster because of DRS, node failures, and so on. It is also about more than compute; this also concerns fault domains on storage such as vSAN.

As an example, assume the VMs are distributed in an ideal configuration per your doc. If you target a StatefulSet at a certain region/zone, it is possible that all of its pods land on different (or even the same) VMs that nevertheless sit in the same vSAN fault domain. If that particular fault domain dies, you lose all your data. This is one of the problems this issue is intended to address.

The doc takes a very Cluster-API-centric view, looking from the infrastructure upwards, but the proposed component needs to look at workload placement from the pod view downwards. The proposal may make pod scheduling easier by having VMs sit on hosts in an ideal fashion, but at the end of the day we are still talking about pod scheduling and workload placement within those VMs. |
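To make the fault-domain failure mode above concrete, here is a minimal sketch (not from the original discussion) of how a workload could ask to be spread across fault domains once nodes carry a suitable topology label. The label key `example.vmware.com/vsan-fault-domain` is an invented placeholder; keeping such a label truthful while DRS moves VMs around is exactly the gap this issue describes.

```go
package main

import (
	"fmt"

	corev1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"sigs.k8s.io/yaml"
)

func main() {
	// Hypothetical topology label; some vSphere-aware component would have to
	// keep it in sync with the fault domain of the node's backing VM.
	const faultDomainKey = "example.vmware.com/vsan-fault-domain"

	// Pod template of the kind you would embed in a StatefulSet.
	tmpl := corev1.PodTemplateSpec{
		ObjectMeta: metav1.ObjectMeta{Labels: map[string]string{"app": "db"}},
		Spec: corev1.PodSpec{
			Affinity: &corev1.Affinity{
				PodAntiAffinity: &corev1.PodAntiAffinity{
					// Require that no two "app=db" pods are scheduled onto nodes
					// that share the same fault-domain label value.
					RequiredDuringSchedulingIgnoredDuringExecution: []corev1.PodAffinityTerm{{
						LabelSelector: &metav1.LabelSelector{
							MatchLabels: map[string]string{"app": "db"},
						},
						TopologyKey: faultDomainKey,
					}},
				},
			},
			Containers: []corev1.Container{{
				Name:  "db",
				Image: "example/db:latest",
			}},
		},
	}

	out, err := yaml.Marshal(tmpl)
	if err != nil {
		panic(err)
	}
	fmt.Println(string(out))
}
```

The anti-affinity rule itself is standard Kubernetes; the part that does not exist today is whatever keeps the node label aligned with the actual vSAN fault domain, which is why the scheduler alone cannot solve this. |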
Hi folks, is there any update here? |
Is this a BUG REPORT or FEATURE REQUEST?:
/kind feature
What happened:
Based in part on the work done here:
https://github.com/vmware/vsphere-affinity-scheduling-plugin
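As a hedged illustration of the general approach (this is not the linked plugin's actual implementation), the sketch below assumes a component running with in-cluster credentials that stamps each node with a label describing the ESXi host backing its VM. `lookupHostForNode`, the label key, and the returned host name are all hypothetical placeholders for a real vCenter query (e.g. via govmomi).

```go
package main

import (
	"context"
	"fmt"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/apimachinery/pkg/types"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/rest"
)

// Hypothetical label key surfacing VM placement to the scheduler.
const hostLabelKey = "example.vmware.com/esxi-host"

// lookupHostForNode is a placeholder; a real implementation would ask
// vCenter which ESXi host currently runs the VM backing this node.
func lookupHostForNode(nodeName string) (string, error) {
	return "esxi-host-01", nil
}

func main() {
	cfg, err := rest.InClusterConfig()
	if err != nil {
		panic(err)
	}
	client, err := kubernetes.NewForConfig(cfg)
	if err != nil {
		panic(err)
	}

	ctx := context.Background()
	nodes, err := client.CoreV1().Nodes().List(ctx, metav1.ListOptions{})
	if err != nil {
		panic(err)
	}

	for _, node := range nodes.Items {
		host, err := lookupHostForNode(node.Name)
		if err != nil {
			panic(err)
		}
		// Merge-patch only the label so concurrent updates to the rest of
		// the node object are not clobbered.
		patch := []byte(fmt.Sprintf(`{"metadata":{"labels":{%q:%q}}}`, hostLabelKey, host))
		if _, err := client.CoreV1().Nodes().Patch(ctx, node.Name,
			types.MergePatchType, patch, metav1.PatchOptions{}); err != nil {
			panic(err)
		}
	}
}
```

With host (or fault-domain) labels in place, existing mechanisms such as node affinity and pod anti-affinity could consume them; re-evaluating placement after DRS moves a VM is the harder part and is out of scope for this sketch.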