Introduce Job-agnostic API to declare the maximal execution time for a Job #3125
Comments
/assign
Maybe Kueue could set activeDeadlineSeconds on all the pod templates? NVM. I see that you want the limit for the entire job, so I don't think pod level is the right fit here.
Very much in favor. Specifying a deadline for a workload (possibly wall clock time instead of cumulative execution time) would also enable capacity planning and permit developing ordering strategies to, e.g., ensure large workloads have a chance to run without resorting to priorities and preemption. An alternative to workload deactivation could be to dynamically lower the priority of the workload, and hence let it run if there is excess capacity and only evict it if necessary.
Yeah, I was thinking that "cumulative execution time" measures wall time. I say it is cumulative because it would add up the running wall time from all admissions: say it is admitted (3min) -> suspended (whatever) -> admitted (3min); this would amount to 6min, whereas activeDeadlineSeconds accounts for only 3min (the time since the last admission). wdyt?
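To make the difference concrete, here is a minimal sketch of the two ways of counting described in this comment. The `Admission` type and both function names are hypothetical and exist only for illustration; they are not part of Kueue or the batch/v1 Job API.

```python
from dataclasses import dataclass


@dataclass
class Admission:
    """One admitted (running) interval, in seconds since an arbitrary origin."""
    start: float
    end: float


def cumulative_exec_seconds(admissions):
    """Sum the running wall time over every admission interval."""
    return sum(a.end - a.start for a in admissions)


def last_admission_seconds(admissions):
    """Running wall time of the most recent admission only."""
    if not admissions:
        return 0.0
    last = admissions[-1]
    return last.end - last.start


# admitted (3min) -> suspended (10min, not counted) -> admitted (3min)
history = [Admission(0, 180), Admission(780, 960)]
print(cumulative_exec_seconds(history))  # 360 seconds, i.e. 6min
print(last_admission_seconds(history))   # 180 seconds, i.e. 3min
```

The time spent suspended never enters either total; the two approaches differ only in whether earlier admissions are remembered.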
Yeah, that is the long-term plan. IIUC its counterpart ("-t" in slurm) is used for better scheduling.
Potentially, but it sounds complex (what function should decrease the priorities?). Also, some users don't like to use priorities, as they have an incentive to set them as high as possible.
I like this feature. Actually, users can easily violate fairness using the sleep inf command. My main question is: at what point does the calculation start? I guess that we need to add a dedicated field like startTime to the Workload object.
Additionally, I guess that we need to consider #2737.
Yeah, we need a new field for that, but I was rather thinking about keeping the accumulated time from the previous admissions (say Then compute exec time as:
I think with startTime we would not account for the time when the job is suspended. WDYT?
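One possible reading of the accumulated-time idea, as a rough sketch: keep a running total from finished admissions and add the time elapsed in the current admission, if any. Every name here (`ExecTimeTracker`, `accumulated_past_seconds`, `last_admitted_at`, and the methods) is an assumption made for illustration and is not a field or function of the Workload API.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ExecTimeTracker:
    """Hypothetical bookkeeping for cumulative execution time (not Kueue's API)."""
    max_exec_seconds: float
    accumulated_past_seconds: float = 0.0     # total from finished admissions
    last_admitted_at: Optional[float] = None  # None while the job is suspended

    def admit(self, now: float) -> None:
        """Record the start of a new admission."""
        self.last_admitted_at = now

    def evict(self, now: float) -> None:
        """Fold the admission that just ended into the accumulated total."""
        if self.last_admitted_at is not None:
            self.accumulated_past_seconds += now - self.last_admitted_at
            self.last_admitted_at = None

    def exec_seconds(self, now: float) -> float:
        """Accumulated past time plus time spent in the current admission."""
        if self.last_admitted_at is None:
            return self.accumulated_past_seconds
        return self.accumulated_past_seconds + (now - self.last_admitted_at)

    def exceeded(self, now: float) -> bool:
        """True once the cumulative execution time reaches the limit."""
        return self.exec_seconds(now) >= self.max_exec_seconds


tracker = ExecTimeTracker(max_exec_seconds=300)
tracker.admit(0)
tracker.evict(180)   # ran for 3min, then suspended
tracker.admit(780)   # resumed after 10min of suspension
print(tracker.exec_seconds(840))  # 240: 180 from before plus 60 since resuming
print(tracker.exceeded(900))      # True: 180 + 120 reaches the 300s limit
```

With only a single start timestamp that is reset on every resume, the first 180 seconds in this example would be forgotten, which is the concern raised in this exchange.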
I've opened #3133 as a KEP PR for this, please have a look and let's continue the discussion there.
I have played a bit with slurm's "-t" option, and it looks like it puts a limit on the cumulative wall clock across "suspend / resume". While we don't need to follow it exactly, it seems like a sensible inspiration.
I was assuming we would reset the startTime when the job is preempted or evicted (StopJob), similar to the batch/v1 Job integration.
What would you like to be added:
A job-agnostic API to set the maximal execution time for a Job.
There are some open questions:
kueue.x-k8s.io/max-exec-time-seconds)
Why is this needed:
Different Job CRDs have fields with similar semantics, but there is no standard. For example, batch/Job has spec.activeDeadlineSeconds, while JobSet does not have such an API for now.
We would like to have such an API to use it in the kueuectl command as an analog of slurm's "-t" option. Long term (but out of scope here) we could use this value to optimize Job scheduling.
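As a rough sketch of what "job-agnostic" could mean in practice, the limit can be read from object metadata rather than from a kind-specific spec field. The key `kueue.x-k8s.io/max-exec-time-seconds` is the one mentioned in the open questions; whether it ends up as a label or an annotation is not settled in this issue, so the hypothetical helper below checks both. It works on plain manifest dictionaries and does not use any Kueue or Kubernetes client API.

```python
MAX_EXEC_TIME_KEY = "kueue.x-k8s.io/max-exec-time-seconds"


def max_exec_time_seconds(obj):
    """Return the declared limit in seconds, or None if the object has none.

    `obj` is a manifest parsed into a dict. Only metadata is inspected, so the
    same code applies to a batch Job, a JobSet, or any other kind.
    """
    metadata = obj.get("metadata") or {}
    # Assumption: the key may live in labels or annotations (an open question).
    for field in ("labels", "annotations"):
        raw = (metadata.get(field) or {}).get(MAX_EXEC_TIME_KEY)
        if raw is not None:
            seconds = int(raw)  # raises ValueError for non-numeric values
            if seconds <= 0:
                raise ValueError(f"{MAX_EXEC_TIME_KEY} must be positive, got {raw!r}")
            return seconds
    return None


job = {"kind": "Job", "metadata": {"labels": {MAX_EXEC_TIME_KEY: "3600"}}}
jobset = {"kind": "JobSet", "metadata": {"annotations": {MAX_EXEC_TIME_KEY: "600"}}}
unlimited = {"kind": "Job", "metadata": {"name": "no-limit"}}

print(max_exec_time_seconds(job))        # 3600
print(max_exec_time_seconds(jobset))     # 600
print(max_exec_time_seconds(unlimited))  # None
```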
Completion requirements:
This enhancement requires the following artifacts:
The artifacts should be linked in subsequent comments.