Reformat tables. (#3048)
mbobrovskyi authored Sep 13, 2024
1 parent f87c51e commit cb18c75
Showing 3 changed files with 48 additions and 48 deletions.
22 changes: 11 additions & 11 deletions site/content/en/docs/adopters/index.md
@@ -16,16 +16,16 @@ If you are using Kueue, feel free to open a pull request to add your organizatio

## Adopters

| Organization | Type | Description | Integrations | Contact |
|:----------------------------------------------------:|:--------:|:----------------------:|:----------------------------------:|:----------------------------------------:|
| [CyberAgent, Inc.](https://www.cyberagent.co.jp/en/) | End User | On-premise ML Platform | batch/job </br> kubeflow.org/mpijob | [@tenzen-y](https://github.com/tenzen-y) |
| [DaoCloud, Inc.](https://www.daocloud.io/en/) | End User | Part of the AI Platform for managing all kinds of Jobs. | batch/job </br> RayJob </br> ... | [@kerthcet](https://github.com/kerthcet) |
| [WattIQ, Inc.](https://wattiq.io) | End User | SaaS/IoT product | batch/job </br> RayJob </br> | [@madsenwattiq](https://github.com/madsenwattiq) |
| [Horizon, Inc.](https://horizon.cc/) | End User | AI training platform | batch/job </br> ... | [@GhangZh](https://github.com/GhangZh) |
| [FAR AI](https://far.ai/) | End User | AI alignment research nonprofit | batch/job | [@rhaps0dy](https://github.com/rhaps0dy) |
| [Shopee, Inc.](https://shopee.com/) | End User | Training/batch inference/data processes in AI platform test env | Customized job </br> RayJob </br> ... | [@denkensk](https://github.com/denkensk) |
| [Mondoo, Inc.](https://mondoo.com) | End User | Helps power Mondoo's hosted security scanner | batch/job | [@jaym](https://github.com/jaym) |
| [Google Cloud](https://cloud.google.com/) | Provider | Part of [kit for training ML workloads on TPUs][gcmldemo] | JobSet | [@mrozacki](https://github.com/mrozacki) |
| [Onna Technologies, Inc](https://onna.com) | End User | Unstructured Data Management Platform | batch/job </br> | [@gitcarbs](https://github.com/gitcarbs) |
| Organization | Type | Description | Integrations | Contact |
|:-------------------------------------------------------:|:--------:|:---------------------------------------------------------------:|:-------------------------------------:|:------------------------------------------------:|
| [CyberAgent, Inc.](https://www.cyberagent.co.jp/en/) | End User | On-premise ML Platform | batch/job </br> kubeflow.org/mpijob | [@tenzen-y](https://github.com/tenzen-y) |
| [DaoCloud, Inc.](https://www.daocloud.io/en/) | End User | Part of the AI Platform for managing all kinds of Jobs. | batch/job </br> RayJob </br> ... | [@kerthcet](https://github.com/kerthcet) |
| [WattIQ, Inc.](https://wattiq.io) | End User | SaaS/IoT product | batch/job </br> RayJob </br> | [@madsenwattiq](https://github.com/madsenwattiq) |
| [Horizon, Inc.](https://horizon.cc/) | End User | AI training platform | batch/job </br> ... | [@GhangZh](https://github.com/GhangZh) |
| [FAR AI](https://far.ai/) | End User | AI alignment research nonprofit | batch/job | [@rhaps0dy](https://github.com/rhaps0dy) |
| [Shopee, Inc.](https://shopee.com/) | End User | Training/batch inference/data processes in AI platform test env | Customized job </br> RayJob </br> ... | [@denkensk](https://github.com/denkensk) |
| [Mondoo, Inc.](https://mondoo.com) | End User | Helps power Mondoo's hosted security scanner | batch/job | [@jaym](https://github.com/jaym) |
| [Google Cloud](https://cloud.google.com/) | Provider | Part of [kit for training ML workloads on TPUs][gcmldemo] | JobSet | [@mrozacki](https://github.com/mrozacki) |
| [Onna Technologies, Inc](https://onna.com) | End User | Unstructured Data Management Platform | batch/job </br> | [@gitcarbs](https://github.com/gitcarbs) |

[gcmldemo]: https://cloud.google.com/blog/products/compute/the-worlds-largest-distributed-llm-training-job-on-tpu-v5e
32 changes: 16 additions & 16 deletions site/content/en/docs/installation/_index.md
@@ -243,22 +243,22 @@ spec:

The currently supported features are:

| Feature | Default | Stage | Since | Until |
|---------|---------|-------|-------|-------|
| `FlavorFungibility` | `true` | Beta | 0.5 | |
| `MultiKueue` | `false` | Alpha | 0.6 | |
| `MultiKueueBatchJobWithManagedBy` | `false` | Alpha | 0.8 | |
| `PartialAdmission` | `false` | Alpha | 0.4 | 0.4 |
| `PartialAdmission` | `true` | Beta | 0.5 | |
| `ProvisioningACC` | `false` | Alpha | 0.5 | 0.6 |
| `ProvisioningACC` | `true` | Beta | 0.7 | |
| `QueueVisibility` | `false` | Alpha | 0.5 | |
| `VisibilityOnDemand` | `false` | Alpha | 0.6 | |
| `PrioritySortingWithinCohort` | `true` | Beta | 0.6 | |
| `LendingLimit` | `false` | Alpha | 0.6 | 0.8 |
| `LendingLimit` | `true` | Beta | 0.9 | |
| `MultiplePreemptions` | `false` | Alpha | 0.8 | 0.8 |
| `MultiplePreemptions` | `true` | Beta | 0.9 | |
| Feature | Default | Stage | Since | Until |
|-----------------------------------|---------|-------|-------|-------|
| `FlavorFungibility` | `true` | Beta | 0.5 | |
| `MultiKueue` | `false` | Alpha | 0.6 | |
| `MultiKueueBatchJobWithManagedBy` | `false` | Alpha | 0.8 | |
| `PartialAdmission` | `false` | Alpha | 0.4 | 0.4 |
| `PartialAdmission` | `true` | Beta | 0.5 | |
| `ProvisioningACC` | `false` | Alpha | 0.5 | 0.6 |
| `ProvisioningACC` | `true` | Beta | 0.7 | |
| `QueueVisibility` | `false` | Alpha | 0.5 | |
| `VisibilityOnDemand` | `false` | Alpha | 0.6 | |
| `PrioritySortingWithinCohort` | `true` | Beta | 0.6 | |
| `LendingLimit` | `false` | Alpha | 0.6 | 0.8 |
| `LendingLimit` | `true` | Beta | 0.9 | |
| `MultiplePreemptions` | `false` | Alpha | 0.8 | 0.8 |
| `MultiplePreemptions` | `true` | Beta | 0.9 | |

## What's next

42 changes: 21 additions & 21 deletions site/content/en/docs/reference/metrics.md
@@ -13,34 +13,34 @@ of the system and the status of [ClusterQueues](/docs/concepts/cluster_queue).

Use the following metrics to monitor the health of the kueue controllers:

| Metric name | Type | Description | Labels |
| ----------- | ---- | ----------- | ------ |
| `kueue_admission_attempts_total` | Counter | The total number of attempts to [admit](/docs/concepts#admission) workloads. Each admission attempt might try to admit more than one workload. | `result`: possible values are `success` or `inadmissible` |
| `kueue_admission_attempt_duration_seconds` | Histogram | The latency of an admission attempt. | `result`: possible values are `success` or `inadmissible` |
| Metric name | Type | Description | Labels |
|--------------------------------------------|-----------|------------------------------------------------------------------------------------------------------------------------------------------------|-----------------------------------------------------------|
| `kueue_admission_attempts_total` | Counter | The total number of attempts to [admit](/docs/concepts#admission) workloads. Each admission attempt might try to admit more than one workload. | `result`: possible values are `success` or `inadmissible` |
| `kueue_admission_attempt_duration_seconds` | Histogram | The latency of an admission attempt. | `result`: possible values are `success` or `inadmissible` |

## ClusterQueue status

Use the following metrics to monitor the status of your ClusterQueues:

| Metric name | Type | Description | Labels |
| ----------- | ---- | ----------- | ------ |
| `kueue_pending_workloads` | Gauge | The number of pending workloads. | `cluster_queue`: the name of the ClusterQueue<br> `status`: possible values are `active` or `inadmissible` |
| `kueue_quota_reserved_workloads_total` | Counter | The total number of quota reserved workloads. | `cluster_queue`: the name of the ClusterQueue |
| `kueue_quota_reserved_wait_time_seconds` | Histogram | The time between a workload was created or requeued until it got quota reservation. | `cluster_queue`: the name of the ClusterQueue |
| `kueue_admitted_workloads_total` | Counter | The total number of admitted workloads. | `cluster_queue`: the name of the ClusterQueue |
| `kueue_evicted_workloads_total` | Counter | The total number of evicted workloads. | `cluster_queue`: the name of the ClusterQueue<br> `reason`: Possible values are `Preempted`, `PodsReadyTimeout`, `AdmissionCheck`, `ClusterQueueStopped` or `InactiveWorkload` |
| `kueue_admission_wait_time_seconds` | Histogram | The time between a workload was created or requeued until admission. | `cluster_queue`: the name of the ClusterQueue |
| `kueue_admission_checks_wait_time_seconds` | Histogram | The time from when a workload got the quota reservation until admission. | `cluster_queue`: the name of the ClusterQueue |
| `kueue_admitted_active_workloads` | Gauge | The number of admitted Workloads that are active (unsuspended and not finished) | `cluster_queue`: the name of the ClusterQueue |
| `kueue_cluster_queue_status` | Gauge | Reports the status of the ClusterQueue | `cluster_queue`: The name of the ClusterQueue<br> `status`: Possible values are `pending`, `active` or `terminated`. For a ClusterQueue, the metric only reports a value of 1 for one of the statuses. |
| Metric name | Type | Description | Labels |
|--------------------------------------------|-----------|-------------------------------------------------------------------------------------|--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| `kueue_pending_workloads` | Gauge | The number of pending workloads. | `cluster_queue`: the name of the ClusterQueue<br> `status`: possible values are `active` or `inadmissible` |
| `kueue_quota_reserved_workloads_total` | Counter | The total number of quota reserved workloads. | `cluster_queue`: the name of the ClusterQueue |
| `kueue_quota_reserved_wait_time_seconds` | Histogram | The time between a workload was created or requeued until it got quota reservation. | `cluster_queue`: the name of the ClusterQueue |
| `kueue_admitted_workloads_total` | Counter | The total number of admitted workloads. | `cluster_queue`: the name of the ClusterQueue |
| `kueue_evicted_workloads_total` | Counter | The total number of evicted workloads. | `cluster_queue`: the name of the ClusterQueue<br> `reason`: Possible values are `Preempted`, `PodsReadyTimeout`, `AdmissionCheck`, `ClusterQueueStopped` or `InactiveWorkload` |
| `kueue_admission_wait_time_seconds` | Histogram | The time between a workload was created or requeued until admission. | `cluster_queue`: the name of the ClusterQueue |
| `kueue_admission_checks_wait_time_seconds` | Histogram | The time from when a workload got the quota reservation until admission. | `cluster_queue`: the name of the ClusterQueue |
| `kueue_admitted_active_workloads` | Gauge | The number of admitted Workloads that are active (unsuspended and not finished) | `cluster_queue`: the name of the ClusterQueue |
| `kueue_cluster_queue_status` | Gauge | Reports the status of the ClusterQueue | `cluster_queue`: The name of the ClusterQueue<br> `status`: Possible values are `pending`, `active` or `terminated`. For a ClusterQueue, the metric only reports a value of 1 for one of the statuses. |

### Optional metrics

The following metrics are available only if `metrics.enableClusterQueueResources` is enabled in the [manager's configuration](/docs/installation/#install-a-custom-configured-released-version).

| Metric name | Type | Description | Labels |
| ----------- | ---- | ----------- | ------ |
| `kueue_cluster_queue_resource_usage` | Gauge | Reports the ClusterQueue's total resource usage |`cohort`: The cohort in which the queue belongs<br> `cluster_queue`: The name of the ClusterQueue<br> `flavor`: referenced flavor<br> `resource`: The resource name|
| `kueue_cluster_queue_nominal_quota` | Gauge | Reports the ClusterQueue's resource quota |`cohort`: The cohort in which the queue belongs<br> `cluster_queue`: The name of the ClusterQueue<br> `flavor`: referenced flavor<br> `resource`: The resource name|
| `kueue_cluster_queue_borrowing_limit` | Gauge | Reports the ClusterQueue's resource borrowing limit |`cohort`: The cohort in which the queue belongs<br> `cluster_queue`: The name of the ClusterQueue<br> `flavor`: referenced flavor<br> `resource`: The resource name|
| `kueue_cluster_queue_weighted_share` | Gauge | Reports a value that representing the maximum of the ratios of usage above nominal quota to the lendable resources in the cohort, among all the resources provided by the ClusterQueue. |`cluster_queue`: The name of the ClusterQueue|
| Metric name | Type | Description | Labels |
|---------------------------------------|--------|-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|---------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| `kueue_cluster_queue_resource_usage` | Gauge | Reports the ClusterQueue's total resource usage | `cohort`: The cohort in which the queue belongs<br> `cluster_queue`: The name of the ClusterQueue<br> `flavor`: referenced flavor<br> `resource`: The resource name |
| `kueue_cluster_queue_nominal_quota` | Gauge | Reports the ClusterQueue's resource quota | `cohort`: The cohort in which the queue belongs<br> `cluster_queue`: The name of the ClusterQueue<br> `flavor`: referenced flavor<br> `resource`: The resource name |
| `kueue_cluster_queue_borrowing_limit` | Gauge | Reports the ClusterQueue's resource borrowing limit | `cohort`: The cohort in which the queue belongs<br> `cluster_queue`: The name of the ClusterQueue<br> `flavor`: referenced flavor<br> `resource`: The resource name |
| `kueue_cluster_queue_weighted_share` | Gauge | Reports a value that representing the maximum of the ratios of usage above nominal quota to the lendable resources in the cohort, among all the resources provided by the ClusterQueue. | `cluster_queue`: The name of the ClusterQueue |
