Description
This expression fires the Watchdog alert only if the TSDB is up to date and therefore checks the functionality of the full stack.
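As a rough illustration of the idea only, a Watchdog rule gated on data freshness might be shaped like the sketch below. The expression, the use of the `up` metric, and the 120-second threshold are assumptions made for this example and are not taken from the change itself; the group name, alert name, and `severity: none` label follow the existing Watchdog rule.

```yaml
groups:
  - name: general.rules
    rules:
      - alert: Watchdog
        # Hypothetical expression, for illustration only.
        # vector(1) alone always returns a sample. The "and on()" clause keeps
        # it only while the newest "up" sample readable from the TSDB is
        # younger than 120 seconds (assumed threshold). If ingestion stops,
        # the right-hand side becomes empty and the Watchdog stops firing.
        expr: vector(1) and on() (time() - max(timestamp(up)) < 120)
        labels:
          severity: none
```

Any metric that is scraped continuously could take the place of `up` here; the point is only that the result depends on recent data being readable from the TSDB.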
According to the Runbook, the intention of the Watchdog alert is, quote:
> This is an alert meant to ensure that the entire alerting pipeline is functional.

`vector(1)` fires whenever the Alertmanager and Prometheus Pods are up and running. When either of them is down, the Watchdog stops firing, so with this expression it serves its purpose as intended.

However, when the TSDB storage runs full, a Watchdog based on `vector(1)` will still fire, but all alerts whose expressions depend on a metric will not trigger, because those metrics are now missing. In short, a dysfunctional Prometheus stack goes completely unnoticed.

This may also happen in the (admittedly unlikely) case that the storage fills up faster than Prometheus can trigger a `KubePersistentVolumeFillingUp` alert.

Type of change
What type of changes does your code introduce to kube-prometheus? Put an `x` in the box that applies.

- `CHANGE` (fix or feature that would cause existing functionality to not work as expected)
- `FEATURE` (non-breaking change which adds functionality)
- `BUGFIX` (non-breaking change which fixes an issue)
- `ENHANCEMENT` (non-breaking change which improves existing functionality)
- `NONE` (if none of the other choices apply; for example, tooling, build system, CI, docs, etc.)

Changelog entry
Please put a one-line changelog entry below. Later this will be copied to the changelog file.