[SIP-149] Proposal for Kubernetes Operator for Apache Superset #31408
Comments
Thank you for the proposal @villebro. Do we plan to officially support this operator for official releases? If yes, could you enhance the SIP explaining how the Release Process would be affected?
@michael-s-molina thanks for the feedback. Version support has not been a major issue in the current Helm chart, as it's mostly decoupled from the Superset release process. However, you're right that major changes, like the introduction/removal of new worker types, would definitely cause a breaking change in the operator, too. I will add a section to cover this.
Thanks. Please consider any necessary changes to RELEASING/README.md.
@michael-s-molina I think it's actually mostly relevant for the SIP process, rather than the release process. Any major breaking changes or new advanced features that affect how Superset is deployed may affect how the Docker image is built, our Docker Compose flows, and ultimately the Kubernetes deployment model. A few examples:
Therefore, major changes should be handled as follows:
Do these typically live in a mono-repo or in their own repo?
@mistercrunch I would place this in a separate repo, similar to what Flink is doing: https://github.com/apache/flink-kubernetes-operator (I would suggest following this pattern: Edit: I added a note about this in the proposal.
Probably fine to use the https://github.com/apache-superset/ org for this. That way you get admin rights, and we don't have to consider this tool/repo an ASF-sanctioned thing that carries all the ASF-related constraints and guarantees.
In some ways this would also mean that we don't really require a SIP or the SIP process.
Some pros/cons that come to mind:
I would personally vote to keep this under the ASF GitHub org, but I'm not super opinionated, so I can probably be convinced the other way, too.
Makes sense, though from my understanding the ASF and its participants can't really officially stamp things like a Docker image, since it includes all sorts of other binaries that we can't/shouldn't certify for legal reasons. The only binaries that are official are the tarballs. As long as it's a "recipe" and not a meal it's fine: a Dockerfile is fair game, but the Docker image itself, with a bunch of other binaries in it, is something we can't officially certify or distribute. I'm guessing the k8s Operator would be mostly a recipe, which would be fine.
@villebro, I believe this proposal also moves away from Helm finalizers and adds custom finalizers and ownerReferences for each deployable Superset manifest, giving complete lifecycle management of the state declared in the CRD.
[SIP-149] Proposal for Kubernetes Operator for Apache Superset
Motivation
Apache Superset's Helm chart [1] [2] is widely used and receives regular contributions, reflecting the popularity of Kubernetes-based deployments within the community. However, Helm's reliance on static templates, duplicated code, lack of built-in testing frameworks, and limited support for advanced lifecycle management make maintenance of the Helm chart opaque and error-prone, and can create significant downtime risk in large-scale deployments that rely on it.
This proposal introduces a Kubernetes Operator [3] (hereafter referred to as "the Operator"), offering a Kubernetes-native approach to managing Superset deployments. The Operator will provide similar configuration options to the Helm chart, while addressing its limitations and introducing features like better testing, observability and automation. This proposal aligns with the approach taken by other Apache projects, such as Apache Flink [4] [5] and Apache Druid [6] [7], whose communities have embraced operators to manage their deployments more effectively.
Proposed Change
The Operator will introduce a Custom Resource Definition (CRD) [8] for managing Superset deployments declaratively. Key features include:
The Operator would be placed in a separate repo under the Apache GitHub org, preferably /apache/superset-kubernetes-operator. This would make it easier to maintain dedicated CI workflows, and would also decrease traffic on the main repo by having its own set of Releases, PRs and Issues.
New or Changed Public Interfaces
values.yaml in the current Helm chart.
Figure 1. A Superset deployment based on the current Helm chart, where Helm renders manifests based on the values.yaml file and Helm chart, and applies them to the target namespace.
Figure 2. Diagram depicting the proposed Operator-based flow, where the Operator is deployed in its own namespace and continuously reconciles the desired state in the custom Superset resources. The CRD ensures that the Superset manifests are valid and applies defaults as needed.
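The reconcile flow in Figure 2 can be illustrated with a minimal Go sketch using only the standard library. Everything named here (SupersetSpec, Deployment, applyDefaults, reconcile, the default of one replica, and the placeholder image name) is a hypothetical simplification for illustration; the proposal does not define the CRD schema, and a real Operator would be built on the Operator SDK rather than on plain structs.

```go
package main

import "fmt"

// SupersetSpec is a hypothetical, heavily simplified desired state for one
// Superset custom resource. The real CRD schema is not defined in the proposal.
type SupersetSpec struct {
	Image    string
	Replicas int // 0 means "not set", so a default is applied
}

// Deployment stands in for the Kubernetes objects the Operator would manage.
type Deployment struct {
	Image    string
	Replicas int
}

// applyDefaults fills in unset fields, mimicking the defaulting step
// attributed to the CRD. The default value of 1 is an assumption.
func applyDefaults(spec SupersetSpec) SupersetSpec {
	if spec.Replicas == 0 {
		spec.Replicas = 1
	}
	return spec
}

// reconcile derives the desired Deployment from the spec and reports whether
// the observed state (nil if nothing exists yet) has to be changed to match it.
func reconcile(spec SupersetSpec, observed *Deployment) (Deployment, bool) {
	spec = applyDefaults(spec)
	desired := Deployment{Image: spec.Image, Replicas: spec.Replicas}
	if observed != nil && *observed == desired {
		return desired, false // already converged, nothing to do
	}
	return desired, true
}

func main() {
	spec := SupersetSpec{Image: "example/superset:placeholder"}

	// First pass: nothing exists yet, so the Deployment is created.
	current, changed := reconcile(spec, nil)
	fmt.Println("create:", changed, current)

	// Second pass: the observed state matches, so nothing changes.
	_, changed = reconcile(spec, &current)
	fmt.Println("steady:", changed)

	// Someone scales the Deployment by hand; the next pass reverts the drift.
	drifted := Deployment{Image: current.Image, Replicas: 3}
	current, changed = reconcile(spec, &drifted)
	fmt.Println("drift corrected:", changed, current)
}
```

The point of the sketch is the contrast with Figure 1: Helm renders and applies manifests once per release, whereas a reconcile function is called repeatedly and converges the observed state back to the declared one.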
Changes to SIP and Release Process
To ensure breaking changes to Superset are handled by the Operator, the following changes would need to be made to existing processes:
To keep the releases of Superset and the Operator aligned, we would ensure that all currently supported Superset versions are backed by an Operator release. As we're officially maintaining "the latest minor of the last two majors" [10], the Operator would also support these. At the time of writing that would mean 4.1 and 3.1. Note that the Operator version would not track the official Superset version, as breaking changes that require changes to the Operator are fairly uncommon.
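As a rough illustration of the "latest minor of the last two majors" rule, the following Go sketch computes the supported set from a list of released major.minor versions. The Version type, the supported function, and the input list are assumptions made for this example only; they are not part of the Superset release tooling.

```go
package main

import (
	"fmt"
	"sort"
)

// Version is a major.minor pair; patch releases are ignored in this sketch.
type Version struct{ Major, Minor int }

// supported returns the latest minor of the last two majors, newest major first.
func supported(released []Version) []Version {
	latest := map[int]int{} // major -> highest minor seen so far
	for _, v := range released {
		if minor, ok := latest[v.Major]; !ok || v.Minor > minor {
			latest[v.Major] = v.Minor
		}
	}

	majors := make([]int, 0, len(latest))
	for major := range latest {
		majors = append(majors, major)
	}
	sort.Sort(sort.Reverse(sort.IntSlice(majors)))
	if len(majors) > 2 {
		majors = majors[:2] // keep only the two newest majors
	}

	out := make([]Version, 0, len(majors))
	for _, major := range majors {
		out = append(out, Version{major, latest[major]})
	}
	return out
}

func main() {
	// Illustrative input only.
	released := []Version{{2, 1}, {3, 0}, {3, 1}, {4, 0}, {4, 1}}
	for _, v := range supported(released) {
		fmt.Printf("%d.%d\n", v.Major, v.Minor)
	}
}
```

With this input the program prints 4.1 and then 3.1, which matches the two versions named above as supported at the time of writing.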
New dependencies
The Operator will rely on the Go-based Operator SDK [11] for its implementation and testing framework. Beyond this, it will share the same core dependencies as the existing Helm chart, such as Kubernetes APIs and configurations, but without requiring Helm as a dependency.
Migration Plan and Compatibility
Migrating from the Helm chart to the Operator will be straightforward, as the Operator’s CRD will closely align with the structure of the current values.yaml used in the Helm chart. Additionally, the resources created by the Operator will closely mimic those generated by the Helm chart, ensuring consistency and familiarity. Administrators already familiar with managing Superset via Helm will find the transition intuitive.
Benefits
Proposed Operator Scope and Deprecation of Helm Chart
We propose deprecating the Helm chart once the Operator is deemed stable to avoid the burden of maintaining both. The Operator will also exclude reconciliation support for PostgreSQL and Redis. Users can continue using Helm for these services or adopt dedicated operators [12] [13], ensuring a more focused approach for managing Superset.
Rejected Alternatives