Kubeflow PRs for Group Support: Enabling RBAC Based on User Groups #2910
Comments
@axel7083 it's really great to see how you are approaching this e2e and touching all KF components. Really great work! I think we'll also need to start including in the mix how we'll end up having headers and support in VirtualServices as well. The main problem being how to parse list values (since a user could belong in multiple groups). In any case would you be able to attend the Notebooks WG meeting on June 8 and give an overview of your current work? This will really help ensure we are all on the same page and coordinate the last pieces and also have a proposal that summarizes this architecture. |
@kimwnasptd In a related discussion, it's always been strange that we pass the user-id header around as plain text, rather than using a JWT and extracting the user-id from it. Our current user-id header means that we have to be extremely careful about restricting which traffic can reach the pods, because "impersonating" a user is as simple as passing their user-id in the header. If we swapped to JWT Also, JWT has a claim for "groups", which we can use instead of adding a group-ids header, with similar trust benefits. |
@thesuperzapper exactly! This is where I'm also getting at |
@kimwnasptd Also, using JWT only removes the need to add a specific "groups" header; it does not negate the need to implement group-checking in each of the apps (because right now, everything is only related to the user-id header). Regarding the specific implementation (for group-checking) proposed by @axel7083, I am hesitant to use the Groups entity in Kubernetes RBAC, because I believe it's not really related to the higher-level concept of a "group of Kubeflow users". That is to say, the KFAM API currently creates role bindings like This is because our role bindings ALSO affect the cluster itself. If the user-id happens to correspond to the User which the cluster-level auth ( Alternatively, we can keep using
## for "users" (like we currently do, but with a prefix)
subjects:
  - apiGroup: rbac.authorization.k8s.io
    kind: User
    name: "user.kubeflow.org:{USER_ID}"
## for "user groups"
subjects:
  - apiGroup: rbac.authorization.k8s.io
    kind: User
    name: "groups.kubeflow.org:{GROUP_ID}"
## for "kubeflow service accounts" (just spitballing here)
subjects:
  - apiGroup: rbac.authorization.k8s.io
    kind: User
    name: "sa.kubeflow.org:{GROUP_ID}"
Obviously, this would break for people who are currently relying on these role bindings to give |
The oidc-authservice has built-in support for JWT tokens, but its default behavior does not use them. While investigating possible ways to improve the authorization system in Kubeflow to integrate group support, I used oauth2-proxy as an alternative. I opened a PR in the manifests repository, since it does not require any modification of the components' code and can replace the oidc-authservice: Adding oauth2-proxy as optional alternative to oidc-authservice. (I know that @kimwnasptd already took a look at it, since it has been added to the A notable advantage of using JWT tokens is that Istio supports them in AuthorizationPolicy. Here is an example provided by their documentation:
apiVersion: security.istio.io/v1
kind: AuthorizationPolicy
metadata:
  name: require-jwt
  namespace: foo
spec:
  selector:
    matchLabels:
      app: httpbin
  action: ALLOW
  rules:
    - from:
        - source:
            requestPrincipals: ["testing@secure.istio.io/testing@secure.istio.io"]
      when:
        - key: request.auth.claims[groups]
          values: ["group1"]
In the current implementation I made across all the components, I had to make a Example if the group required is
This is far from nice, but receiving kubeflow-groups as a comma-separated list of strings, as the oidc-authservice provides it, forces us to use a workaround. |
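As an illustration of the comma-separated `kubeflow-groups` value mentioned above, here is a minimal sketch of how an application backend could turn it into a list before checking membership. The function names are hypothetical, and the sketch assumes group names never contain a comma themselves, which the header format cannot express anyway.

```python
def parse_groups_header(raw):
    """Split a comma-separated groups header into an ordered, de-duplicated list."""
    if not raw:
        return []
    groups = []
    for item in raw.split(","):
        name = item.strip()
        # Skip empty entries (e.g. "a,,b") and repeated names.
        if name and name not in groups:
            groups.append(name)
    return groups


def has_group(raw, required):
    """Exact membership test; avoids substring matches such as group1 vs group10."""
    return required in parse_groups_header(raw)
```

Matching on the parsed list rather than on the raw string is what avoids false positives like `group1` matching `group10`, which is the kind of problem a string-based match has to work around.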
Thanks for your comment and the details on the possible implications for the security of the cluster itself! I have never worked with such high-level concepts in Kubernetes, so I may lack knowledge on these matters; I really appreciate it! First, regarding the idea of using a given prefix to define a group while using the User object: I do not understand how, given for example a JWT token with a user-id and groups, the subject review would be done to ensure authorization. Would you loop over all the user's groups and make a subject review for each of them, treating each as a User? The advantage of using Groups in Kubernetes is that the subject review is easier, since it can be made in a single request. If the concern is about naming, and the risk of colliding with an existing internal Kubernetes group name, the prefix you suggest might be the solution? If the idea for the user would be to propagate something like |
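The "one request" point above can be sketched as follows: a Kubernetes `SubjectAccessReview` spec carries both a `user` and a list of `groups`, so a single review covers the user and every group they belong to. The helper name and the example resource are assumptions for illustration; only the manifest fields come from the Kubernetes API.

```python
def subject_access_review(user, groups, namespace, verb, resource, api_group=""):
    """Build one SubjectAccessReview body covering the user and all their groups."""
    return {
        "apiVersion": "authorization.k8s.io/v1",
        "kind": "SubjectAccessReview",
        "spec": {
            "user": user,
            # All groups go into the same review; no per-group loop is needed.
            "groups": list(groups),
            "resourceAttributes": {
                "namespace": namespace,
                "verb": verb,
                "group": api_group,
                "resource": resource,
            },
        },
    }


# Hypothetical usage: can this user, via any of their groups, list notebooks?
review = subject_access_review(
    user="alice@example.com",
    groups=["team-a", "team-b"],
    namespace="team-a-profile",
    verb="list",
    resource="notebooks",
    api_group="kubeflow.org",
)
```

If groups were instead modelled as `User` subjects, the caller would have to submit one review per group and combine the answers itself.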
@axel7083 you are right that we would benefit from using the Groups for bindings, as I guess the same idea of adding a prefix (to prevent collisions with real groups) works in group names too, for example:
## for "user groups"
subjects:
  - apiGroup: rbac.authorization.k8s.io
    kind: Group
    name: "groups.kubeflow.org:{GROUP_ID}"
Also, I agree that replacing But using However, you are correct that the istio |
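The prefix idea from the comments above can be sketched as a small helper that builds RBAC subjects. The `user.kubeflow.org:` and `groups.kubeflow.org:` prefixes are the ones floated in this thread, not an adopted convention, and the function names are hypothetical.

```python
USER_PREFIX = "user.kubeflow.org:"
GROUP_PREFIX = "groups.kubeflow.org:"
RBAC_API_GROUP = "rbac.authorization.k8s.io"


def user_subject(user_id):
    """RBAC subject for a Kubeflow user, namespaced by prefix."""
    return {"apiGroup": RBAC_API_GROUP, "kind": "User", "name": USER_PREFIX + user_id}


def group_subject(group_id):
    """RBAC subject for a Kubeflow user group, namespaced by prefix."""
    return {"apiGroup": RBAC_API_GROUP, "kind": "Group", "name": GROUP_PREFIX + group_id}


def prefixed_identity(user_id, group_ids):
    """Identity to place in a subject review so it matches the prefixed bindings."""
    return {
        "user": USER_PREFIX + user_id,
        "groups": [GROUP_PREFIX + g for g in group_ids],
    }
```

The same prefixes have to be applied on both sides: when the bindings are created and when the identity is submitted for review. Otherwise the names never match.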
@thesuperzapper I am a bit skeptical about making each Kubeflow resource responsible for parsing the JWT tokens; it adds a lot of complexity. But the question is very interesting.
Possible issues with JWT tokens
The idea of having the user-id and user-groups in plain text bothers me too.
Improving the authorization and resources access system
Maybe we should take another approach. Some ideas:
I Remove the Subject review logic from kubeflow components
Instead of having each component make the subject review for resources, we could create a component responsible for that. It can have a simple endpoint Each Kubeflow resource would then simply call the authorize endpoint, forwarding the authorization header, and add the This can be a pretty nice solution:
Maybe do not propagate the JWT token
Instead of propagating the JWT token and asking Kubeflow components to forward it back to the authorization component, we could propagate some kind of meaningless value (a UUID, random text, or an opaque token) that the Kubeflow components would have to forward to the authorization endpoint. This would add some complexity, since it would require a component ahead of the A benefit would be that if a user makes a request to a Kubeflow resource with a long-lived token, that token would not be propagated in the cluster, reducing the risk of leaking it. The propagated tokens could then have a short lifespan.
II Components sidecar
Still in the same idea of removing the JWT parsing from the internal component logic, having a sidecar container which can either be parsing the JWT and sending the From a security point of view, an attacker could call Kubeflow resources from an internal component and put whatever header they want to impersonate anyone. |
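The opaque-token variant described above could look roughly like this: a component at the edge swaps the user's long-lived JWT for a short-lived random token, and the authorization endpoint resolves that token back to an identity. This is an in-memory sketch under stated assumptions (single process, hypothetical class and method names, injectable clock for testing), not a design that exists in Kubeflow.

```python
import secrets
import time


class OpaqueTokenStore:
    """Maps short-lived random tokens to the identity they were issued for."""

    def __init__(self, ttl_seconds=60.0, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries = {}

    def issue(self, user, groups):
        """Called at the edge after the real JWT has been validated."""
        token = secrets.token_urlsafe(32)
        self._entries[token] = (self._clock() + self._ttl, user, tuple(groups))
        return token

    def resolve(self, token):
        """Called by the authorization endpoint; returns None if unknown or expired."""
        entry = self._entries.get(token)
        if entry is None:
            return None
        expires_at, user, groups = entry
        if self._clock() >= expires_at:
            # Expired tokens are dropped so they cannot be replayed later.
            del self._entries[token]
            return None
        return {"user": user, "groups": list(groups)}
```

The token itself carries no information, so leaking it inside the cluster exposes nothing beyond its short lifetime; the trade-off, as noted in the comment, is an extra component that every request depends on.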
While I agree we should discuss how best to dynamically get groups from the user's JWT, I also wanted to share that deployKF has a "virtual group" concept, where you can define groups of users and assign them to specific profiles and access levels. They are "virtual" in the sense that while you define groups, they are rendered down to specific user/email access bindings. See the |
Just pitching in here with my two cents. I am also not a huge fan of the user and group headers as authentication, since impersonation is incredibly easy, intercepting valid users is similarly easy, it doesn't work with other non-Kubeflow services, etc. Instead, I would love to use simple JWT tokens using regular OIDC. Regarding the possible issues with using JWTs that @axel7083 mentions:
[...]
It would make sense to create a library that handles shared functionality and configuration, reducing the need for extensive code rewriting with updates and making it much easier to create new components that integrate with Kubeflow. Since most (all?) Kubeflow components live in the same namespace, you could just create a single ConfigMap/Secret which contains the auth configuration and is shared among services.
I think validation of the JWT should be configurable in each component. If we don't want each component to be responsible for validating JWTs, I think the first suggestion of a separate authorization service would make the most sense. Perhaps it could be built into KFAM, since it seems to be somewhat related to that component, and we wouldn't have to create a new component. I am not a huge fan of the suggestion to pass some randomly generated string between Kubeflow components for auth. This would have many of the same issues as the current situation with user/group headers, such as making it extremely annoying to integrate non-standard Kubeflow components into the mix. The suggestion about an authorization sidecar seems like a bad choice as a long-term solution. The Kubernetes community is, generally speaking, coming to regard sidecars as a bad practice, since they consume a lot of unnecessary resources (many are surprised by how many resources Istio sidecars reserve, and therefore cost), they don't really increase security, and day-2 operations get very annoying. P.S. Would it make sense to create a new issue about using JWTs for authentication in favor of user/group headers, and close this issue as "not planned", since the discussion has diverged quite a bit from the title? |
@AndersBennedsgaard thank you very much for the details and explanation |
For anyone following this issue, I've prepared a first proposal for how to work with JWTs on Kubeflow in #2748 After a proposal on this topic is approved we can keep pushing the discussion on groups. I'm preparing a WIP proposal for groups, which will be just a first step, but let's focus initially on handling JWTs and then we can handle groups. |
Since 1.9.1, the default manifests use oauth2-proxy and send a The This means that we can now start the process of reading this header when:
First, we must allow users to be assigned to a profile based on the content of their We need to update the KFAM API to additionally support a
We also need to introduce a format for a group RoleBinding that can be read by KFAM and used by SubjectAccessReviews. I think the new group RoleBinding should look like this:
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: group-<HASH_OF_GROUP_NAME>-clusterrole-<GROUP_ROLE>
  namespace: <PROFILE_NAME>
  annotations:
    role: <GROUP_ROLE> # "edit" or "view"
    group: <RAW_GROUP_NAME>
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: kubeflow-<GROUP_ROLE>
subjects:
  - apiGroup: rbac.authorization.k8s.io
    kind: Group
    name: <RAW_GROUP_NAME>
For context, here is what the user RoleBindings look like:
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: user-<SAFE_USER_EMAIL>-clusterrole-<USER_ROLE>
  namespace: <PROFILE_NAME>
  annotations:
    ## NOTE: KFAM only reads these annotations when checking the user's level of access
    ## https://github.com/kubeflow/kubeflow/blob/v1.9.1/components/access-management/kfam/bindings.go#L195-L238
    role: <USER_ROLE>
    user: <RAW_USER_EMAIL>
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: kubeflow-<USER_ROLE>
subjects:
  - apiGroup: rbac.authorization.k8s.io
    kind: User
    name: <RAW_USER_EMAIL>
After that, we can update the central dashboard to send the group information when it calls KFAM, and interpret the responses correctly. Here is the code for the central-dashboard's KFAM proxy: Then we would need to update the: main page, namespace selector, and manage contributors pages: We probably also want to allow each "link" on the sidebar of the dashboard to mark itself as supporting groups or not: Here are some tips for people wanting to work on updating Central Dashboard:
|
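The group RoleBinding format proposed in the comment above can be sketched as a small generator. The comment only says the name contains a hash of the group name; the use of SHA-256 truncated to 16 hex characters here is an assumption for illustration, as is the function name.

```python
import hashlib


def group_role_binding(profile, group, role):
    """Build the proposed group RoleBinding manifest as a plain dict."""
    if role not in ("edit", "view"):
        raise ValueError("role must be 'edit' or 'view'")
    # Assumption: any stable hash yielding a DNS-safe name would do here.
    digest = hashlib.sha256(group.encode("utf-8")).hexdigest()[:16]
    return {
        "apiVersion": "rbac.authorization.k8s.io/v1",
        "kind": "RoleBinding",
        "metadata": {
            "name": f"group-{digest}-clusterrole-{role}",
            "namespace": profile,
            # The raw group name is kept in an annotation, mirroring the
            # existing user bindings where the annotations hold the raw email.
            "annotations": {"role": role, "group": group},
        },
        "roleRef": {
            "apiGroup": "rbac.authorization.k8s.io",
            "kind": "ClusterRole",
            "name": f"kubeflow-{role}",
        },
        "subjects": [
            {
                "apiGroup": "rbac.authorization.k8s.io",
                "kind": "Group",
                "name": group,
            }
        ],
    }
```

Hashing keeps the object name valid even when the raw group name contains characters that Kubernetes object names do not allow, while the annotation and the subject still carry the original value.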
/transfer manifests |
Descriptions
I am opening a dedicated issue to track the PRs opened for group support.
The goal is to add support for groups in Kubeflow, which currently only supports RBAC based on user-id. The oidc-authservice sends the "kubeflow-userid" and "kubeflow-groups" headers, but the "kubeflow-groups" header is currently ignored.
Why
Supporting groups would facilitate the integration of Kubeflow into existing systems, without the need to create, maintain, or delete role bindings for each user.
Current Status
kubeflow/kubeflow
kubeflow/pipeline
kubeflow/manifest
kubeflow/katib
kserve/models-web-app
This does not require any modification: in its Dockerfile, it clones the kubeflow/kubeflow repository and uses components/crud-web-apps/common/backend
Related issues
kubeflow/dashboard#42
kubeflow/kubeflow#5071
kubeflow/kubeflow#4998
/kind feature
/area multiuser