
Propose security model for impersonation/tenancy #582

Closed

Conversation

@stealthybox (Member) commented Dec 11, 2020

This proposal describes a potential API design with supporting controller implementation details for secure, multi-tenant API primitives within Flux.

The examples clarify the utility of certain changes which may at first seem esoteric.
This implementation should allow people using and building platforms on Flux to make concise choices toward implementing their own security needs and behavioral constraints.

Ideally this document and the surrounding conversations can graduate to becoming documentation for various audiences.

The ideas represented here are the work of many folks -- not limited to @stefanprodan @squaremo @hiddeco @jonathan-innis and myself.
For previous comments, see #263

@stealthybox stealthybox force-pushed the security-model branch 2 times, most recently from 0d3e27d to bccad66 Compare December 11, 2020 05:16
Signed-off-by: leigh capili <leigh@null.net>
@squaremo (Member) left a comment

This is very thorough and well thought through*, excellent work Leigh.

I've given suggestions for tightening it up; I don't have substantial disagreement with anything that's suggested here. I'll give my thoughts on the particular decisions in the main thread.

(*not a poet)


Flux 2's installation model is built from CRDs.
This allows cluster operators to granularly control who can create/modify Sources and Applier type objects such as
GitRepository and Kustomization and for which Namespaces.
Member

Suggested change
GitRepository and Kustomization and for which Namespaces.
GitRepository and Kustomization, and for which Namespaces.
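
For illustration, granular control of this kind is plain RBAC against the Flux CRD API groups. A minimal sketch (the Role name and tenant Namespace are hypothetical, not part of the proposal):

apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: flux-tenant-editor   # hypothetical name
  namespace: tenant-a        # hypothetical tenant Namespace
rules:
  - apiGroups: ["source.toolkit.fluxcd.io"]
    resources: ["gitrepositories"]
    verbs: ["get", "list", "watch", "create", "update", "patch", "delete"]
  - apiGroups: ["kustomize.toolkit.fluxcd.io"]
    resources: ["kustomizations"]
    verbs: ["get", "list", "watch", "create", "update", "patch", "delete"]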

This allows cluster operators to granularly control who can create/modify Sources and Applier type objects such as
GitRepository and Kustomization and for which Namespaces.
This also helps out the Deployment model.
While we could for example, still run independent instances of ex:kustomize-controller in each Namespace listening to only
Member

I didn't catch what the shorthand ex:kustomize-controller stands for.

cluster-admin ClusterRoleBinding.
This means cluster owners need to be careful who can create Kustomizations and HelmReleases as well as which Sources are in the cluster
due to the exposure they provide to act as cluster-admin.
This restricts who can create Source and Apply type objects into a burdensome approval flow for cluster owners and provides
Member

Suggested change
This restricts who can create Source and Apply type objects into a burdensome approval flow for cluster owners and provides
This restricts who can create Source and Apply type objects into a burdensome approval flow for cluster owners, and provides

This means cluster owners need to be careful who can create Kustomizations and HelmReleases as well as which Sources are in the cluster
due to the exposure they provide to act as cluster-admin.
This restricts who can create Source and Apply type objects into a burdensome approval flow for cluster owners and provides
poor delegation of self-service permissions which is a proven elements of good platforms.
Member

This sentence provoked a double-take -- it's not clear from the syntax whether poor delegation of self-service permissions is a proven element of good platforms, or good delegation [...] would be. The meaning is obvious after the fact, but perhaps it could be rewritten to flow better.

This means cluster owners need to be careful who can create Kustomizations and HelmReleases as well as which Sources are in the cluster
due to the exposure they provide to act as cluster-admin.
This restricts who can create Source and Apply type objects into a burdensome approval flow for cluster owners and provides
poor delegation of self-service permissions which is a proven elements of good platforms.
Member

Suggested change
poor delegation of self-service permissions which is a proven elements of good platforms.
poor delegation of self-service permissions which is a proven element of good platforms.

This is sensible because the User for the remote cluster must match the source cluster's reconciling Namespace.
If no User is specified for the remote cluster, the default FluxUser is impersonated to fetch the sourceRef from the source cluster.


Member

Is it worth talking about transitive access to secrets via sources, here? E.g., should an impersonated user need access to a secret referenced by a source. If it's a non-issue, it'd be good to at least say why it's a non-issue, to cover that off.

Member Author

That is worth mentioning. I don't think a FluxUser with access to a Source should also have access to the Source's fetch credential Secret.
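
As a sketch of that separation (all names hypothetical), a viewer Role can expose the Source objects while deliberately omitting any rule for core Secrets:

apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: source-viewer-no-secrets
  namespace: tenant-a
rules:
  - apiGroups: ["source.toolkit.fluxcd.io"]
    resources: ["gitrepositories", "helmrepositories", "buckets"]
    verbs: ["get", "list", "watch"]
  # No rule for core "secrets": an impersonated FluxUser bound to this Role
  # can read the Source but not its fetch credential Secret.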

It's not sufficient to gate the creation of `HelmCharts` based off of a `sourceRef` permission check in `HelmRelease`;
RBAC rules can change at any time, so the source-controller API client should have the constraints of the impersonated User.

If a kubeConfig is specified for a resource that specifies a sourceRef, a separate client matching the same User or SA from the remote cluster is used in the source cluster.
Member

If I understand this correctly, I can see it causing confusion -- you need the same user to be defined in both clusters, with quite different permissions (in the source cluster, it needs to see all the sources; in the remote cluster, it needs to create things in the destination namespace).

An alternative might be to name the remote user to impersonate (and change the structure of the CRD to make this obvious). But: additional complexity, certainly.



#### sourceRefs
Controllers should impersonate the reconciler User when fetching sourceRefs.
Member

Did you consider the effect of removing cross-namespace references? It would still be necessary to impersonate; but perhaps easier to reason about.

Member

I think that this is probably going to be the most controversial part. Removing namespace references will make RBAC much simpler while at the same time making a lot of people angry, as they have built use cases around having the ability to do so. My guess is that we will not be able to remove it, so the solution needs to be as foolproof as possible.

Flux controllers will be using many more Kubernetes client Objects, informers, and caches.
Storing a map of Flux Users to clients/caches might be a good first step.
kubeConfig client/caches are not stored currently, which was an explicit decision to support refreshing clients,
this this doesn't have to be true for all kubeConfigs.
Member

Suggested change
this this doesn't have to be true for all kubeConfigs.
but this doesn't have to be true for all kubeConfigs.



#### opening Source policies with RBAC
Accessing Sources across Namespaces is denied by defaultl.
Member

Suggested change
Accessing Sources across Namespaces is denied by defaultl.
Accessing Sources across Namespaces is denied by default.

@stefanprodan (Member) commented Dec 15, 2020

How are we going to address multi-tenancy when connecting to Git, container registries and key management services using IAM roles? If we bind the IAM roles of all tenants to the Flux controllers pods, then a tenant can access another tenant's KMS keys, registries, etc.

similar to Pods specifying non-default serviceAccounts from the same Namespace.
- **Con**: RoleBinding against a FluxUser requires specifying the namespace in the username string which can be hard to do in a generic way with kustomize -- would need to document a kustomize example
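
A sketch of that con, assuming the proposed flux:user:{{namespace}}:{{spec.user}} name format and the default reconciler user: the tenant Namespace must be repeated inside the subject's name string, which plain kustomize cannot template for you:

apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: reconciler-admin
  namespace: tenant-a
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: admin
subjects:
  - kind: User
    apiGroup: rbac.authorization.k8s.io
    name: flux:user:tenant-a:reconciler   # the namespace appears again here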

Flux controllers will also impersonate two Groups.
@jonathan-innis (Contributor) commented Jan 14, 2021

I'm a bit unclear on who is adding the ClusterRoleBinding on the flux:users and flux:users:{{namespace}} groups to give the controllers impersonation permissions. Is the onus on the end-user to create this ClusterRoleBinding after the controllers are deployed or is there some proposed automated process to create these bindings?

Member

Please see https://github.com/fluxcd/flux2-multi-tenancy#onboard-tenants

TLDR: there is a flux create tenant command that creates the user or service account to be impersonated. At install time, the flux bootstrap command will set up the service account and RBAC for the controller pods to enable impersonation.
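
A sketch of the impersonation grant such a bootstrap step would need (the ClusterRole name is illustrative; the impersonate verb on users and groups is standard Kubernetes RBAC):

apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: flux-impersonator
rules:
  - apiGroups: [""]
    resources: ["users", "groups"]
    verbs: ["impersonate"]
    # Could be narrowed further with resourceNames, e.g. to specific flux:users groups.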

@jonathan-innis (Contributor)

💯 From the Microsoft Team. Looks good to us, thanks for the discussion @stealthybox

@squaremo (Member) commented Mar 4, 2021

I would love to see this implemented. Without wanting to erect more barriers in the way of that happening, I think there are some lingering worries over how much RBAC manipulation it would push onto users. Is it possible to show some worked examples, e.g., here is how the RBAC would need to be set up for multi-tenancy (and each tenant), here is how it works if you don't care about multi-tenancy. @stefanprodan I think you had a Helm-related scenario you weren't sure about, too?

@stealthybox (Member Author) commented Mar 8, 2021

here is how the RBAC would need to be set up for multi-tenancy (and each tenant)

The examples section attempts to cover this lightly.
The gist of it is that whoever creates/administers a Namespace RoleBinds any tenant FluxUser from any other Namespace to admin or whatever Role they please.

here is how it works if you don't care about multi-tenancy.

Give every FluxUser access to all Sources:

---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: flux-users-view-sources
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: flux-source-viewer
subjects:
  - kind: Group
    apiGroup: rbac.authorization.k8s.io
    name: flux:users

Going even further...

Give every FluxUser cluster-admin:

---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: flux-users-cluster-admin
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: cluster-admin
subjects:
  - kind: Group
    apiGroup: rbac.authorization.k8s.io
    name: flux:users

Opening up Sources cluster-wide could make sense to me for some niche use-cases.
For cluster-admin, I don't feel good about encouraging this kind of cluster-wide free-for-all.

Modifying flux create kustomization to output an admin RoleBinding for the default FluxUser would, I think, go a long way in helping people do the right thing without thinking about it at all.
We could add some flags to help add-on more "child" Namespace RoleBindings.

@stealthybox (Member Author)

The unfortunate bit is that Namespace FluxUsers have to be templated into their RoleBinding so that their namespace matches, which means you can't copy the exact same RoleBinding into every Namespace without some tool capable of doing that.

( RoleBindings have this built-in as a special case for ServiceAccounts )

Hierarchical Namespaces (https://kubernetes.io/blog/2020/08/14/introducing-hierarchical-namespaces/) can copy the same object down to all child Namespaces, but it's not available by default in just any cluster.

From a practical sense, whoever sets up the tenants will likely just extend a Kustomize base that has the RoleBinding baked in.
They just patch the GitRepository and Kustomization path to start a new tenant.
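
A hypothetical tenant overlay along those lines (repo URL, names, and base layout are assumed for illustration):

apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
namespace: tenant-b
resources:
  - ../../base   # base bakes in the GitRepository, Kustomization, and RoleBinding
patches:
  - target:
      kind: GitRepository
      name: tenant-source
    patch: |-
      apiVersion: source.toolkit.fluxcd.io/v1beta1
      kind: GitRepository
      metadata:
        name: tenant-source
      spec:
        url: https://github.com/example/tenant-b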

@simongottschlag left a comment

A few thoughts from me. I am starting to like the impersonation implementation.

Great write up! 😁👍

All usernames impersonated by kustomize-controller and helm-controller for managing resources will be Namespaced relative to the `Kustomization`/`HelmRelease` object.
This is similar to the mechanism Kubernetes uses for [ServiceAccount User names](https://kubernetes.io/docs/reference/access-authn-authz/rbac/#referring-to-subjects).
Kubernetes uses `system:serviceaccount:{{namespace}}:{{metadata.name}}` for ServiceAccounts.
The Flux username format should be `flux:user:{{namespace}}:{{spec.user}}`.


What use cases do we see for making the name configurable? I can't see anything other than making it possible to lower the impact of something by decision of the creator.

If tenant-A grants tenant-B access (edit role) to the whole namespace tenant-A for reconciler but then grants another role (pod-deleter) to the user pod-deleter, then it would be up to tenant-B to know and decide on which one to use.

If there isn't a strong enough case for these scenarios - maybe the user spec isn't needed and a default one is always used?

Alternatively, a ClusterRoleBinding on the reconciler's namespaced User name can grant access to all Namespaces.

#### serviceAccount impersonation
`spec.serviceAccountName` will remain supported, but impersonating Users is preferable because it reduces the attack surface within a Namespace.
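
For reference, this is the mechanism that exists today (Namespace and names are illustrative):

apiVersion: kustomize.toolkit.fluxcd.io/v1beta1
kind: Kustomization
metadata:
  name: tenant-app
  namespace: tenant-a
spec:
  interval: 5m
  path: ./deploy
  prune: true
  serviceAccountName: tenant-a-reconciler   # reconciles as this SA instead of an impersonated FluxUser
  sourceRef:
    kind: GitRepository
    name: tenant-source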


Could it be an option to remove it?

If the controller starts up without impersonation permissions, it can fall back to using the pod service account ("namespaced mode" or cluster admin, depending on how the SA permissions are configured).

@stefanprodan (Member) Mar 24, 2021

This is not an option IMO, many users are running Flux in production with this setting, not to mention that flux1 multi-tenancy was based on SAs; we'll be cutting off all these users, which I find unacceptable.


@stefanprodan @stealthybox a lot of the changes will already need a migration path - if this makes the product easier and more secure, wouldn't it be worth it? We won't be able to do it after the fact.

Do we see a reason for maintaining two ways of doing multi-tenancy? It seems like a lot of work keeping track of it just for compatibility.

Member

I think there should be a deprecation path, not immediate removal. Removing serviceAccountName immediately will make it difficult for existing users to migrate. One solution could be to add a deprecation notice to convince users to migrate over to users.

ServiceAccounts still have a Group that you can rolebind against: `system:serviceaccounts:{{namespace}}`.
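
For example, a single RoleBinding can grant read access to every ServiceAccount in one Namespace via that Group (the choice of the built-in view role is illustrative):

apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: tenant-a-serviceaccounts-view
  namespace: tenant-a
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: view
subjects:
  - kind: Group
    apiGroup: rbac.authorization.k8s.io
    name: system:serviceaccounts:tenant-a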

#### defaulting
Behavior must be clarified when both `user` and `serviceAccountName` are provided.


If we don't need different usernames per namespace and remove the ability to impersonate SAs but use the pod SA - then maybe this isn't even needed?

If `user` and `serviceAccountName` are unset/empty-strings, the default FluxUser (proposed: `reconciler`) is used.
Alternatively, we could use [CRD defaulting](https://kubernetes.io/docs/tasks/extend-kubernetes/custom-resources/custom-resource-definitions/#defaulting), but this puts other considerations on the design with detecting empty values.

#### kubeConfig


Would it be possible to do a kubeConfigTemplate the same way we have constraintTemplates with OPA-Gatekeeper?

Then maybe we could limit what is actually run inside the pod in a safer manner?

This allows for users to open up Source access policies via RBAC, rather than relying on a Gatekeeper or Kyverno installation to restrict default Flux behavior.
ClusterRoles should exist for the Source controller API's `flux-source-viewer`, `flux-gitrepo-viewer`, `flux-helmrepo-viewer`, `flux-bucket-viewer`, etc.
FluxUsers should be bound explicitly against these ClusterRoles, either with RoleBindings (for single Namespaces) or ClusterRoleBindings.
Cross-Namespace access of Sources can be enabled per Namespace or across the whole cluster via RoleBindings/ClusterRoleBindings to a FluxUser, Flux Group, or All FluxUsers.
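
One plausible shape for the first of those ClusterRoles (a sketch only; the proposal names the roles but does not spell out their rules):

apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: flux-source-viewer
rules:
  - apiGroups: ["source.toolkit.fluxcd.io"]
    resources: ["gitrepositories", "helmrepositories", "buckets"]
    verbs: ["get", "list", "watch"]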


Could there be a point to not have cross-namespace references at all and instead allow users to utilize tools like kubed: https://github.com/appscode/kubed

I still think that Cluster-scoped resources should be used instead of doing cross-namespace refs for namespaced resources: #497

Member

Not being able to share sources like HelmRepositories has serious implications to resource usage and cost. If users have to register e.g. Bitnami on each namespace, then source-controller will need many GB of memory, not to mention all the egress traffic people have to pay for to pull the same artifacts into each namespace.

Member

Is the resource cost because of the model, or because of the implementation? (or to ask another way: is it possible to rewrite the controllers in such a way that HelmRepositories are not duplicated in memory, without changing the visible behaviour?)

Member

@squaremo I don't think this is doable, we use the Helm getter package that loads the index into memory. We'll need some kind of locking between parallel reconciliations of different charts, and that seems like a controller-runtime anti-pattern; also, if you have multiple sources pointing at the same helm repo, each will run on its own interval. Maybe @hiddeco knows a way around this.

Member

I have been summoned.

The indexes are put into memory mostly when a HelmChart is reconciled. This loads the index into memory so the latest chart can be looked up based on the SemVer selector, and I see very little room to work around this in a sane way. I do however think that the resource consumption will not increase much, as there will always be at most 8 repository indexes in memory, given we have a worker limit of 4 per reconciler.

My personal concern around resource consumption is more with the in-memory actions we perform on Git repositories; people have started to put their manifests next to their applications, applications which in some cases include e.g. training models or other large files, causing the source-controller to be OOM-killed.

I still think that Cluster-scoped resources should be used instead of doing cross-namespace refs for namespaced resources

Have suggested this to Stefan before, and think this is indeed the way to deal with it given how Kubernetes wants you to work, but there were some concerns which I can not recall at the moment.

@stefanprodan (Member) Mar 24, 2021

Secrets are namespaced, so a cluster-scoped source will need a secretRef.namespace. I'm not against having all the source kinds duplicated as ClusterGitRepository/ClusterHelmRepository/ClusterBucket, but I don't see how it improves the multi-tenancy story: you would need a cluster role binding for the sources and other bindings to read the secrets from different namespaces; doesn't that make it worse?

Member

Progress on kubernetes/kubernetes#70147 would indeed have been nice.

However, I think that a single reference to a namespaced resource in a Cluster<something> resource is better than multi-mapping and/or duplicating everything across resources.


My opinion is that it's better to reference a namespaced resource from a cluster scoped resource instead of breaking how namespaced resources are supposed to work.

The resource consumption, if it becomes a problem, can be solved by creating a Cluster scoped resource (but then maybe up to the fleet admin to do it).

We want to make it easy to do the right / secure thing.

## Decisions
Direction should be decided on the following:

- [ ] `flux:user` vs. `system:flux:user` User prefix


flux:user

Direction should be decided on the following:

- [ ] `flux:user` vs. `system:flux:user` User prefix
- [ ] `flux:users` vs. `flux:user` Group prefix


flux:users


- [ ] `flux:user` vs. `system:flux:user` User prefix
- [ ] `flux:users` vs. `flux:user` Group prefix
- [ ] default FluxUser name (proposed: "reconciler")


reconciler or maybe system? Should have a status that writes out the full name.

Member

or flux:user:default

Member Author

My only concern with default is users conflating it with the default SA in the same NS.
Otherwise it's a pretty good name.

@maxgio92 May 15, 2021

Is this supposed to be the default impersonated user that the specific controller uses when reconciling? Or did I misunderstand something?
Thanks

#### notifications
notification-controller needs to be able to impersonate FluxUsers and ServiceAccounts.
`Receivers` specify `resources` which are Sources -- `Alerts` specify `eventSources` which can be Sources or Applies.
Both impersonation fields should also be added to `Receivers` and `Alerts` to allow for the same kind of access policies possible with sourceRefs.
Member

NIT: Alerts also reference providers.

@stealthybox (Member Author) Apr 16, 2021

providerRefs are in the same Namespace, so this seems okay to me?
There isn't any reason we couldn't also guard same-Namespace refs with the same FluxUser/SA though.
It's purely a user-experience/granular-permissions thing.

`Receivers` specify `resources` which are Sources -- `Alerts` specify `eventSources` which can be Sources or Applies.
Both impersonation fields should also be added to `Receivers` and `Alerts` to allow for the same kind of access policies possible with sourceRefs.

This will prevent any cross-tenant misuse of the notification APIs: notably denial-of-service through Receivers and information-disclosure through Alerts.
Member

Some issues with denial-of-service will be mitigated with the implementation of rate limiting, but this is still a good point. Another point is that NC should be protected by a network policy within the cluster, unless you expose the receiver through an ingress.
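
To make the proposed field concrete, a hypothetical Alert carrying an impersonation field might look like this (spec.user is assumed from the proposal and not an existing field; providerRef and eventSources are existing notification API fields):

apiVersion: notification.toolkit.fluxcd.io/v1beta1
kind: Alert
metadata:
  name: tenant-a-alerts
  namespace: tenant-a
spec:
  user: reconciler   # proposed FluxUser to impersonate when resolving eventSources
  providerRef:
    name: slack
  eventSources:
    - kind: GitRepository
      name: example-repo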

Some reconciler ServiceAccounts may need `flux-source-viewer` bound to them.
Others may prefer to update their serviceAccountNames to `user`.
Migration/Upgrade documentation will be necessary.
It's possible to build a tool-assisted migration that modifies resources in a git repo. (`flux migrate <git-repo-path>` ?)
Member

I am skeptical, but it needs to be researched whether this will even be possible with the Terraform provider.

@simongottschlag commented Apr 16, 2021

@stealthybox @stefanprodan @squaremo @phillebaba

Suggestion from yesterday:

Source Reflection

The name Reflection is a work in progress. Could be Copy or Reference, but using Reflection right now.

Custom Resource Definitions

Source

A new parameter (array) will be added called reflections. Maybe it will be required to have GitRepositoryReflection and HelmRepositoryReflection etcetera.

apiVersion: source.toolkit.fluxcd.io/v1beta1
kind: GitRepository
metadata:
  name: example-repo
  namespace: tenant-a-namespace
spec:
  interval: 1m
  url: https://github.com/example/repo
  ref:
    branch: main
  reflections:
  - kind: SourceReflection
    namespace: tenant-b-namespace

Source Reflection

SourceReflection will be created in the namespace by source-controller. It is not meant to be edited or updated by anyone else.

apiVersion: source.toolkit.fluxcd.io/v1beta1
kind: SourceReflection
metadata:
  name: tenant-a-namespace-example-repo
  namespace: tenant-b-namespace
spec:
  sourceName: example-repo
  sourceNamespace: tenant-a-namespace

Kustomization

New Kind is added as sourceRef.

apiVersion: kustomize.toolkit.fluxcd.io/v1beta1
kind: Kustomization
metadata:
  name: example-kustomization
  namespace: tenant-b-namespace
spec:
  sourceRef:
    kind: SourceReflection
    name: tenant-a-namespace-example-repo

Role Based Access Control

Cluster Role: View Source Reflection

More information about Aggregated ClusterRoles.

apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: flux-source-reflection-view
  labels:
    rbac.authorization.k8s.io/aggregate-to-view: "true"
    rbac.authorization.k8s.io/aggregate-to-edit: "true"
    rbac.authorization.k8s.io/aggregate-to-admin: "true"
rules:
- apiGroups: ["source.toolkit.fluxcd.io"]
  resources: ["sourcereflection"]
  verbs: ["get", "list", "watch"]

@stefanprodan (Member)

@simongottschlag how would source-controller create the SourceReflection objects in other namespaces? If we give it cluster access how would you prevent a tenant from creating reflections and "DDOS-ing" other tenants namespaces with bogus objects?

@simongottschlag

@simongottschlag how would source-controller create the SourceReflection objects in other namespaces? If we give it cluster access how would you prevent a tenant from creating reflections and "DDOS-ing" other tenants namespaces with bogus objects?

My thought was a ClusterRole and ClusterRoleBinding giving it access to all verbs in all namespaces. What's the difference in a security context of "DDOS-ing" your own namespace with 100s or 1000s of GitRepositories compared to creating references of them in other namespaces? I don't see the DDoS issue since it will affect the same Kubernetes API and the same source-controllers.

@stefanprodan (Member)

I don't see the DDoS issue since it will affect the same Kubernetes API and the same source-controllers.

So as a tenant, you don't see anything wrong with objects popping up in your namespace made by other tenants?

@simongottschlag

I don't see the DDoS issue since it will affect the same Kubernetes API and the same source-controllers.

So as a tenant, you don't see anything wrong with objects popping up in your namespace made by other tenants?

@stefanprodan I actually don't see a problem with it.

If the resource is explicitly used for this, then it would be expected that it's being populated from the outside. If a cluster has problems with tenants abusing something, then it's most likely not something we should have a technical solution for, but an organizational one.

If we think about security, and DDoS is the biggest issue we can come up with, then it should be fine there as well, since we don't introduce something new (or at least I think so): it's a shared control plane, and DoS of one's own namespace will cause just as many issues as DoS in another.

By using reflections, we also make what can be used visible, compared to the RBAC case. You will easily see what is being shared with you.

We could also consider using something like ClusterSourceReflection to share something to the whole cluster. Haven't thought that through completely yet though.

@stealthybox (Member Author) commented Apr 16, 2021

Just noting from our discussion yesterday that I had an idea of combining Source Access-Lists and the RBAC Impersonation solutions for Source Policy.

If source-controller has permission to create RoleBindings, it can convert the in-Source Namespace-Selector/Access-List into RoleBindings for the appropriate FluxGroups.
These RoleBindings could use a pre-created ClusterRole, and would be managed as sub-objects of the Source within the Source Namespace.

You get the benefits of both systems and more:

  • easy API embedding within Source obj
  • high-level dynamic behavior like label selectors
  • source-controller implements the security constraints with native k8s policy, kustomize/helm-controller don't have to implement source-specific accessLists client-side.
  • k8s audit log shows granular access with errors (no immediate burden to have access log in Source status/Events)
  • behavior is completely opt-in, deny-all by default
  • source-controller can by default RoleBind for allow-all-within-namespace -- platform admins can override via ClusterRole patch
  • platform-admins can still implement their own cluster-wide policies with RBAC RoleBindings/ClusterRoleBindings directly without interfering with the mechanism (can easily allow-all for the whole NS or cluster)
  • future improvements to RBAC improve Source Policy

I think it's the right approach to use RBAC as the primitive for access within the client controllers like helm/kustomize-controller.
If we agree the higher-level behavior is something we want, it can be implemented using the Source spec and some automated RBAC.

Similarly, default RoleBindings for Kustomizations and HelmReleases could really alleviate the UX and migration issues we're all worried about for this change while still giving platform-admins complete control to lock everything down.

@stefanprodan (Member) commented May 11, 2021

We could consider implementing an access list on the source side such as:

apiVersion: source.toolkit.fluxcd.io/v1beta1
kind: GitRepository
metadata:
  name: podinfo
  namespace: flux-system
spec:
  interval: 1m
  url: https://github.com/stefanprodan/podinfo
  accessFrom:
    - namespaceSelector:
        matchLabels:
          tenant: one
    - namespaceSelector:
        matchLabels:
          tenant: two

All Flux reconcilers such as kustomize-controller, helm-controller and image-automation-controller would need a RO cluster role to query the API for namespaces to enforce the policy on the client side.

Another option is for source-controller to generate RBAC and add role bindings for the Flux user in each namespace to grant access to specific sources by name.
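
A sketch of what such generated RBAC could look like for the podinfo example above (names illustrative; note that RBAC resourceNames can constrain get but not list/watch):

apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: podinfo-access
  namespace: flux-system
rules:
  - apiGroups: ["source.toolkit.fluxcd.io"]
    resources: ["gitrepositories"]
    resourceNames: ["podinfo"]
    verbs: ["get"]   # resourceNames cannot restrict list/watch
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: podinfo-access-tenant-one
  namespace: flux-system
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: podinfo-access
subjects:
  - kind: Group
    apiGroup: rbac.authorization.k8s.io
    name: flux:users:tenant-one   # proposed per-Namespace FluxGroup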

@squaremo (Member)

We could consider implementing an access list on the source side [...]

I like the approach of granting access in the source object itself. I think this would fit well with the impersonation scheme: the source-controller sorts out the RBAC according to the shared access as declared, and so long as controllers use impersonation, they will enforce sharing correctly.

@simongottschlag

We could consider implementing an access list on the source side such as:

If we have two teams in two namespaces with no access to each other and team1 makes their source available for team2 - team2 needs to know the name of this source to use it.

I think it would provide a better user experience with some kind of mechanism to provide information to team2 that they have this source available.

Any suggestions?

@hiddeco (Member) commented May 11, 2021

I think it would provide a better user experience with some kind of mechanism to provide information to team2 that they have this source available.

Out-of-cluster and team-to-team communication and/or documentation. Anything else would either result in advertising a duplicate object in the namespaces that match the selector(s) which results in a lot of overhead, or cluster-scoped objects.

While cluster-scoped objects still sound amazing to me, this creates the issue of secrets being namespace-scoped, as already mentioned in #582 (comment).

@stealthybox (Member Author) commented May 14, 2021

Since the Source specifies the subject list, the API could support adding plain Users (not just FluxUsers and FluxGroups).

Listing arbitrary users would allow you to add dev-team1@example-company.com to the Source access list.
This would grant read access to that User for this Source in the same spirit as the reconciler workloads.

When dev-team1@example-company.com does a flux get source git -A, they will see all of the GitRepos they have access to.
This is not the same as showing exactly what a reconciler has access to, but it's a decent way for an admin to expose that to a team's users.

An additive solution is to give dev-team1@example.com a ClusterRoleBinding to a ClusterRole that allows them to only impersonate their specific FluxGroup (flux:users:dev-team1) for administrative purposes.
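
A sketch of that grant, assuming the proposed flux:users:{{namespace}} Group naming (all other names illustrative):

apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: impersonate-dev-team1
rules:
  - apiGroups: [""]
    resources: ["groups"]
    resourceNames: ["flux:users:dev-team1"]
    verbs: ["impersonate"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: dev-team1-impersonators
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: impersonate-dev-team1
subjects:
  - kind: User
    apiGroup: rbac.authorization.k8s.io
    name: dev-team1@example.com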

This could mean giving the team not just read access to their Sources, but read-write access to everything the reconciler has access to!
This is both good and dangerous -- a great option for folks responding to emergencies.

If the platform-admin has good discipline (and we have good examples), they will actually split up the FluxUser RoleBindings into read-write + read-only, so it would be possible to give dev-team1@example.com read-only access.

We could build this into the proposal so that every FluxUser actually has a read-only component as well...
This could potentially be a useful way to add some defensive programming to the controllers, but it has the usability consequence of adding another Subject that folks need to consider when writing their RBAC.
It's probably unnecessary for the controllers, and good examples mean a platform-admin can do this separation themselves.

Unfortunately, I'm not aware of a way to impersonate a User but tell the APIServer to drop certain verbs.
That would make supporting the use-case of granting read-only visibility to others much easier.
Maybe that feature exists? If not, it's a great KEP topic.

@hiddeco hiddeco assigned hiddeco and unassigned hiddeco May 15, 2021
Comment on lines +50 to +53
Tenants or Namespace owners should be able to act freely without fear that their changes will overreach their permissions.

Flux APIs must structurally prevent privilege escalation and arbitrary code execution.
Flux APIs should be congruent with Kubernetes core API's security model.
Member

** Assuming the proposal is geared towards the Namespace-as-a-Service tenancy model. **

The challenge here from a security perspective is providing privilege-escalation and arbitrary-code-execution prevention guarantees to Flux's users.
The proposed changes focus on providing isolation and least privilege at the control-plane level only.
Privesc routes from Pod to Host would not be contained, and in the same way, lateral movement to other tenants would be open for abuse by a malicious tenant.

At the same time, trying to enforce those levels of isolation on behalf of Flux's users may become disruptive and break compatibility with existing workloads. That would entail becoming more opinionated around some security primitives (e.g. Pod Security Standards, workload scheduling, etc.), which feels beyond Flux's scope.
So I am not sure it is worth pursuing privilege escalation and arbitrary code execution prevention as a goal.

In my opinion the proposal could highlight as a goal making Flux optimised for multi-tenancy, focusing instead on a least-privileged security model at the control plane and better resource utilisation.

@stefanprodan (Member) Oct 30, 2021

Good point @pjbgf, I think we should limit the scope to “resources reconciliation” and be explicit about what this means. The model proposed here is about giving cluster admins a way to define policies that restrict what a tenant source (Git, Helm, Bucket, etc) can contain, and thus limiting the operations that tenants can perform on a cluster. Flux can ensure that a tenant can deploy applications in certain namespaces and only in those. Flux can't ensure that the applications deployed are themselves secure; this is in scope for network policies, PSPs, seccomp profiles, etc.

@stefanprodan (Member)

Closing this in favour of:

Starting with Flux 0.26 we allow platform admins to enforce tenant isolation as described in the multi-tenancy lockdown documentation.

Thanks @stealthybox for all your proposals and to everyone that contributed with ideas. Please let's continue the discussions started here on the RFC PRs.
