
Allow for a way to determine reliably if a resource is cluster scoped or namespaced #133

Open
reegnz opened this issue Oct 29, 2024 · 4 comments
Labels
community-feedback Asking the community what the best solution might be

Comments

reegnz commented Oct 29, 2024

What did you do?

I wanted to split cluster-scoped and namespaced resources into separate folders, as described in https://github.com/patrickdappollonio/kubectl-slice/blob/main/docs/why.md

What did you expect to see?

.
├── cluster/
│   ├── aggregate-metacontroller-edit-clusterrole.yaml
│   ├── aggregate-metacontroller-view-clusterrole.yaml
│   ├── compositecontrollers.metacontroller.k8s.io-customresourcedefinition.yaml
│   ├── controllerrevisions.metacontroller.k8s.io-customresourcedefinition.yaml
│   ├── decoratorcontrollers.metacontroller.k8s.io-customresourcedefinition.yaml
│   ├── metacontroller-clusterrolebinding.yaml
│   ├── metacontroller-namespace.yaml
│   └── metacontroller-serviceaccount.yaml
└── namespaces/
    └── metacontroller/
        ├── metacontroller-serviceaccount.yaml
        └── metacontroller-statefulset.yaml

What did you see instead?

By default, kubectl-slice doesn't follow the format explained in the why documentation; instead, we get a worse layout:

.
└── metacontroller/
    ├── clusterrole-aggregate-metacontroller-edit.yaml
    ├── clusterrole-aggregate-metacontroller-view.yaml
    ├── clusterrole-metacontroller.yaml
    ├── clusterrolebinding-metacontroller.yaml
    ├── customresourcedefinition-compositecontrollers.metacontroller.k8s.io.yaml
    ├── customresourcedefinition-controllerrevisions.metacontroller.k8s.io.yaml
    ├── customresourcedefinition-decoratorcontrollers.metacontroller.k8s.io.yaml
    ├── namespace-metacontroller.yaml
    ├── serviceaccount-metacontroller.yaml
    └── statefulset-metacontroller.yaml

I can't seem to find a way to determine whether a manifest is namespaced or not. Since .metadata.namespace is optional even on namespaced manifests, I can't simply rely on that field.

At my company I wrote a very similar tool in Python before discovering this one. In our internal tool I allow for an override config that declares which resources are namespaced and which are cluster-scoped, so the script can reliably file each resource into the right folder.
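
For illustration, here is a minimal Python sketch of that override-config idea. Everything in it is hypothetical: the config values, file names, and helper names are made up for this example and are not part of kubectl-slice or our internal tool.

import os
import yaml  # PyYAML

# Hypothetical override config: kinds the user declares cluster-scoped.
# Anything not listed here is treated as namespaced.
OVERRIDES = {
    "cluster_scoped": {
        "Namespace", "ClusterRole", "ClusterRoleBinding",
        "CustomResourceDefinition",
    },
    "default_namespace": "default",
}

def route(manifest):
    """Return the output path for one parsed manifest."""
    meta = manifest.get("metadata") or {}
    kind = manifest.get("kind", "Unknown")
    name = meta.get("name", "unnamed")
    if kind in OVERRIDES["cluster_scoped"]:
        return os.path.join("cluster", f"{kind}.{name}.yaml")
    # .metadata.namespace is optional on namespaced manifests, so fall
    # back to a configured default when the field is absent.
    namespace = meta.get("namespace") or OVERRIDES["default_namespace"]
    return os.path.join("namespaces", namespace, f"{kind}.{name}.yaml")

with open("manifests.yaml") as f:  # any multi-document YAML input
    for doc in yaml.safe_load_all(f):
        if doc:
            print(route(doc))

The point is that scope is never guessed from the manifest itself: the explicit cluster_scoped set in the config is the single source of truth.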

reegnz (Author) commented Oct 29, 2024

My solution lays things out like this, which is closer to the reasoning in the why document:

.
├── cluster
│   ├── ClusterRole.external-dns-external-dns.yaml
│   └── ClusterRoleBinding.external-dns-external-dns.yaml
└── namespaces
    └── external-dns
        ├── Deployment.external-dns.yaml
        ├── Service.external-dns.yaml
        └── ServiceAccount.external-dns.yaml

I think I could contribute behaviour to the tool that is aware of the built-in Kubernetes resource scoping (namespaced vs. cluster-scoped), plus a way to configure kubectl-slice so you can reliably declare which APIs are namespace-scoped and which are cluster-scoped.

patrickdappollonio (Owner) commented

Hey @reegnz!

Reliably detecting whether a resource is cluster-scoped or namespaced is actually quite, quite hard. Even Helm hasn't managed to get this right, but you're more than welcome to try!

The long story short is that resource scoping can change depending on the cluster where the resources are installed, and there's no concept of exclusivity in the Kubernetes world (no hardcoded way to say that foo.internal resources MUST be namespaced, for example). A bunch of companies could each create a CRD called Ingress under the group corporate.internal, and in some of those companies the resource might be namespaced while in others it wouldn't be.

And yes, you might be thinking that the somewhat standard Kubernetes resources could be "whitelisted" so that detection happens correctly for those, but even then, it's not a 100% guarantee, because cluster operators can change how these resources behave in their own environments.

At the end of the day, the only somewhat accurate option is to use the Kubernetes Discovery API to ask a given cluster for its resources and use that information to drive the decision-making. And while that's perfectly possible, that's where two worlds kind of overlap, and I'm not sure we can make that choice deliberately.
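
For completeness, here is a minimal sketch of what that discovery-driven approach could look like using the official Kubernetes Python client. It assumes you can supply credentials for the target cluster (exactly the operational cost discussed below), and it only queries core v1 resources; every other API group/version would need the same discovery call.

from kubernetes import client, config  # pip install kubernetes

config.load_kube_config()  # requires a kubeconfig for the target cluster

scope_by_kind = {}
for res in client.CoreV1Api().get_api_resources().resources:
    if "/" not in res.name:  # skip subresources such as pods/log
        scope_by_kind[res.kind] = "namespaced" if res.namespaced else "cluster-scoped"

for kind, scope in sorted(scope_by_kind.items()):
    print(f"{kind}: {scope}")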

Let me put it in an example: if you look at the current usage of kubectl-slice across the GitHub world, you'll notice the app is part of pipelines, and more often than not, CI/CD pipelines.

In those pipelines, "sorting manifests" comes way before applying them. Even more so, some of those pipelines use a different tool to apply the manifests to the cluster (think GitOps, à la ArgoCD or Flux).

If we were to introduce a way to detect namespace-scoped resources, we would need the kubeconfig of the target cluster whose data you want to extract (say, a dev environment). Now we're asking everyone using this in a pipeline to also do some lifecycle management: provide secrets, perhaps an RBAC service account or what-have-you in your cloud provider, and, for clusters behind a private network, a way to access or query them... That's quite the ask!

Then think about the fact that you'd be targeting one cluster. Most companies I've worked with, and the setups I can see on public GitHub, use multiple clusters that might or might not look the same. They could differ exactly as in the example I gave before, where one cluster has Ingress.corporate.internal as namespaced while another has it cluster-wide, which leaves us back at square one!

The other option I've explored is providing kubectl-slice with a body of knowledge you organize yourself: think of an additional configuration file that maps kinds to namespaced or cluster-wide scope. But that seems like a lot of code organization and somewhat time-consuming. In the prior example, where the resource is namespaced in one cluster but not in the other, you would just need a different body of knowledge depending on the target cluster...

All in all, I feel like at that point the feature would, unfortunately, go largely unused.

The example I provided in the why.md file is just there to explain the rationale behind this project. Several companies have used the current features to slice manifests in separate steps (say, exporting CRDs first, then exporting the rest of the resources) thanks to flags like --include-kind or --include-name. For example, using your layout above, you could make two or more calls:

kubectl-slice --include-kind clusterrole,clusterrolebinding -o cluster/
kubectl-slice --exclude-kind clusterrole,clusterrolebinding -o resources/
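
An illustrative expansion for the metacontroller output shown earlier might look like this (the input file name is hypothetical, and the kind list has to match whichever cluster-scoped kinds your manifests actually contain):

kubectl-slice -f manifests.yaml --include-kind namespace,customresourcedefinition,clusterrole,clusterrolebinding -o cluster/
kubectl-slice -f manifests.yaml --exclude-kind namespace,customresourcedefinition,clusterrole,clusterrolebinding -o namespaces/metacontroller/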

All this to say: it sounds like an easy problem, but believe me, we've bounced the idea among a handful of users (corporate and friends) and we never seem to agree on a solution.

You do seem to have an idea, though, and I would be more than happy to review it! My main concern is that a change requiring a kubeconfig, for example, would be a major, major change that could threaten the stability of the project, and we don't have data on how much that feature would actually be used. Based on what I can see in public and in private (thanks to corporate users who reach out directly), I have yet to find a major userbase that would need a feature like this. But hey, that might be you! 😄

Let me know how you want to proceed!

patrickdappollonio added the community-feedback label Nov 1, 2024
patrickdappollonio (Owner) commented

As a side note: part of the reason Kubernetes' "server-side apply" exists is these very mismatches of "what's true". Its KEP has some good insight about it, but it's a bit of a lengthy read.

Also, these threads might shed some light on the nuances of "namespaced resources" (links in no particular order):

patrickdappollonio (Owner) commented Nov 1, 2024

One last note, from a friend who works at a Fortune 500 company that actively uses this project: the inability of kubectl-slice to interact with a Kubernetes cluster isn't a bug but a feature. If we were to enable that option and people started using it, major companies would need several extra steps to get the security and compliance approvals needed to continue upgrading this tool.

Today they can get away with it because kubectl-slice is just a templating tool. That's something worth keeping in mind.

I'm still not against adding the feature; it would just have to be gated well enough that the pitch to DevSecOps folks or plain security teams makes it an easy sell, too.
