Allow to override service account name for resources #4716

Closed · morozov opened this issue Apr 7, 2021 · 12 comments

morozov commented Apr 7, 2021

Is your feature request related to a problem? Please describe.
There are several use cases where I'd like to use the same service account name for multiple Kafka Connect clusters:

  1. A set of clusters where each cluster runs a single resource-heavy connector. Multiple clusters are needed to isolate the heavy connectors from each other: if one crashes its worker due to an OOM, the others keep running. All clusters have the same permissions, so maintaining a set of service accounts is burdensome.
  2. There is a cluster built from a stable Docker image running in the development environment. As a developer, I want to deploy my own cluster and experiment. In terms of permissions, it would be the same cluster as the stable one, so I'd like to be able to use the same service account as the stable cluster does.

Describe the solution you'd like
Provide a KafkaConnect resource property that allows explicitly defining the service account name. The default would remain the deployment name.
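
A rough sketch of what this could look like (the serviceAccountName property below is hypothetical and not part of the current API):

```yaml
apiVersion: kafka.strimzi.io/v1beta2
kind: KafkaConnect
metadata:
  name: heavy-connector-1
spec:
  replicas: 1
  bootstrapServers: my-cluster-kafka-bootstrap:9092
  # Hypothetical property illustrating this request; does not exist today:
  serviceAccountName: shared-connect-sa
```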

Describe alternatives you've considered
Duplicating service accounts.

Additional context
The additional burden comes from the fact that service accounts duplicated in Kubernetes need to be reflected in other subsystems, e.g. HashiCorp Vault.

morozov (Author) commented Apr 26, 2021

@scholzj does the above make sense from the Strimzi design standpoint? It shouldn't be awfully hard to implement but first I'd like to make sure that the change would be welcomed.

scholzj (Member) commented Apr 26, 2021

TBH, I'm personally not that eager to have this done. If we let everyone customize everything, it will be a road to hell and make the project unmaintainable. I would prefer to keep driving the customisations through the template section of the custom resource. That does not really support service accounts right now, but that could be added.
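
For illustration, the template section today can customize metadata on the generated resources, e.g. the pods (a minimal sketch; label and annotation values are examples) - just not the service account:

```yaml
apiVersion: kafka.strimzi.io/v1beta2
kind: KafkaConnect
metadata:
  name: my-connect
spec:
  # ...
  template:
    pod:
      metadata:
        labels:
          environment: dev
        annotations:
          example.com/owner: data-team
```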

I'm also not sure I understand the idea behind this. Do you run many Connect clusters which are resource-heavy and consume a lot of Kubernetes resources? Or why does the service account matter?

morozov (Author) commented Apr 26, 2021

I would prefer to keep driving the customisations through the template section of the custom resource. That does not really support service accounts right now, but that could be added.

As long as it allows customizing the service account, it's fine. I don't have a specific API preference.

Do you run many Connect clusters which are resource-heavy and consume a lot of Kubernetes resources?

Yes. Logically, it's a single large cluster (hence, the need for the same service account). But physically, each connector runs in its own single-worker cluster with tweaked resource requests/limits.

scholzj (Member) commented Apr 26, 2021

As long as it allows customizing the service account, it's fine. I don't have a specific API preference.

Well, that depends on what customizations you need. It would let you set labels and/or annotations on the service account, not change its name.

Yes. Logically, it's a single large cluster (hence, the need for the same service account). But physically, each connector runs in its own single-worker cluster with tweaked resource requests/limits.

Ok, but what are the connectors doing? Why do they need some special RBAC rights?

morozov (Author) commented Apr 26, 2021

Not change its name.

I see. I'm specifically interested in specifying the name.

Ok, but what are the connectors doing? Why do they need some special RBAC rights?

Those are Debezium MySQL connectors. Each connects to its own MySQL cluster, and they all use a Kafka Connect Vault config provider to get MySQL credentials. Vault integrates with Kubernetes, so access to a given path is granted on a per-service-account basis. So each time we add a new cluster/connector, we need to update the Vault configuration to grant the new connector's service account access. This is what I'm trying to avoid by using the same SA for all.
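
For context, a rough sketch of our setup (the provider class and placeholder syntax follow the generic Kafka config-provider pattern and are illustrative, not a specific Vault provider's exact API):

```yaml
apiVersion: kafka.strimzi.io/v1beta2
kind: KafkaConnect
metadata:
  name: connect-mysql-a
spec:
  config:
    # Register a Vault-backed config provider on the workers (class name is hypothetical):
    config.providers: vault
    config.providers.vault.class: example.vault.VaultConfigProvider
---
apiVersion: kafka.strimzi.io/v1beta2
kind: KafkaConnector
metadata:
  name: mysql-a
  labels:
    strimzi.io/cluster: connect-mysql-a
spec:
  class: io.debezium.connector.mysql.MySqlConnector
  tasksMax: 1
  config:
    # The provider resolves this placeholder against Vault at runtime,
    # authenticating with the worker pod's service account token:
    database.password: "${vault:secret/data/mysql-a:password}"
```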

scholzj (Member) commented Apr 26, 2021

That is a bit of a weird hack around the service account. So how does it get the Vault credentials out of the SA? Does it use the token to authenticate against Vault? Can you simply use a shared secret for this instead of a shared SA?

morozov (Author) commented Apr 26, 2021

So how does it get the Vault credentials out of the SA? Does it use the token to authenticate against Vault?

Yes (documentation).

Can you simply use a shared secret for this instead of a shared SA?

In this case, probably yes. But we also have less busy environments where we run multiple connectors in one Connect cluster.

The process of provisioning new connectors is automated. If secrets were implemented as regular Kubernetes secrets and mounted to the Connect workers via the FileConfigProvider, then adding a new secret would require a cluster restart (because the file is mounted to the worker, not to the connector). If one of the other connectors in this cluster performs a database snapshot at that time, the restart will break the snapshot. From this standpoint, Vault secrets with their own provider are easier to manage than Kubernetes secrets with a FileConfigProvider.
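
To make the comparison concrete, the Kubernetes-secret alternative we ruled out would look roughly like this (a sketch; secret and volume names are examples) - note that growing the volumes list is exactly what rolls all the workers:

```yaml
apiVersion: kafka.strimzi.io/v1beta2
kind: KafkaConnect
metadata:
  name: connect-shared
spec:
  config:
    config.providers: file
    config.providers.file.class: org.apache.kafka.common.config.provider.FileConfigProvider
  externalConfiguration:
    volumes:
      # Each new MySQL cluster adds a volume here, which restarts every worker:
      - name: mysql-a-credentials
        secret:
          secretName: mysql-a-credentials
# A connector config would then reference, e.g.:
#   database.password: ${file:/opt/kafka/external-configuration/mysql-a-credentials/credentials.properties:password}
```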

scholzj (Member) commented Apr 27, 2021

In this case, probably yes. But we also have less busy environments where we run multiple connectors in one Connect cluster.

The process of provisioning new connectors is automated. If secrets were implemented as regular Kubernetes secrets and mounted to the Connect workers via the FileConfigProvider, then adding a new secret would require a cluster restart (because the file is mounted to the worker, not to the connector). If one of the other connectors in this cluster performs a database snapshot at that time, the restart will break the snapshot. From this standpoint, Vault secrets with their own provider are easier to manage than Kubernetes secrets with a FileConfigProvider.

Why would you need multiple secrets? If a single service account is good enough for you to cover all connectors, a single secret should also be good enough to cover all connectors, or? At the end, maybe you can just mount the token secret of the single service account you have and use it, or?

morozov (Author) commented Apr 27, 2021

Why would you need multiple secrets?

Each connector captures data from its own MySQL cluster, which has a user for the connector. There's one secret per MySQL cluster.

If a single service account is good enough for you to cover all connectors, a single secret should also be good enough to cover all connectors, or?

That would require synchronization of the connector user credentials between the MySQL clusters. Currently there's no way or intent to do that.

At the end, maybe you can just mount the token secret of the single service account you have and use it, or?

As far as I understand, currently, there's no way to mount a volume to the worker (#3693). How would one do that?

If it were possible, we could look at using something other than a Kubernetes service account for authorization at Vault (e.g. something generic such as an OAuth client ID and secret).

scholzj (Member) commented Apr 27, 2021

As far as I understand, currently, there's no way to mount a volume to the worker (#3693). How would one do that?

If it were possible, we could look at using something other than a Kubernetes service account for authorization at Vault (e.g. something generic such as an OAuth client ID and secret).

Sorry if it wasn't clear ... but that was my idea from the beginning: mount the secret which gets you connected to your Vault and pull the configs from there. If you want to use one service account for that, then one secret with the Vault credentials should be enough as well.

#3693 talks about mounting persistent volumes, which is indeed not supported. But you would want to keep this in a secret (which in the end is the same thing the service account uses) - and for that you can use this: https://strimzi.io/docs/operators/latest/full/using.html#type-ExternalConfiguration-reference
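
A minimal sketch of that, assuming your Vault credentials sit in a secret named vault-credentials (names are examples):

```yaml
apiVersion: kafka.strimzi.io/v1beta2
kind: KafkaConnect
metadata:
  name: my-connect
spec:
  # ...
  externalConfiguration:
    env:
      # Exposed to every worker as an environment variable:
      - name: VAULT_TOKEN
        valueFrom:
          secretKeyRef:
            name: vault-credentials
            key: token
```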

morozov (Author) commented Apr 27, 2021

Sorry if it wasn't clear ... but that was my idea from the beginning: mount the secret which gets you connected to your Vault and pull the configs from there. If you want to use one service account for that, then one secret with the Vault credentials should be enough as well.

Yeah, I rephrased your idea in my own words. Thanks for the suggestion!

#3693 talks about mounting persistent volumes, which is indeed not supported.

Yeah, I was thinking that since the service account token is mounted to the pod as a volume (/var/run/secrets/kubernetes.io/serviceaccount/token, ref), we'd need to mount the one you're proposing as a volume as well, but it could be just a set of environment variables.

[…] you would want to keep this in a secret (which in the end is the same thing the service account uses) - and for that you can use this: https://strimzi.io/docs/operators/latest/full/using.html#type-ExternalConfiguration-reference

I think that should work. We chose the approach of using Kubernetes service accounts for authentication at Vault because it was the obvious option (supported by Vault out of the box). But it doesn't have to be this way. Now I have some homework to do.

morozov (Author) commented Jul 8, 2021

With the introduction of strimzi/kafka-kubernetes-config-provider, I believe the issue is no longer relevant. It should be easier for us to migrate from Vault to Kubernetes secrets and grant each Kafka Connect cluster's service account access to a given secret once it's deployed.
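
For reference, a sketch of what that migration could look like (namespace, secret, and role names are examples; the provider class is the one from the project's README):

```yaml
apiVersion: kafka.strimzi.io/v1beta2
kind: KafkaConnect
metadata:
  name: connect-mysql-a
spec:
  config:
    config.providers: secrets
    config.providers.secrets.class: io.strimzi.kafka.KubernetesSecretConfigProvider
---
# Grant only this cluster's service account access to its secret; a connector
# then references "${secrets:my-namespace/mysql-a-credentials:password}".
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: read-mysql-a-credentials
  namespace: my-namespace
rules:
  - apiGroups: [""]
    resources: ["secrets"]
    resourceNames: ["mysql-a-credentials"]
    verbs: ["get"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: connect-mysql-a-reads-credentials
  namespace: my-namespace
subjects:
  - kind: ServiceAccount
    name: connect-mysql-a-connect  # Strimzi names the SA <cluster-name>-connect
    namespace: my-namespace
roleRef:
  kind: Role
  name: read-mysql-a-credentials
  apiGroup: rbac.authorization.k8s.io
```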

Thank you for the discussion and the new configuration provider, @scholzj!

morozov closed this as completed Jul 8, 2021