
Reload service account keyfile periodically #205

Open
kevincvlam opened this issue Sep 6, 2018 · 20 comments
Assignees
Labels
priority: p2 Moderately-important priority. Fix may not be included in next release. type: feature request ‘Nice-to-have’ improvement, new feature or different behavior or design.

Comments

@kevincvlam

kevincvlam commented Sep 6, 2018

Hi,

We run the CloudSQL proxy in our kubernetes cluster as a deployment and sometimes we rotate the secret that is used to provide the credentials file for IAM authentication.

As a result, the credentials loaded at proxy start-up become invalid and the proxy begins printing invalid-credentials errors, but it does not exit. What's the recommended way to handle this situation? Is there a way to have the proxy reload the credentials?

My understanding is that mounted secrets are updated automatically, so it's up to the application to respond accordingly:

Mounted Secrets are updated automatically. When a secret already being consumed in a volume is updated, projected keys are eventually updated as well. The update time depends on the kubelet syncing period.
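For reference, kubelet only propagates secret updates for whole-volume mounts; a secret mounted via subPath is not updated. A minimal sketch of the sidecar setup described above, with hypothetical secret name, instance string, and paths:

```yaml
# Sketch only: secret name, instance string, and paths are illustrative.
spec:
  containers:
    - name: cloudsql-proxy
      image: gcr.io/cloudsql-docker/gce-proxy
      command:
        - /cloud_sql_proxy
        - -instances=my-project:my-region:my-instance=tcp:5432
        - -credential_file=/secrets/cloudsql/credentials.json
      volumeMounts:
        - name: cloudsql-creds
          mountPath: /secrets/cloudsql   # whole-volume mount, so kubelet syncs updates
          readOnly: true
  volumes:
    - name: cloudsql-creds
      secret:
        secretName: cloudsql-credentials
```

With this layout the file under /secrets/cloudsql is eventually refreshed after the secret is rotated; the open question in this issue is that the proxy does not re-read it.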
@kurtisvg kurtisvg added priority: p2 Moderately-important priority. Fix may not be included in next release. Status: Proposal type: bug Error or flaw in code with unintended results or allowing sub-optimal usage patterns. labels Sep 6, 2018
@kurtisvg
Contributor

kurtisvg commented Sep 6, 2018

Hey @kevincvlam, thanks for bringing this issue to our attention.

We discussed this issue this morning, and decided that currently the only way to reload the credentials would be to restart the container with the proxy inside. This is obviously not an ideal solution, so we are investigating ways we could handle this. Currently we are looking into the following:

  1. Reload credentials hourly during SSL cert refresh
  2. Attempt credentials reload upon receiving an invalid credentials error
  3. Potentially exit with error code if failing to retrieve valid credentials after X minutes

We'll use this issue to post updates on our progress.

@kevincvlam
Author

Hey @kurtisvg, thanks for the quick reply, and looking forward to your solution!

Do you have any idea regarding when you expect the issue to be resolved?

@kurtisvg
Contributor

Unfortunately, I don't have any promises to make at the moment, just that it's in the queue and the team will get to it when we can. If you have any expertise in this area, we are open to contributions.

@markvincze

Hey folks,

This is affecting us as well. Our setup stores the service account key in a Kubernetes secret, which is mounted into the Cloud SQL Proxy sidecar. We rotate the service account key every day and replace the content of the secret.

As far as I understand, when we change the content of the secret, the change is automatically propagated to the mounted file seen by the running proxy container (probably the same setup @kevincvlam described?).

So if the mounted key file changes, it's not picked up by the running proxy, right?
Do you have any update on when this improvement can be expected?

Thanks!

@JorritSalverda

In our Go applications we handle reloading of the key by re-initialising the service that uses the service account key, with code like this:

```go
// Build the DNS service from the current key file.
dnsService := NewGoogleCloudDNSService(*googleCloudDNSProject, *googleCloudDNSZone)

// Re-create it whenever the key file on disk changes.
foundation.WatchForFileChanges(os.Getenv("GOOGLE_APPLICATION_CREDENTIALS"), func(event fsnotify.Event) {
	log.Info().Msg("Key file changed, reinitializing dns service...")
	dnsService = NewGoogleCloudDNSService(*googleCloudDNSProject, *googleCloudDNSZone)
})
```

See https://github.com/estafette/estafette-google-cloud-dns/blob/09eaf7f4123b6c4a012837f2415893219456d137/main.go#L81-L84 and https://github.com/estafette/estafette-foundation/blob/master/foundation.go#L104-L161 for implementation details.

Works like a charm and relies on the github.com/fsnotify/fsnotify library, which doesn't bring in too many dependencies.

@dhduvall

dhduvall commented Nov 8, 2019

I made some changes that address the failures I've been seeing. It's not comprehensive, and it's pretty hacktastic, but it has survived a day of Vault rotating the service account keys underneath it, with the new keys mounted into the k8s container. It can recover at three specific points: when the credential file is missing or corrupt at startup; at first connection; and on failure to rotate the ephemeral cert. I'm sure there are many other places it could fail, but those are the ones I've been running into.

I'm not going to submit a PR in this state, but I figured if anyone else had a need for this, they could take what I have. If it's within shouting distance of being acceptable, though, I can try to polish it up a bit.

@kurtisvg kurtisvg added type: feature request ‘Nice-to-have’ improvement, new feature or different behavior or design. and removed type: bug Error or flaw in code with unintended results or allowing sub-optimal usage patterns. labels Apr 8, 2020
@kurtisvg kurtisvg changed the title from "CloudSQL Proxy Doesn't Update Credentials Upon Rotation Of Secrets" to "Reload service account keyfile periodically" May 14, 2020
@gw0

gw0 commented Jun 24, 2021

The missing ability to reload the service account keyfile is still an open issue. The only workaround is described in #770, which is basically:

  1. update the keyfile
  2. stop with kill -s SIGTERM "$PPID";
  3. start again with /cloud_sql_proxy ...

@enocom
Member

enocom commented Nov 17, 2022

Related to #1045.

@enocom enocom added priority: p1 Important issue which blocks shipping the next release. Will be fixed prior to next release. and removed priority: p2 Moderately-important priority. Fix may not be included in next release. labels Nov 17, 2022
@enocom
Member

enocom commented Nov 17, 2022

Bumping the priority given the interest here.

@gdafl

gdafl commented Nov 23, 2022

Hi all,

This is also affecting us.

I am running cloud_sql_proxy in a sidecar container in a number of our pods.

As soon as I update our secret, cloud_sql_proxy starts failing because it is still using the old secret that it has in memory.

We cannot resort to SIGHUPping the process, as the image is prebuilt and controlled by my organisation (I cannot modify it); in any case, this is a workaround rather than a solution.

At the moment I have resorted to deleting all active pods after a key renewal (luckily, we only have to do it once a month), but this is obviously a worse workaround than SIGHUP.

Could a more appropriate solution be provided please?

Many thanks!

@UnsignedLong

Hi,

If you are running your workload within GKE, you should evaluate Workload Identity, as this is the recommended way; with it you don't have to mess around with JSON keys at all.
Nevertheless, this issue is still relevant for workloads running outside the Google ecosystem!

@enocom
Member

enocom commented Nov 23, 2022

Workload identity does sidestep these problems and is the best solution if you're running in GKE.

Otherwise, we're probably looking at some kind of watcher implementation based on fsnotify. Perhaps this is also something people should have to opt in to with a CLI flag.

@gdafl

gdafl commented Nov 29, 2022

Hi,

If you are running your workload within GKE, you should evaluate Workload Identity, as this is the recommended way; with it you don't have to mess around with JSON keys at all. Nevertheless, this issue is still relevant for workloads running outside the Google ecosystem!

Just a quick update, I switched to Workload Identities for our GKE cloud-sql-proxy sidecars and it's working perfectly.

A solution to this issue would still be useful for non-GKE based deployments though.

Many thanks again for the suggestion!

@enocom
Member

enocom commented Feb 1, 2023

It would be helpful to know how many people want this outside of GKE.

If you're running in GKE, then we strongly recommend using workload identity. Otherwise, this might be useful, but again if the ask here is mostly from GKE workloads, then it's probably not a big priority.

@gdafl

gdafl commented Feb 1, 2023

It would be helpful to know how many people want this outside of GKE.

If you're running in GKE, then we strongly recommend using workload identity. Otherwise, this might be useful, but again if the ask here is mostly from GKE workloads, then it's probably not a big priority.

Personally, I switched to workload identities as soon as it was suggested which made this issue moot.

I do still think it's a good feature to add though, to align what cloudsql-proxy does with what GKE does when a secret is updated.

Thanks!

@enocom
Member

enocom commented Aug 15, 2023

Given the prevalence of workload identity, we're going to hold off on this feature. If there's interest in the future, please re-open with why it's useful.

@enocom enocom closed this as completed Aug 15, 2023
@UnsignedLong

UnsignedLong commented Aug 16, 2023

I have on-premises workloads accessing Cloud SQL. As Workload Identity is unavailable in my (and other) environments, I still see a huge benefit in this feature.

@enocom
Member

enocom commented Aug 16, 2023

Re-opening in that case. What are you using to refresh your credentials file?

@enocom enocom assigned ttosta-google and unassigned enocom Aug 23, 2023
@enocom enocom assigned enocom and unassigned ttosta-google Feb 12, 2024
@enocom enocom added priority: p2 Moderately-important priority. Fix may not be included in next release. and removed priority: p1 Important issue which blocks shipping the next release. Will be fixed prior to next release. labels Feb 12, 2024
@enocom enocom assigned jackwotherspoon and unassigned enocom May 1, 2024
@micahjsmith

This is still an issue for me. We run the proxy on local developer machines, and devs refresh ADC using gcloud auth login --update-adc periodically (say, every 16 hours). A running proxy process does not pick up the refreshed ADC and must be restarted, which impacts the ability to run long-running scripts from local machines.

@jackwotherspoon
Collaborator

@micahjsmith This is still on our to-do list but is low priority for us at the moment. With this comment we will definitely bump it up a bit in our backlog.

Let me try to understand your use case a bit better. How come you are running the refresh command every 16 hours? Is it to switch the IAM user/service account the Proxy runs as (i.e., first starting the Proxy as user@test.com and then running gcloud auth login for user2@test.com)?

If you could provide the reason for refreshing your ADC maybe I can see another option for your case.
