Using GCP service account scopes in self hosted runner #894

Closed
rodrigoalmeida94 opened this issue Feb 16, 2022 · 8 comments

@rodrigoalmeida94

I'm attempting to mount a GCS bucket in a self-hosted runner from CML and encountering multiple authentication problems with gcsfuse.

We are using this definition for our cml runner:

cml runner \
              --cloud=gcp \
              --cloud-region=us-west \
              --cloud-type=m+k80 \
              --labels=cml-gpu \
              --cloud-permission-set=cmldeploy@bp-padang.iam.gserviceaccount.com

We then mount buckets in our project using gcsfuse:

gcsfuse --debug_gcs --implicit-dirs data/

And this returns the following error:

2022/02/15 19:03:17.273266 Start gcsfuse/0.40.0 (Go version go1.17.6) for app "" using mount point: /__w/ml-project-seed/ml-project-seed/data
2022/02/15 19:03:17.287982 Opening GCS connection...
2022/02/15 19:03:17.291621 Mounting file system "gcsfuse"...
2022/02/15 19:03:17.293180 File system has been successfully mounted.
Here are the contents of the mounted path
$ cd data/bp-padang/cloudcover
/__w/_temp/0c805963-6ad9-44af-931d-9f971accd261.sh: 24: cd: can't cd to data/bp-padang/cloudcover

The service account cmldeploy@bp-padang.iam.gserviceaccount.com has been assigned Storage Admin and Compute Admin roles, so theoretically it should have access to the buckets.
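Because gcsfuse reports a successful mount even though the path later turns out to be inaccessible, a workflow step can check the mounted path explicitly before continuing. The following is a minimal sketch; `check_mounted_path` is a hypothetical helper name, not part of gcsfuse or CML.

```shell
#!/bin/sh
# Hypothetical helper: fail fast when a path under the mount point
# cannot be listed, instead of failing later on `cd`.
check_mounted_path() {
  path="$1"
  if [ -d "$path" ] && ls "$path" >/dev/null 2>&1; then
    echo "ok: $path is readable"
    return 0
  fi
  echo "error: $path is not readable; check access scopes and IAM roles" >&2
  return 1
}
```

In the workflow above this could be called as `check_mounted_path data/bp-padang/cloudcover || exit 1` right after the gcsfuse command.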

After much trial and error, we were able to set up an instance via Terraform and successfully mount the buckets with gcsfuse by using these settings:

resource "google_compute_instance" "jupyter" {
  ....
  service_account { scopes = ["storage-full", "cloud-platform"] }
  ...
}

It looks like access scopes matter when granting instances permissions on GCS resources: in our case the service account's IAM roles alone were not enough. It would be great if we could set scopes along with the other parameters in the cml runner command.

If you have any other experience mounting GCS buckets in CML-based runners, we would be happy to hear how you accomplished it without the access scopes. Any help would be really appreciated!
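One way to confirm which access scopes an instance actually received is to ask the Compute Engine metadata server from inside the VM. The sketch below assumes the runner uses the instance's default service account; `fetch_scopes` and `has_storage_scope` are hypothetical helper names. The scope URLs listed are the ones behind aliases such as `storage-full` and `cloud-platform`.

```shell
#!/bin/sh
# Print the access scopes granted to the instance's default service account.
# Only works from inside a Compute Engine VM.
fetch_scopes() {
  curl -sf -H "Metadata-Flavor: Google" \
    "http://metadata.google.internal/computeMetadata/v1/instance/service-accounts/default/scopes"
}

# Hypothetical helper: read scope URLs on stdin (one per line) and
# succeed if at least one of them grants access to Cloud Storage.
has_storage_scope() {
  grep -Eq '/auth/(cloud-platform|devstorage\.(read_only|read_write|full_control))$'
}
```

On the runner this could be used as `fetch_scopes | has_storage_scope || echo "no storage scope on this instance" >&2`.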

@dacbd
Contributor

dacbd commented Feb 16, 2022

@rodrigoalmeida94 this is helpful, as I have encountered a similar issue, and originally implemented the feature 🙈
I plan to take a deeper look into this soon and this gives me some ideas for a solution!

@rodrigoalmeida94
Author

Amazing @dacbd ! Do let me know if I can help in any way.

@dacbd
Contributor

dacbd commented Feb 16, 2022

> Amazing @dacbd ! Do let me know if I can help in any way.

@rodrigoalmeida94 if you are on their Discord server, you can DM me (dabarnes) there if you want to help test the fix (I'm almost done), so we can make sure it works for you as well. I can walk you through trying it out if you have Go installed.

@rodrigoalmeida94
Author

I see you've merged the branch @dacbd! Thanks a lot for that 🥳 - from your comment it seems like you have tested the fix already?
For me to be able to use this fix with the GitHub action, you'll need to make a release of the provider followed by a release of CML, correct?

@dacbd
Contributor

dacbd commented Feb 17, 2022

We will still need to wait for a release of the provider. I'm going to make sure cml parses the new arg format fine and probably update the CLI description, but no release/update of cml should be required.

@DavidGOrtega
Contributor

@dacbd @rodrigoalmeida94 this is released in v0.9.14

@dacbd
Contributor

dacbd commented Feb 17, 2022

Thanks @DavidGOrtega, and I can confirm that I had a whole pipeline work successfully with:

          cml-runner \
            ...
            --cloud=gcp \
            --cloud-permission-set=dvc-object-storage@xxx.iam.gserviceaccount.com,scopes=storage-rw \
            ...

And DVC on the instance automagically had correct permission to the remote bucket 🥳 🎈
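Applied to the runner definition from the original report, the new `scopes=` suffix might look like the sketch below. The flags and the service account come from that report and the scope alias from the example above. Whether `storage-rw` alone is enough for gcsfuse in that project is an assumption, since the Terraform setup that worked used `storage-full` and `cloud-platform`. The variable names and `launch_runner` are illustrative only.

```shell
#!/bin/sh
# Values taken from the original report; the scope alias follows the
# confirmed example above and is an assumption for this project.
SERVICE_ACCOUNT="cmldeploy@bp-padang.iam.gserviceaccount.com"
SCOPES="storage-rw"
PERMISSION_SET="${SERVICE_ACCOUNT},scopes=${SCOPES}"

# Wrapped in a function so nothing is provisioned until it is called.
launch_runner() {
  cml runner \
    --cloud=gcp \
    --cloud-region=us-west \
    --cloud-type=m+k80 \
    --labels=cml-gpu \
    --cloud-permission-set="$PERMISSION_SET"
}

# Show the composed flag value for review before launching.
echo "$PERMISSION_SET"
```

Calling `launch_runner` from the workflow step then starts the runner with the composed permission set.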

@rodrigoalmeida94
Author

I can also confirm I was able to run a workflow successfully when mounting a bucket using gcsfuse! Thanks so much everyone for the really fast turnaround. 🐎 🥳
