tests: GCP #680

Closed
3 of 4 tasks
casperdcl opened this issue Jul 28, 2021 · 6 comments · Fixed by #997
Labels
blocked Dependent on something else cml-runner Subcommand documentation Markdown files p1-important High priority testing Unit tests & debugging

Comments

@casperdcl
Contributor

casperdcl commented Jul 28, 2021

Due to iterative/terraform-provider-iterative#156, GCP should be supported.

@casperdcl
Contributor Author

casperdcl commented Aug 17, 2021

@casperdcl casperdcl self-assigned this Aug 17, 2021
@casperdcl casperdcl added the p0-critical Max priority (ASAP) label Aug 17, 2021
@0x2b3bfa0
Member

0x2b3bfa0 commented Aug 17, 2021

Stub, from Notion

Prerequisites

  1. Create a new Google Cloud project (official documentation)
  2. Create a new service account for the newly created project (official documentation)
  3. Create a new service account key for the newly created service account (official documentation)
  4. Store the contents of the downloaded JSON key as a GitHub repository secret named GOOGLE_APPLICATION_CREDENTIALS_DATA
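
Before pasting the key into the secret in step 4, it can help to check that the downloaded file is what the runner will expect. A minimal sketch, assuming the usual layout of a service account key file; the field list and the `check_service_account_key` helper are illustrative assumptions, not part of CML:

```python
import json
from typing import List

# Assumption: fields normally present in a downloaded service account key file.
REQUIRED_FIELDS = ("type", "project_id", "private_key", "client_email")


def check_service_account_key(raw: str) -> List[str]:
    """Return the problems found in the key text; an empty list means it looks usable."""
    try:
        key = json.loads(raw)
    except json.JSONDecodeError as error:
        return [f"not valid JSON: {error.msg}"]
    if not isinstance(key, dict):
        return ["top-level value is not a JSON object"]
    # Report every missing field instead of stopping at the first one.
    problems = [f"missing field: {name}" for name in REQUIRED_FIELDS if name not in key]
    # A present but different "type" suggests a user credential, not a service account.
    if key.get("type") not in (None, "service_account"):
        problems.append(f"unexpected type: {key['type']}")
    return problems
```

Running it on the file contents before storing them as GOOGLE_APPLICATION_CREDENTIALS_DATA catches a truncated paste or the wrong credential type early, before a workflow run fails on it.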

My failed blog post has some extra guidance for GitHub and GitLab and best practices for secret handling in CI/CD environments:

GitLab

  1. Add these two masked variables to your project:

    🔒 You can also store these values as external secrets instead of variables if your server is configured to support this feature

GitHub

  1. Add these two secrets to your repository:
    • REPO_TOKEN with a Personal Access Token with enough permissions to register the self-hosted runner and publish a comment with the results
    • ···

Usage

on: workflow_dispatch
jobs:
  deploy:
    # Launches the cloud instance and registers it as a self-hosted runner.
    runs-on: ubuntu-latest
    steps:
      - uses: iterative/setup-cml@v1
      - run: >-
          cml-runner
          --cloud=gcp
          --cloud-region=us-west1-b
          --cloud-type=custom-8-65536-ext
          --cloud-gpu=v100
        env:
          REPO_TOKEN: ${{ secrets.PERSONAL_ACCESS_TOKEN }}
          GOOGLE_APPLICATION_CREDENTIALS_DATA: ${{ secrets.GOOGLE_APPLICATION_CREDENTIALS_DATA }}
  test:
    # Runs on the GPU instance registered by the deploy job.
    needs: deploy
    runs-on: self-hosted
    steps:
      - run: nvidia-smi

The --cloud-region option takes what Google Cloud calls a zone (such as us-west1-b), not a region as with other cloud vendors.
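
To make the zone/region distinction concrete: a Google Cloud zone name is the region name plus a one-letter suffix. A small sketch, with `region_of` as a hypothetical helper that is not part of CML, recovers the region from the zone passed to the runner:

```python
def region_of(zone: str) -> str:
    """Drop the trailing zone suffix, e.g. the "-b" in "us-west1-b"."""
    region, separator, suffix = zone.rpartition("-")
    # A zone ends in a single letter; anything else is probably a region or a typo.
    if not separator or not region or len(suffix) != 1 or not suffix.isalpha():
        raise ValueError(f"{zone!r} does not look like a Google Cloud zone")
    return region
```

With this, `region_of("us-west1-b")` gives `"us-west1"`, while passing a bare region such as `"us-west1"` raises a `ValueError`, which is the mistake the note above warns about.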

You can check this list to determine which zones provide GPU accelerators and which models are available. Not every zone has availability for every GPU model.

Custom machine types can be specified as custom-{cores}-{memory}, where {cores} is the number of CPU cores and {memory} is the amount of RAM in megabytes; appending the -ext suffix also enables extended memory.
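
As an illustration of that naming scheme, here is a minimal sketch that assembles the string from its parts. `custom_machine_type` is a hypothetical helper, and it only formats the name; it does not check Google Cloud's limits on core counts or memory per core:

```python
def custom_machine_type(cores: int, memory_mb: int, extended: bool = False) -> str:
    """Build a custom-{cores}-{memory} name, with -ext appended for extended memory."""
    if cores < 1 or memory_mb < 1:
        raise ValueError("cores and memory_mb must be positive")
    name = f"custom-{cores}-{memory_mb}"
    return f"{name}-ext" if extended else name


# 8 cores with 64 GB of RAM, expressed in megabytes, with extended memory enabled.
example = custom_machine_type(8, 64 * 1024, extended=True)
```

Here `example` is `"custom-8-65536-ext"`, the same value passed to --cloud-type in the workflow above.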

GPU accelerators are only available on N1 and A2 machines. Requesting accelerators on any other machine type will produce an error.
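
A rough pre-flight check for this restriction could look like the sketch below. It assumes that machine type names start with their family (n1-, a2-) and that a bare custom- type, as in the workflow above, belongs to the N1 family; both the prefix list and the `accepts_accelerators` helper are assumptions for illustration, not CML behaviour:

```python
# Assumption: prefixes of machine types that can take GPU accelerators.
GPU_CAPABLE_PREFIXES = ("n1-", "a2-", "custom-")


def accepts_accelerators(machine_type: str) -> bool:
    """Guess from the name alone whether a machine type can take a GPU."""
    return machine_type.lower().startswith(GPU_CAPABLE_PREFIXES)
```

For example, `accepts_accelerators("custom-8-65536-ext")` is true, while `accepts_accelerators("e2-standard-4")` is false. The check is only a name-based guess; the zone still has to offer the requested GPU model.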

Authentication

You can set either the GOOGLE_APPLICATION_CREDENTIALS_DATA environment variable to the contents of a service account JSON file, or the GOOGLE_APPLICATION_CREDENTIALS environment variable to the path of the mentioned file.

The former is more convenient for CI/CD scenarios, where secrets are (usually) provisioned through environment variables instead of files.
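
The two options can be pictured as a simple lookup. This is a sketch of the idea only, not the provider's actual code; in particular, preferring GOOGLE_APPLICATION_CREDENTIALS_DATA when both variables are set is an assumption, and `load_credentials_json` is a hypothetical helper:

```python
import os
from typing import Mapping, Optional


def load_credentials_json(env: Mapping[str, str] = os.environ) -> Optional[str]:
    """Return the service account JSON text, or None if neither variable is set."""
    # Option 1: the variable holds the JSON contents directly (handy for CI secrets).
    data = env.get("GOOGLE_APPLICATION_CREDENTIALS_DATA")
    if data:
        return data
    # Option 2: the variable holds a path to the JSON file on disk.
    path = env.get("GOOGLE_APPLICATION_CREDENTIALS")
    if path:
        with open(path, encoding="utf-8") as handle:
            return handle.read()
    return None
```

Passing a plain dictionary instead of `os.environ` makes it easy to try both cases locally without touching the real environment.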

@0x2b3bfa0 0x2b3bfa0 changed the title from "offical GCP support" to "official GCP support" Sep 22, 2021
@casperdcl
Contributor Author

/tests

@casperdcl casperdcl removed their assignment Oct 4, 2021
@casperdcl casperdcl added the blocked Dependent on something else label Oct 4, 2021
@0x2b3bfa0
Member

/tests

Hwat? [sic]

@DavidGOrtega
Contributor

@casperdcl casperdcl changed the title from "official GCP support" to "tests: GCP" Mar 18, 2022
@casperdcl casperdcl added p1-important High priority and removed p0-critical Max priority (ASAP) labels Mar 18, 2022
@0x2b3bfa0 0x2b3bfa0 self-assigned this Mar 21, 2022
@casperdcl casperdcl linked a pull request Mar 22, 2022 that will close this issue
@0x2b3bfa0
Member

#680 (comment) contains some valuable bits and pieces we don't have anywhere else. 🤔 Should they be promoted to cml.dev/doc?

@0x2b3bfa0 0x2b3bfa0 linked a pull request May 4, 2022 that will close this issue