Support for specifying VM Image for Cloud Batch backend #300

Open
JackAtScaleSec opened this issue Dec 18, 2024 · 4 comments

@JackAtScaleSec

Certain organizations utilizing dsub may require the ability to specify a custom VM image for compliance, regulatory, or internal requirements. The underlying Cloud Batch API supports specifying an image when creating a job.

I'm requesting adding support for specifying a custom VM image when creating a dsub job using the Google Cloud Batch backend.

FYI - I requested access to the Contributor License Agreement as referenced in CONTRIBUTING.md but still haven't been granted access to the file.

@wnojopra
Contributor

wnojopra commented Jan 7, 2025

Hi @JackAtScaleSec ! Thanks for writing in.

So for the Google Cloud Batch API, dsub does support the --image parameter. Whatever is passed there is passed through to the Google Cloud Batch API Container in the imageUri field. The dsub code that does that is here.

Is that sufficient for your needs or is there more to the issue?
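
For readers following along, here is a minimal sketch of where a container image sits in a Batch job body (taskGroups > taskSpec > runnables > container > imageUri). It is an illustration only, not dsub's actual code: the helper names `container_runnable` and `minimal_job` are hypothetical, and the image name and command are placeholder values.

```python
import json


def container_runnable(image_uri, command):
    """Build a Batch 'runnable' dict that runs `command` inside `image_uri`.

    Hypothetical helper for illustration; dsub's real code path differs.
    """
    return {
        "container": {
            "imageUri": image_uri,      # the container image, e.g. what --image supplies
            "commands": list(command),  # command and arguments run in the container
        }
    }


def minimal_job(image_uri, command):
    """Wrap a single container runnable in the smallest job-shaped dict."""
    return {
        "taskGroups": [
            {"taskSpec": {"runnables": [container_runnable(image_uri, command)]}}
        ]
    }


# Placeholder image and command, chosen only for the example.
job = minimal_job("ubuntu:22.04", ["echo", "hello"])
print(json.dumps(job, indent=2))
```

Note that this field selects the container image only; it says nothing about the OS image of the VM that hosts the container.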

@JackAtScaleSec
Author

Thanks for getting back to me @wnojopra - there's more to the issue.

Beyond specifying a custom container image, which is a necessity for my use case, we also need to specify a custom OS image for the GCE instance the container runs on, due to security and compliance requirements.

This will allow us to ensure the underlying GCE instance is using a custom OS image that has had security hardening applied, agents installed, etc. as needed.

@wnojopra
Contributor

wnojopra commented Jan 9, 2025

Thanks for the clarity @JackAtScaleSec . I believe I understand your use case, but I'm not sure if the Cloud Batch API supports this. In your original comment, you mention:

"The underlying Cloud Batch API supports specifying an image when creating a job."

Am I missing this somewhere? I'm primarily looking at the Batch Job spec. There's an imageUri field in the Containers object, which dsub currently uses for the custom container image. There's also an image field in the Disk object. dsub doesn't yet support this for the google-batch API, but plans to shortly. However, I think the use of this disk image is for faster retrieval of large input files and docker images. I'm not sure this will achieve the custom OS image you're looking for.

@JackAtScaleSec
Author

@wnojopra I believe it will support this. With the Batch API, you can specify either an instance policy (covering specifics like machine type and boot disk) or an instance template, which would also include the boot disk to use for the GCE instances.

See https://cloud.google.com/python/docs/reference/batch/latest/google.cloud.batch_v1alpha.types.AllocationPolicy.InstancePolicy > boot_disk

Also, here's Google's docs page covering the capability to specify a custom OS image via this mechanism: https://cloud.google.com/batch/docs/specify-vm-os-image#api. Code snippet:

    "allocationPolicy": {
      "instances": [
        {
          "policy": {
            "bootDisk": {
              "image": "VM_OS_IMAGE_URI"
            }
          }
        }
      ]
    },

Thanks,
Jack
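
As a rough illustration of the fragment quoted above, the sketch below merges a boot disk image into a job body represented as a plain Python dict. The field path (allocationPolicy > instances > policy > bootDisk > image) comes from the snippet in this thread; the helper name `with_boot_disk_image` is hypothetical, the `VM_OS_IMAGE_URI` value is the same placeholder used in the snippet, and nothing here reflects how dsub would actually expose the option.

```python
import copy
import json


def with_boot_disk_image(job, vm_os_image_uri):
    """Return a copy of `job` whose first instance policy boots from the given image.

    Hypothetical helper: it only edits a dict shaped like the Batch job JSON.
    """
    updated = copy.deepcopy(job)  # leave the caller's dict untouched
    allocation = updated.setdefault("allocationPolicy", {})
    instances = allocation.setdefault("instances", [])
    if not instances:
        instances.append({})
    policy = instances[0].setdefault("policy", {})
    policy.setdefault("bootDisk", {})["image"] = vm_os_image_uri
    return updated


# Placeholder job body and image URI, for illustration only.
base_job = {"taskGroups": []}
job_with_image = with_boot_disk_image(base_job, "VM_OS_IMAGE_URI")
print(json.dumps(job_with_image, indent=2))
```

Working on a copy keeps the original job dict reusable, which is convenient if the same base job is submitted with different VM images.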
