Allow for cloud uri (GCS) as input to the pipeline #42

Merged
merged 12 commits into from
Aug 25, 2024

@akhanf akhanf commented Aug 21, 2024

This adds support for inputs coming from a GCS bucket. It copies the entire dataset to local storage (as temp()) before use, and requires the gcloud CLI to be installed, because copying with gcloud storage cp was far faster than the snakemake GCS remote plugin.

This is mainly intended for execution on Coiled, via a wrapper (being developed in a separate repo) that makes use of it.

Note: does not support tar files in the cloud, only folders containing the tif files.
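
For illustration, the copy step described above could be sketched roughly as follows. This is not the code from this PR: the function names and the example bucket path are hypothetical, and the only thing taken from the description is that `gcloud storage cp` does the copy and that the input is a folder, not a tar file.

```python
import shutil
import subprocess


def build_gcs_copy_cmd(gcs_uri: str, dest_dir: str) -> list[str]:
    """Build the gcloud command that recursively copies a GCS folder locally.

    Hypothetical helper; the pipeline's actual rule may build this differently.
    """
    if not gcs_uri.startswith("gs://"):
        raise ValueError(f"expected a gs:// URI, got {gcs_uri!r}")
    # --recursive copies the whole folder of tif files, not a single object.
    return ["gcloud", "storage", "cp", "--recursive", gcs_uri, dest_dir]


def copy_from_gcs(gcs_uri: str, dest_dir: str) -> None:
    """Run the copy, failing early if the gcloud CLI is not installed."""
    if shutil.which("gcloud") is None:
        raise RuntimeError("gcloud CLI not found on PATH")
    subprocess.run(build_gcs_copy_cmd(gcs_uri, dest_dir), check=True)
```

Calling `copy_from_gcs("gs://my-bucket/dataset", "/tmp/dataset")` (hypothetical paths) would shell out to gcloud. Keeping the command builder separate makes it easy to check the arguments without touching the network.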

@akhanf akhanf added the enhancement New feature or request label Aug 21, 2024
akhanf added 4 commits August 23, 2024 09:10
* changes to avoid copy from gcs
  will read tif files directly from cloud, both for metadata and for dask array creation.
* remove print debug statements
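
As a rough sketch of what reading tif files directly from the cloud can look like, the following assumes `fsspec` (with `gcsfs` installed for `gs://` URIs), `tifffile`, and `dask`. The commits do not say which libraries are used, so those choices are assumptions, and the helper names are hypothetical. It also assumes every file has the same shape and dtype.

```python
from urllib.parse import urlparse


def is_gcs_uri(path: str) -> bool:
    """Return True when the input is a gs:// URI rather than a local path."""
    return urlparse(path).scheme == "gs"


def lazy_tif_stack(uris):
    """Build a lazy dask array from TIFF files of identical shape and dtype.

    Third-party imports are local so is_gcs_uri works without them installed.
    """
    import dask
    import dask.array as da
    import fsspec
    import tifffile

    def read_one(uri):
        # fsspec picks the filesystem from the URI scheme (gs:// needs gcsfs).
        with fsspec.open(uri, "rb") as fh:
            return tifffile.imread(fh)

    # Parse only the first file's metadata to learn shape and dtype.
    with fsspec.open(uris[0], "rb") as fh:
        with tifffile.TiffFile(fh) as tif:
            shape, dtype = tif.series[0].shape, tif.series[0].dtype

    # Each file becomes one delayed read; no pixel data is loaded yet.
    planes = [
        da.from_delayed(dask.delayed(read_one)(uri), shape=shape, dtype=dtype)
        for uri in uris
    ]
    return da.stack(planes)
```

Nothing is downloaded until the returned array is computed, so no local copy of the dataset is needed.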
@akhanf akhanf merged commit 923aa84 into main Aug 25, 2024
4 checks passed
@akhanf akhanf deleted the cloudinput branch August 25, 2024 19:30