Rewrite paths in a manifest #130

Closed
TomNicholas opened this issue Jun 4, 2024 · 0 comments · Fixed by #152
Labels
enhancement New feature or request good first issue Good for newcomers

TomNicholas commented Jun 4, 2024

One minor feature that might be useful is a convenience method for rewriting the string paths in a manifest. The use case: if you move or rename the underlying files, you don't want to regenerate all the byte ranges when you could keep the same ones and just edit the paths.

This is therefore related to #118, as you could open previously-written kerchunk references, change the paths, and then re-save them, without having to find byte ranges again.
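To make the idea concrete, here is a rough sketch of doing this by hand on kerchunk-style references today. It assumes the version 1 reference layout, where each chunk entry is a `[path, offset, length]` list (or a one-element `[path]` list for a whole file) and inlined data is a plain string. The function name `rewrite_reference_paths` and the example paths are made up for illustration.

```python
import json
from typing import Callable


def rewrite_reference_paths(refs: dict, rename: Callable[[str], str]) -> dict:
    """Return a copy of kerchunk-style references with every chunk path renamed."""
    new_refs = {}
    for key, value in refs["refs"].items():
        if isinstance(value, list) and value:
            # Chunk reference: the first element is the path; any offset and
            # length that follow are reused unchanged.
            new_refs[key] = [rename(value[0]), *value[1:]]
        else:
            # Inlined data (e.g. ".zarray" metadata) carries no path.
            new_refs[key] = value
    return {**refs, "refs": new_refs}


# Hypothetical references pointing at a file that has since been moved.
old = {
    "version": 1,
    "refs": {
        "a/.zarray": '{"shape": [4], "chunks": [2]}',
        "a/0": ["/data/old/file.nc", 100, 40],
        "a/1": ["/data/old/file.nc", 140, 40],
    },
}

new = rewrite_reference_paths(old, lambda p: p.replace("/data/old/", "/data/new/"))
print(json.dumps(new["refs"]["a/0"]))
```

The byte ranges are carried over untouched; only the path strings change, which is the point of the proposed convenience method.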

I'm imagining adding an API something like this:

from collections.abc import Callable


class Manifest:
    ...

    def rename_paths(
        self,
        new: str | Callable[[str], str],
    ) -> "Manifest":
        """
        Rename paths to chunks in this manifest.

        Accepts either a string, in which case this new path will be used for all chunks, or 
        a function which accepts the old path and returns the new path.

        Parameters
        ----------
        new
            New path to use for all chunks, either as a string, or as a function which accepts and returns strings.

        Returns
        -------
        manifest

        Examples
        --------
        Rename paths to reflect moving the referenced files from local storage to an S3 bucket.

        >>> def local_to_s3_url(old_local_path: str) -> str:
        ...     from pathlib import Path
        ...
        ...     new_s3_bucket_url = "http://s3.amazonaws.com/my_bucket/"
        ...
        ...     filename = Path(old_local_path).name
        ...     return new_s3_bucket_url + filename

        >>> manifest.rename_paths(local_to_s3_url)
        """
        ...

This method would be implemented on Manifest, but also present on ManifestArray and on the VirtualiZarrDatasetAccessor.

The option to set all chunks to have the same path might not be particularly useful, though perhaps more so if we support indexing (#51).
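A minimal sketch of how the method body might look, under stated assumptions: `SketchManifest` and `ChunkEntry` are hypothetical stand-ins that keep entries in a plain dict, not the actual `Manifest` internals. It only illustrates handling both the string and the callable form of `new`, and returning a new object rather than mutating in place.

```python
from __future__ import annotations

from dataclasses import dataclass
from pathlib import PurePosixPath
from typing import Callable


@dataclass(frozen=True)
class ChunkEntry:
    """Hypothetical record for one chunk: where its bytes live."""

    path: str
    offset: int
    length: int


class SketchManifest:
    """Hypothetical stand-in for Manifest, mapping chunk keys to entries."""

    def __init__(self, entries: dict[str, ChunkEntry]) -> None:
        self._entries = dict(entries)

    def rename_paths(self, new: str | Callable[[str], str]) -> SketchManifest:
        if isinstance(new, str):
            # A single string replaces the path of every chunk.
            def rename(_old: str) -> str:
                return new
        elif callable(new):
            rename = new
        else:
            raise TypeError(f"new must be a str or a callable, got {type(new)}")

        # Byte ranges are copied as-is; only the paths change.
        return SketchManifest(
            {
                key: ChunkEntry(rename(entry.path), entry.offset, entry.length)
                for key, entry in self._entries.items()
            }
        )

    def paths(self) -> list[str]:
        return [entry.path for entry in self._entries.values()]


def local_to_s3_url(old_local_path: str) -> str:
    new_s3_bucket_url = "http://s3.amazonaws.com/my_bucket/"
    filename = PurePosixPath(old_local_path).name
    return new_s3_bucket_url + filename


manifest = SketchManifest(
    {
        "0.0": ChunkEntry("/local/data/air.nc", 100, 40),
        "0.1": ChunkEntry("/local/data/air.nc", 140, 40),
    }
)
renamed = manifest.rename_paths(local_to_s3_url)
print(renamed.paths())
```

Returning a new object matches the `-> Manifest` return type in the proposal and leaves the original manifest usable if the rename turns out to be wrong.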
