Improve how artifacts are stored locally to avoid duplicating data #75

Closed
amisevsk opened this issue Mar 7, 2024 · 1 comment · Fixed by #390
Labels: CLI, enhancement

Comments

amisevsk (Contributor) commented Mar 7, 2024

Describe the problem your feature would solve

Currently, ModelKits are stored locally using one OCI image index per repository, with the following folder structure:

<storage-root>
└── <registry>
    └── <organization>
        ├── <repository1>
        │   ├── blobs
        │   ├── index.json
        │   └── oci-layout
        └── <repository2>
            ├── blobs
            ├── index.json
            └── oci-layout

As the OCI image index spec does not leave easy room for multiple repositories within one index, tagging the same image into two separate repositories currently uses double the storage. In other words, executing

kit tag my-image:mytag my-other-image:mytag

results in the blobs for my-image being copied to another directory.

Note this issue isn't present for ModelKits within the same repository -- i.e. my-image:tag1 and my-image:tag2 will share storage as expected.
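To get a rough idea of how much space the duplication costs on a given machine, one could scan the storage root for blobs that appear under more than one repository. The sketch below is not part of Kit. It assumes the `<registry>/<organization>/<repository>` nesting shown above and the standard OCI image layout convention of storing blobs at `blobs/<algorithm>/<hex digest>`. The function names `find_duplicate_blobs` and `wasted_bytes` are hypothetical.

```python
from collections import defaultdict
from pathlib import Path


def find_duplicate_blobs(storage_root):
    """Map each blob digest to the repository directories holding a copy.

    Only digests stored in more than one repository are returned.
    Assumes <storage-root>/<registry>/<organization>/<repository>/blobs/<alg>/<hex>.
    """
    copies = defaultdict(list)
    for blob in Path(storage_root).glob("*/*/*/blobs/*/*"):
        if blob.is_file():
            digest = f"{blob.parent.name}:{blob.name}"
            repo_dir = blob.parents[2]  # <alg> -> blobs -> <repository>
            copies[digest].append(repo_dir)
    return {d: sorted(repos) for d, repos in copies.items() if len(repos) > 1}


def wasted_bytes(storage_root):
    """Sum the size of every redundant copy (all copies beyond the first)."""
    total = 0
    for digest, repos in find_duplicate_blobs(storage_root).items():
        algorithm, hex_digest = digest.split(":", 1)
        size = (repos[0] / "blobs" / algorithm / hex_digest).stat().st_size
        total += size * (len(repos) - 1)
    return total
```

Repositories nested more deeply than three directories below the storage root would be missed by the glob pattern, so the result should be read as a lower bound.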

Describe the solution you'd like

Since blobs are content-addressable and there are no auth concerns with locally stored ModelKits, it makes sense to store each blob only once and reference it from multiple indexes. This would cut down on storage requirements for ModelKits while keeping a relatively pure OCI image index structure.
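As a rough illustration of the idea, and not a description of how Kit implements it, the sketch below keeps one shared `blobs` directory at the storage root and a separate `index.json` per repository. The shared-directory location, the `put_blob` and `tag` helpers, and the use of a bare manifest as the only blob are all assumptions made for the example. The `org.opencontainers.image.ref.name` annotation is the one the OCI image layout uses to attach a tag to a descriptor.

```python
import hashlib
import json
from pathlib import Path

REF_NAME = "org.opencontainers.image.ref.name"


def put_blob(shared_blobs, data):
    """Write data under its sha256 digest; a second write of the same data is a no-op."""
    hex_digest = hashlib.sha256(data).hexdigest()
    path = Path(shared_blobs) / "sha256" / hex_digest
    if not path.exists():
        path.parent.mkdir(parents=True, exist_ok=True)
        path.write_bytes(data)
    return f"sha256:{hex_digest}"


def tag(storage_root, repository, tag_name, manifest_bytes):
    """Point <repository>:<tag_name> at a manifest held in the shared blob store."""
    root = Path(storage_root)
    digest = put_blob(root / "blobs", manifest_bytes)

    index_path = root / repository / "index.json"
    if index_path.exists():
        index = json.loads(index_path.read_text())
    else:
        index = {"schemaVersion": 2, "manifests": []}

    # Replace any descriptor that already carries this tag.
    index["manifests"] = [
        m for m in index["manifests"]
        if m.get("annotations", {}).get(REF_NAME) != tag_name
    ]
    index["manifests"].append({
        "mediaType": "application/vnd.oci.image.manifest.v1+json",
        "digest": digest,
        "size": len(manifest_bytes),
        "annotations": {REF_NAME: tag_name},
    })
    index_path.parent.mkdir(parents=True, exist_ok=True)
    index_path.write_text(json.dumps(index, indent=2))
    return digest
```

With this arrangement, tagging the same manifest into `my-image` and `my-other-image` adds one descriptor to each repository's index but leaves a single copy of the blob on disk. The trade-off is that each repository directory is no longer a self-contained OCI image layout, since its blobs live elsewhere.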

Describe alternatives you've considered

Alternatively, we could abandon using the image index structure for local storage and instead implement an alternate way of tracking references to ModelKits in local storage. This would avoid the need for potentially awkward workarounds to manage accessing and removing blobs locally.
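Whichever approach is taken, removing a ModelKit becomes a reference-tracking problem: a blob can only be deleted once no remaining reference needs it. A minimal sketch of that check follows, assuming a hypothetical in-memory table that maps each reference to its manifest digest and each manifest digest to the blobs it uses. None of these names come from Kit.

```python
def remove_reference(references, manifest_blobs, reference):
    """Drop a reference and return the digests that are now safe to delete.

    references:     dict of "repo:tag" -> manifest digest (mutated in place)
    manifest_blobs: dict of manifest digest -> set of blob digests it uses
    """
    manifest = references.pop(reference)

    # Everything still reachable from the remaining references must be kept.
    still_used = set()
    for other in set(references.values()):
        still_used.add(other)
        still_used |= manifest_blobs[other]

    candidates = {manifest} | manifest_blobs[manifest]
    return candidates - still_used


# Two references to the same ModelKit: removing the first frees nothing.
refs = {"my-image:mytag": "sha256:m1", "my-other-image:mytag": "sha256:m1"}
blobs = {"sha256:m1": {"sha256:config", "sha256:model"}}
freed_first = remove_reference(refs, blobs, "my-other-image:mytag")
freed_second = remove_reference(refs, blobs, "my-image:mytag")
```

Here `freed_first` is empty because `my-image:mytag` still points at the same manifest, while `freed_second` contains the manifest and both of its blobs.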

Additional context

bmicklea added the enhancement and CLI labels Mar 7, 2024
bmicklea (Contributor) commented Apr 8, 2024

I can see the potentially significant storage benefits to implementing this. Does it make Kit and ModelKits any easier to use?
