General pattern for weight download and conversion #16

Open
antimora opened this issue Feb 12, 2024 · 3 comments
Labels
documentation: Improvements or additions to documentation
enhancement: New feature or request

Comments

@antimora
Collaborator

This ticket is a two-fold request:

  1. Further enhance the Resnet-Burn model, which was recently added.
  2. Come up with general requirements and a solution for the models added to the models repo.

Now that we are adding popular models to the burn-model repo, we should consider the end-user experience and come up with some basic top-level requirements for what is expected when a user adopts a migrated model. This can evolve into a standard across other models.

Here is my proposal:

  1. Each model should offer automatic weight download from a known source. The source can be overridden if needed. We should offer this both as a library and as a binary executable under the bin folder. The destination can default to a cache location or be specified by the user.
  2. If the source file is in a non-Burn format, we convert it, and subsequent loads use the Burn-native file.
  3. (Optional) The converted file is uploaded to the Hugging Face portal under the Burn organization.
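
To make points 1 and 2 concrete, here is a rough sketch using only the Rust standard library. Everything in it is hypothetical and not part of Burn: the `WeightsConfig` and `WeightsAction` names, the builder methods, and the idea of detecting a native file by its extension are assumptions made for illustration. The actual download and conversion steps are left out.

```rust
use std::path::{Path, PathBuf};

/// Hypothetical description of where a model's weights come from and where
/// they are stored. Not an existing Burn type.
#[derive(Debug, Clone)]
pub struct WeightsConfig {
    pub source_url: String,
    pub destination: PathBuf,
}

impl WeightsConfig {
    /// Start from the model's known source and a default cache location.
    pub fn new(default_url: &str, cache_root: &Path, model_name: &str) -> Self {
        Self {
            source_url: default_url.to_string(),
            destination: cache_root.join(model_name),
        }
    }

    /// Override the download source (point 1: "the source can be overridden").
    pub fn with_source(mut self, url: &str) -> Self {
        self.source_url = url.to_string();
        self
    }

    /// Override the destination directory chosen by default.
    pub fn with_destination(mut self, dir: impl Into<PathBuf>) -> Self {
        self.destination = dir.into();
        self
    }
}

/// What to do with a downloaded file (point 2).
#[derive(Debug, PartialEq)]
pub enum WeightsAction {
    /// The file is already in the native format and can be loaded directly.
    UseNative(PathBuf),
    /// The file must be converted once; later loads use `to`.
    Convert { from: PathBuf, to: PathBuf },
}

/// Decide by file extension whether conversion is needed.
/// `native_ext` is an assumption supplied by the caller (for example "mpk").
pub fn plan_load(downloaded: &Path, native_ext: &str) -> WeightsAction {
    let is_native = downloaded.extension().and_then(|e| e.to_str()) == Some(native_ext);
    if is_native {
        WeightsAction::UseNative(downloaded.to_path_buf())
    } else {
        WeightsAction::Convert {
            from: downloaded.to_path_buf(),
            to: downloaded.with_extension(native_ext),
        }
    }
}
```

A caller would build a `WeightsConfig` with the defaults, optionally override the source or destination, download into `destination`, and then use `plan_load` to decide whether a one-time conversion is needed before loading.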
@antimora
Collaborator Author

I am inviting @laggui, @ashdtu, @nathanielsimard, @louisfd, @Luni-4, and others to give their input.

@laggui
Member

laggui commented Feb 12, 2024

Funny you mention that, I was just working on adding automatic loading of pre-trained weights to the ResNet models 😄 So great timing!

Since I haven't pushed any of my changes yet (PR should come soon), I'll summarize the way I am currently approaching this.

By default, the models support no_std. I've added a pretrained feature flag that requires std and pulls in optional dependencies: the burn-import crate, for the PyTorchFileRecorder, and burn/network (new since this PR), for the download_file_as_bytes function with a download progress bar.

Regarding your specific points:

  1. For storing the downloaded weights, right now I followed the default pattern I observed in burn: put them in the ~/.cache directory under the model name (e.g., ~/.cache/resnet-burn).
  2. For loading the weights, I currently added resnet*_pretrained methods that do exactly as you described: download the .pth checkpoint and load it with the PyTorchFileRecorder.
  3. I haven't done anything in that regard yet, but @nathanielsimard and I briefly talked about something like that.
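
For illustration only, the caching behavior in point 1 could look roughly like the sketch below. It uses only the standard library. The helper names `model_cache_dir` and `fetch_cached` are made up for this example, and the downloader is passed in as a closure so the sketch does not assume the signature of download_file_as_bytes. Resolving the user's home directory is left to the caller.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Directory for a model's cached weights: `<home>/.cache/<model_name>`.
/// Hypothetical helper; the caller supplies the home directory.
pub fn model_cache_dir(home: &Path, model_name: &str) -> PathBuf {
    home.join(".cache").join(model_name)
}

/// Return the path of the cached file, calling `fetch` only when the file
/// is not already present in `cache_dir`.
pub fn fetch_cached<F>(cache_dir: &Path, file_name: &str, fetch: F) -> io::Result<PathBuf>
where
    F: FnOnce() -> io::Result<Vec<u8>>,
{
    let path = cache_dir.join(file_name);
    if !path.exists() {
        // First use: create the cache directory and store the downloaded bytes.
        fs::create_dir_all(cache_dir)?;
        let bytes = fetch()?;
        fs::write(&path, bytes)?;
    }
    Ok(path)
}
```

With this shape, a resnet*_pretrained-style method would call `fetch_cached` with a closure that performs the real download, then hand the returned path to the recorder. A second call with the same file name skips the download entirely.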

@nathanielsimard
Member

Something I would also like to see is exporting models without a specified backend, so users can choose the backend.
