[fix] Add dtype cast for modules other than Transformer #2889

Merged: 5 commits into UKPLab:master on Sep 10, 2024

ir2718
Contributor

@ir2718 ir2718 commented Aug 14, 2024

The PR adds the torch_dtype cast for modules other than Transformer when loading the SentenceTransformer. (#2887)

@ahmedkooli

Bug reference link :)

@ahmedkooli

@ir2718 I'm seeing the PR not passing the tests. I think you should change the code to:

if model_kwargs is not None:
    if "torch_dtype" in model_kwargs:
        module = module.to(model_kwargs["torch_dtype"])
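
For illustration, the guard suggested above can be sketched as a small standalone helper. It assumes, as the discussion implies, that model_kwargs defaults to None rather than an empty dict. The name requested_torch_dtype is hypothetical and is not part of the library.

```python
def requested_torch_dtype(model_kwargs=None):
    """Return the "torch_dtype" entry of model_kwargs, or None if it is absent.

    Hypothetical helper for illustration only; not a sentence-transformers API.
    """
    # `"torch_dtype" in None` raises TypeError, so rule out None first.
    if model_kwargs is None:
        return None
    return model_kwargs.get("torch_dtype")


if __name__ == "__main__":
    print(requested_torch_dtype(None))
    print(requested_torch_dtype({}))
    print(requested_torch_dtype({"torch_dtype": "float16"}))
```

Without the None check, a membership test on the default value fails with a TypeError before any cast is attempted, which is one plausible reason for the failing tests.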

@ir2718
Contributor Author

ir2718 commented Aug 16, 2024

@ahmedkooli

Thanks, for some reason I thought an empty dict was the default.

@shizhediao shizhediao mentioned this pull request Sep 9, 2024
This should be a tad safer. The original fix failed for torch_dtype = "auto", "float16", "bfloat16", and would not be receptive to models automatically loaded in fp16 via a custom Module
@tomaarsen
Collaborator

Heya @ir2718!
Thanks for creating a PR here - I've adopted it with a bit of a different fix, as I noticed some edge cases didn't work:

  • {"torch_dtype": "auto"}
  • {"torch_dtype": "float16"}
  • {"torch_dtype": "bfloat16"}

I've also added a test here. Ideally this should fix #2887

- Tom Aarsen
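
One way to handle the edge cases listed above is sketched below. This is not necessarily the code that was merged. It assumes that string values name attributes of the torch module, and that for "auto" (or no request at all) the remaining modules should follow the dtype of the first module. The function names resolve_torch_dtype and cast_to_match are hypothetical.

```python
import torch
from torch import nn


def resolve_torch_dtype(value):
    """Return a torch.dtype for value, or None if it names no concrete dtype.

    Hypothetical helper for illustration only.
    """
    if isinstance(value, torch.dtype):
        return value
    if isinstance(value, str):
        # "float16" resolves to torch.float16; "auto" is not a dtype, so it gives None.
        candidate = getattr(torch, value, None)
        if isinstance(candidate, torch.dtype):
            return candidate
    return None


def cast_to_match(modules, model_kwargs=None):
    """Cast modules[1:] to the requested dtype, else to the first module's dtype."""
    dtype = resolve_torch_dtype((model_kwargs or {}).get("torch_dtype"))
    if dtype is None:
        # Assumption: with "auto" or no request, follow how the first module was loaded.
        first_param = next(modules[0].parameters(), None)
        if first_param is None:
            return modules  # nothing to infer a dtype from
        dtype = first_param.dtype
    for module in modules[1:]:
        module.to(dtype)  # nn.Module.to casts floating-point parameters in place
    return modules


if __name__ == "__main__":
    # Stand-ins for a Transformer loaded in fp16 followed by a Dense module in fp32.
    pipeline = [nn.Linear(4, 4).to(torch.float16), nn.Linear(4, 2)]
    cast_to_match(pipeline, {"torch_dtype": "auto"})
    print(pipeline[1].weight.dtype)
```

Falling back to the first module's dtype also covers the case of a custom Module that loads itself in fp16 without any torch_dtype being passed.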

@tomaarsen tomaarsen changed the title Add dtype cast for modules other than Transformer [fix] Add dtype cast for modules other than Transformer Sep 10, 2024
@tomaarsen tomaarsen merged commit 597d5ed into UKPLab:master Sep 10, 2024
11 checks passed
Successfully merging this pull request may close these issues.

Encoding with model in float16 leads to "mat1 and mat2 must have the same dtype" error
3 participants