Skip to content
This repository has been archived by the owner on Sep 18, 2024. It is now read-only.

feat: support loading models from hf #751

Merged
merged 7 commits into from
Jun 8, 2023
Merged

Conversation

guenthermi
Copy link
Member

@guenthermi guenthermi commented Jun 7, 2023

feat: support loading models from hf

This PR allows one to load models from https://huggingface.co/jinaai:

import finetuner
import numpy as np

model = finetuner.get_model('jinaai/ecommerce-sbert-model')
e1, e2 = finetuner.encode(model, ['XBox', 'Xbox One Console 500GB - Black (2015)'])

print(f'Similarity: {np.dot(e1,e2)/(np.linalg.norm(e1)*np.linalg.norm(e2))}')
Similarity: 0.7842968702316284
  • This PR references an open issue
  • I have added a line about this change to CHANGELOG

@guenthermi guenthermi marked this pull request as ready for review June 7, 2023 13:11
@guenthermi guenthermi requested review from gmastrapas, bwanglzu and LMMilliken and removed request for gmastrapas and bwanglzu June 7, 2023 13:11
@github-actions github-actions bot added the area/testing This issue/PR affects testing label Jun 7, 2023
Comment on lines 15 to 17
def test_download_hf_model(model):
path = download_huggingface_model(model)
assert os.path.isdir(path)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

could we also check that we can successfully build the model from the contents of this path?

Comment on lines +97 to +98
HF_URL_PREFIX = 'https://huggingface.co/jinaai/'
HF_ORG_PREFIX = 'jinaai/'
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

when we have the push function, let's remove the org restriction

Copy link
Member

@bwanglzu bwanglzu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

finetuner/__init__.py Outdated Show resolved Hide resolved
@@ -517,6 +519,29 @@ def build_model(
)


def download_huggingface_model(model_name: str, token: Optional[str] = None) -> str:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is this meant to be public? if not, use an _ in front of the name

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was not sure, maybe private is better

:param token: Optional token to access private models.
:return: The local path to the downloaded model.
"""
from huggingface_hub import hf_hub_download, list_repo_files
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

where is this requirement?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we have transformers already as a dependency which depends on it. Should we add it anyway?

guenthermi and others added 2 commits June 8, 2023 09:22
Co-authored-by: George Mastrapas <32414777+gmastrapas@users.noreply.github.com>
Copy link
Contributor

@LMMilliken LMMilliken left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

@guenthermi guenthermi merged commit 32c7978 into main Jun 8, 2023
@guenthermi guenthermi deleted the feat-support-load-from-hf branch June 8, 2023 08:12
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants