huggingface[major]: Reduce disk footprint by 95% by making large dependencies optional #31268

Status: Open. Wants to merge 3 commits into base branch master.

Simon-Stone
Contributor

Description:
langchain_huggingface has a very large installation size of around 600 MB (on a Mac with Python 3.11). This is due to its dependency on sentence-transformers, which in turn depends on torch, which takes up 320 MB by itself. Similarly, the dependency on transformers pulls in another set of heavy dependencies. With those dependencies removed, the installation of langchain_huggingface takes up only ~26 MB. That is only about 5% of the full installation (26 MB / 600 MB ≈ 4.3%)!

These libraries are not necessary to use langchain_huggingface's API wrapper classes; they are only needed for local inference and embeddings. All import statements for those two libraries already have import guards in place (try/except with a helpful "please install x" message).
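For readers unfamiliar with the pattern, here is a minimal sketch of such an import guard. The helper name import_optional and the exact wording of the error message are hypothetical and chosen for illustration; the guards in langchain_huggingface may be written differently.

```python
import importlib
from types import ModuleType


def import_optional(module_name: str, extra: str = "full") -> ModuleType:
    """Import a module lazily and raise a helpful error if it is missing.

    `import_optional` is a hypothetical helper for illustration only;
    it is not part of langchain_huggingface.
    """
    try:
        return importlib.import_module(module_name)
    except ImportError as exc:
        # Re-raise with explicit installation instructions for the user.
        raise ImportError(
            f"Could not import {module_name}. Please install it with "
            f'`pip install "langchain_huggingface[{extra}]"`.'
        ) from exc
```

A class that needs local inference would call something like import_optional("sentence_transformers") inside its constructor, so importing the package itself never requires the heavy dependency.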

This PR therefore moves those two libraries to an optional dependency group named full. A plain pip install langchain_huggingface installs only the lightweight version, while pip install "langchain_huggingface[full]" installs all dependencies.
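As a rough illustration of what this means for users, the sketch below checks whether the two heavy libraries are present in the current environment. The function name missing_heavy_dependencies is hypothetical; the check uses only importlib.util.find_spec from the standard library and does not import the libraries themselves.

```python
from importlib.util import find_spec

# Import names of the two libraries this PR makes optional.
HEAVY_DEPENDENCIES = ("sentence_transformers", "transformers")


def missing_heavy_dependencies(names=HEAVY_DEPENDENCIES):
    """Return the top-level module names from `names` that are not installed.

    Hypothetical helper for illustration; not part of langchain_huggingface.
    """
    return [name for name in names if find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_heavy_dependencies()
    if missing:
        print(f"Lightweight install detected, missing: {', '.join(missing)}")
    else:
        print("All optional dependencies are available.")
```

With only the lightweight install, both names would be reported as missing; after installing the full group, the list would be empty.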

I know this may break existing code, because sentence-transformers and transformers are now no longer installed by default. Given that users will see helpful error messages when that happens, and the major impact of this small change, I hope that you will still consider this PR.

Dependencies: No new dependencies, but new optional grouping.


@dosubot (bot) added the labels size:S (this PR changes 10-29 lines, ignoring generated files) and langchain (related to the langchain package) on May 17, 2025
@Simon-Stone
Contributor Author

@ccurme Is this change something you might consider?

@Simon-Stone
Contributor Author

Tagging for visibility:

@baskaryan, @eyurtsev, @ccurme, @vbarda, @hwchase17

I think this could be a major improvement for anyone who is using langchain_huggingface with remote endpoints only.

@eyurtsev
Collaborator

eyurtsev commented Jun 2, 2025

Yes, I think this makes sense. We usually wrap optional dependencies in a try/except and re-raise with explicit instructions to the end user (e.g., pip install ...).

If you're able to make this change, we can release the integration.

We'll release with a properly updated version number to mark that this is a breaking change.

@eyurtsev self-assigned this on Jun 2, 2025
@eyurtsev changed the title from "huggingface: Reduce disk footprint by 95% by making large dependencies optional" to "huggingface[major]: Reduce disk footprint by 95% by making large dependencies optional" on Jun 2, 2025
@Simon-Stone
Contributor Author

I believe those guards are already in place everywhere they are needed.

Labels
langchain Related to the langchain package size:S This PR changes 10-29 lines, ignoring generated files.
2 participants