Splitting langchaingo into sub-modules #369

Open
eliben opened this issue Nov 23, 2023 · 9 comments

Comments

@eliben
Collaborator

eliben commented Nov 23, 2023

Currently langchaingo is a single Go module (apart from the examples, each of which lives in its own module; this will be consistent once #367 lands).

This means that users who only want to use langchaingo with a single backend LLM (say, OpenAI) will see their go commands pull in many dependencies they don't necessarily need. Even though the go build tool won't include them in the users' programs, this may have negative effects on, say, CI time. But it's important to hear what specific concerns users actually have.
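
As a rough way to check which dependency modules actually end up in a built program, the standard library's runtime/debug package can report the build information embedded in the running binary. The following is a minimal sketch; the helper name linkedModules is made up for illustration and is not part of langchaingo.

```go
package main

import (
	"fmt"
	"runtime/debug"
)

// linkedModules (a hypothetical helper) returns "path version" for every
// dependency module recorded in the running binary's build information.
// It returns nil when no build information is available.
func linkedModules() []string {
	info, ok := debug.ReadBuildInfo()
	if !ok {
		return nil
	}
	var mods []string
	for _, dep := range info.Deps {
		mods = append(mods, dep.Path+" "+dep.Version)
	}
	return mods
}

func main() {
	mods := linkedModules()
	fmt.Printf("%d dependency modules recorded in this binary\n", len(mods))
	for _, m := range mods {
		fmt.Println(" ", m)
	}
}
```

Dropping a function like this into a program that imports a single provider package should list only the modules that contribute packages to that build. Running go version -m on a compiled binary prints the same information without any code changes.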

Splitting the repo into multiple modules is natural at the LLM provider level; e.g. langchaingo can be its own (top-level) module, with each LLM provider as another module nested within the repo that imports the main langchaingo module for the common code. This way users can import only the modules they need, along with those modules' dependencies.

This isn't free of cost, however. Managing multiple modules in a repo requires some care, especially around releases. In a way it's like each module living in its own separate repo, except that in a single repo they can share tooling and scripts. Also, with the existence of go workspaces, a go.work file can make local development much more pleasant.

One prerequisite for this would be to start tagging actual releases of langchaingo. We can start with 0.1.0 and follow semver, increasing the minor version as often as we need.
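
To make the tagging scheme concrete, here is a small sketch of how release tags could be computed. It relies on two Go module conventions: version tags carry a "v" prefix, and a module that lives in a subdirectory of the repo is tagged with that subdirectory as a prefix. The function names are made up, and treating llms/openai as a separate module is only an assumption for illustration. Pre-release and build suffixes are not handled.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// bumpMinor takes a tag such as "v0.1.0" and returns the next minor
// release ("v0.2.0"), resetting the patch number to zero.
func bumpMinor(tag string) (string, error) {
	if !strings.HasPrefix(tag, "v") {
		return "", fmt.Errorf("tag %q must start with \"v\"", tag)
	}
	parts := strings.Split(strings.TrimPrefix(tag, "v"), ".")
	if len(parts) != 3 {
		return "", fmt.Errorf("expected vMAJOR.MINOR.PATCH, got %q", tag)
	}
	nums := make([]int, 3)
	for i, p := range parts {
		n, err := strconv.Atoi(p)
		if err != nil || n < 0 {
			return "", fmt.Errorf("invalid version component %q in %q", p, tag)
		}
		nums[i] = n
	}
	return fmt.Sprintf("v%d.%d.0", nums[0], nums[1]+1), nil
}

// moduleTag builds the git tag for a module located in subdir of the repo.
// The top-level module (empty subdir) is tagged with the bare version.
func moduleTag(subdir, version string) string {
	subdir = strings.Trim(subdir, "/")
	if subdir == "" {
		return version
	}
	return subdir + "/" + version
}

func main() {
	next, err := bumpMinor("v0.1.0")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(moduleTag("", next))            // v0.2.0
	fmt.Println(moduleTag("llms/openai", next)) // llms/openai/v0.2.0
}
```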

@eliben
Collaborator Author

eliben commented Nov 23, 2023

I just tested this by copying examples/openai-completion-example into a standalone directory with a new module, and ran go mod tidy to see what is pulled in:

$ go mod tidy
go: finding module for package github.com/tmc/langchaingo/llms/openai
go: finding module for package github.com/tmc/langchaingo/llms
go: found github.com/tmc/langchaingo/llms in github.com/tmc/langchaingo v0.0.0-20231122191601-2eb6f5408849
go: found github.com/tmc/langchaingo/llms/openai in github.com/tmc/langchaingo v0.0.0-20231122191601-2eb6f5408849

It's not that bad, actually! The Go tool is good about only pulling whichever dependencies are needed.

@tmc
Owner

tmc commented Nov 28, 2023

Is it my imagination, or did it not use to be as smart about unused transitive dependencies?

@eliben
Collaborator Author

eliben commented Nov 28, 2023

Yes, this is module graph pruning, which shipped in Go 1.17; see https://go.dev/ref/mod#graph-pruning (there's a link there to a design doc with more details if you're interested).

@tmc
Owner

tmc commented Nov 30, 2023

I'd still like to find ways to keep the number of dependencies low and want to explore this more. We could perhaps analyze which parts of the module bring in the most dependencies and consider more targeted submodules there.
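
One possible starting point for that analysis is the output of go mod graph, which prints one "from to" requirement edge per line. The sketch below ranks the direct requirements of a root module by how many other modules are reachable through them. All helper names and the example.com module paths are made up for illustration. Note that this works on the module requirement graph, not on package imports, so the counts are an upper bound on what a given build really needs.

```go
package main

import (
	"bufio"
	"fmt"
	"sort"
	"strings"
)

// sampleGraph mimics `go mod graph` output with hypothetical module names.
const sampleGraph = `example.com/app example.com/a@v1.0.0
example.com/app example.com/b@v1.0.0
example.com/a@v1.0.0 example.com/c@v1.0.0
example.com/a@v1.0.0 example.com/d@v1.0.0
example.com/c@v1.0.0 example.com/d@v1.0.0
`

// parseGraph turns "from to" lines into an adjacency list.
func parseGraph(out string) map[string][]string {
	edges := make(map[string][]string)
	sc := bufio.NewScanner(strings.NewReader(out))
	for sc.Scan() {
		fields := strings.Fields(sc.Text())
		if len(fields) != 2 {
			continue // skip blank or malformed lines
		}
		edges[fields[0]] = append(edges[fields[0]], fields[1])
	}
	return edges
}

// reachable counts the distinct modules reachable from start,
// not counting start itself.
func reachable(edges map[string][]string, start string) int {
	seen := map[string]bool{start: true}
	stack := []string{start}
	for len(stack) > 0 {
		n := stack[len(stack)-1]
		stack = stack[:len(stack)-1]
		for _, m := range edges[n] {
			if !seen[m] {
				seen[m] = true
				stack = append(stack, m)
			}
		}
	}
	return len(seen) - 1
}

// depCount pairs a direct requirement with the number of modules it pulls in.
type depCount struct {
	Module string
	Count  int
}

// heaviest ranks the direct requirements of root, largest first.
func heaviest(edges map[string][]string, root string) []depCount {
	var out []depCount
	for _, d := range edges[root] {
		out = append(out, depCount{Module: d, Count: reachable(edges, d)})
	}
	sort.Slice(out, func(i, j int) bool {
		if out[i].Count != out[j].Count {
			return out[i].Count > out[j].Count
		}
		return out[i].Module < out[j].Module
	})
	return out
}

func main() {
	edges := parseGraph(sampleGraph)
	for _, dc := range heaviest(edges, "example.com/app") {
		fmt.Printf("%s pulls in %d module(s)\n", dc.Module, dc.Count)
	}
}
```

To try it on a real module, the sample string could be replaced by the captured output of go mod graph, with the root set to the main module's path.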

tmc mentioned this issue Nov 30, 2023
@tmc
Owner

tmc commented Dec 20, 2023

I'm experimenting with what having modules under ./contrib would look like here: https://github.com/tmc/langchaingo/tree/add-contrib

@eliben
Collaborator Author

eliben commented Dec 29, 2023

I think the add-contrib branch looks alright if you want to go in this direction.

Just like with the examples, the dependency always has to be one-way: the contrib modules depend on the main langchaingo module, not the other way around. Dependency management will also have to be done the same way, by bumping the main module version in all the go.mod files when the main module has a new release. This creates an issue when you want to add a feature to the main module and immediately use it in a contrib module, because there's no new tag yet; it can be tested locally with a go.work file, but CI may be unhappy on the PR.

@tmc
Owner

tmc commented Dec 29, 2023 via email

@eliben
Collaborator Author

eliben commented Dec 29, 2023

It's possible, but I don't think it makes sense to over-complicate this, honestly.

Most changes don't need to affect all modules at the same time; since go.mod versions don't get auto-bumped, things should keep working in between. E.g. if we make a change in the main module and then want to use it in a sub-module, we can have one commit updating the main module, tag that as vN, and then have another commit that updates the go.mod line in the contrib module to vN together with the code change.
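
The go.mod update in that second commit is normally done with the go tool itself (for example go get with a module@version argument, or go mod edit -require). Purely to illustrate what changes in the file, here is a simplified sketch that rewrites the require line by plain string handling; bumpRequire and the contrib module path are made up, and a real tool would parse go.mod properly instead.

```go
package main

import (
	"fmt"
	"strings"
)

// bumpRequire returns gomod with the required version of module replaced
// by version. It handles the single-line form ("require path v1.2.3") and
// entries inside a "require ( ... )" block. Anything after the version,
// such as a trailing "// indirect" comment, is left untouched.
func bumpRequire(gomod, module, version string) string {
	lines := strings.Split(gomod, "\n")
	for i, line := range lines {
		fields := strings.Fields(line)
		if len(fields) > 0 && fields[0] == "require" {
			fields = fields[1:]
		}
		if len(fields) < 2 || fields[0] != module {
			continue
		}
		// Replace the old version only in the text after the module path.
		idx := strings.Index(line, module) + len(module)
		lines[i] = line[:idx] + strings.Replace(line[idx:], fields[1], version, 1)
	}
	return strings.Join(lines, "\n")
}

func main() {
	// Hypothetical go.mod of a contrib module; the module path is an assumption.
	gomod := `module github.com/tmc/langchaingo/contrib/example

go 1.21

require github.com/tmc/langchaingo v0.1.0
`
	fmt.Print(bumpRequire(gomod, "github.com/tmc/langchaingo", "v0.2.0"))
}
```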

A single commit/PR affecting both the main and contrib modules should hopefully be rare.

I would recommend holding off on automation until we see what actual issues we encounter in this repository, since each project is different in this respect.

@devinyf
Contributor

devinyf commented Feb 17, 2024

Just wondering... would it be a good idea to use go-plugins to manage third-party modules like llms and tools?
