Add ADR for langchain4j + djl #185

Open
koppor opened this issue Sep 4, 2024 · 2 comments

Comments

@koppor
Collaborator

koppor commented Sep 4, 2024

  • We use ONNX as the model format, which is supported by langchain4j. The alternatives are GGUF and safetensors. Most models on the market are published as safetensors, but there is no Java library for it. We did not want to have a third-party server running.
  • Microsoft Semantic Kernel (API) is an alternative to langchain4j. It probably also uses ONNX.

For the embedding model, we opted for Deep Java Library (DJL), which uses ONNX under the hood, because it a) is supported by the Java ecosystem and b) can be converted from other formats. We accept that the model is much smaller and faster than the safetensors version, but possibly responds with slightly lower quality.

@ThiloteE
Collaborator

ThiloteE commented Nov 15, 2024

I found out that DJL supports multiple formats.
They use this converter: https://docs.djl.ai/master/extensions/tokenizers/index.html#use-djl-huggingface-model-converter.
Apparently it can convert a Hugging Face transformer model to TorchScript, OnnxRuntime, or Rust.
I assume that by "huggingface transformer model" they mean safetensors.
I ran the script: two conversions worked, but the conversion failed for two other models, so there seem to be model-architecture-specific requirements for the conversion to work.

I haven't yet managed to find out how to add those files to the model zoo. Their documentation is hard to understand. See also https://docs.djl.ai/master/docs/development/add_model_to_model-zoo.html

@InAnYan
Owner

InAnYan commented Nov 26, 2024

There is an ADR for this: https://github.com/JabRef/jabref/blob/main/docs/decisions/0037-rag-architecture-implementation.md.

I explained there why langchain and djl were used at all, and why djl was chosen instead of langchain embedding models.

Is this what you mean?
