We use ONNX as the model format, which is supported by langchain4j. The other common formats are GGUF and safetensors. The market mostly uses safetensors, but there is no Java library for it, and we did not want to run a third-party server alongside the application.
For the embedding model, we opted for Deep Java Library (DJL), which uses ONNX under the hood, because a) it is supported by the Java ecosystem and b) models can be converted from other formats. We accept that the ONNX models are much smaller and faster than the safetensors originals, but may respond with slightly lower quality.
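To make the setup concrete, here is a minimal sketch of loading an ONNX embedding model through langchain4j's `OnnxEmbeddingModel` (from the `langchain4j-embeddings` module, which runs on ONNX Runtime). The model and tokenizer paths are placeholders, and the exact constructor overloads may differ between langchain4j versions, so treat this as an assumption-laden sketch rather than verified project code:

```java
import dev.langchain4j.data.embedding.Embedding;
import dev.langchain4j.model.embedding.onnx.OnnxEmbeddingModel;
import dev.langchain4j.model.embedding.onnx.PoolingMode;
import dev.langchain4j.model.output.Response;

public class OnnxEmbeddingExample {

    public static void main(String[] args) {
        // Placeholder paths: point these at a model that was exported to ONNX
        // (e.g. via the DJL converter), plus its Hugging Face tokenizer file.
        OnnxEmbeddingModel model = new OnnxEmbeddingModel(
                "/models/all-MiniLM-L6-v2/model.onnx",      // ONNX weights
                "/models/all-MiniLM-L6-v2/tokenizer.json",  // HF tokenizer
                PoolingMode.MEAN);                          // mean-pool token vectors

        Response<Embedding> response = model.embed("Hello, ONNX embeddings");
        float[] vector = response.content().vector();
        System.out.println("dimensions: " + vector.length);
    }
}
```

Running this requires the `langchain4j-embeddings` dependency on the classpath and an already-converted model on disk; no external server is involved, which was the point of choosing this route.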
I found out that DJL supports multiple formats.
They use this converter: https://docs.djl.ai/master/extensions/tokenizers/index.html#use-djl-huggingface-model-converter.
It apparently allows converting a Hugging Face transformer model to TorchScript, OnnxRuntime, or Rust.
I assume that by "Hugging Face transformer model" they mean safetensors.
I used the script and completed two conversions successfully, but the conversion failed for two other models, so there appear to be model-architecture-specific requirements for the conversion to work.
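For reference, once a model has been converted, it can also be loaded directly with DJL's `Criteria` API instead of going through langchain4j. The sketch below assumes a local directory produced by the converter (the path is a placeholder) and uses DJL's Hugging Face text-embedding translator from the tokenizers extension; I have not verified it against the two models that failed to convert:

```java
import ai.djl.huggingface.translator.TextEmbeddingTranslatorFactory;
import ai.djl.inference.Predictor;
import ai.djl.repository.zoo.Criteria;
import ai.djl.repository.zoo.ZooModel;
import java.nio.file.Paths;

public class DjlEmbeddingExample {

    public static void main(String[] args) throws Exception {
        // Placeholder path: a directory produced by the DJL model converter,
        // containing the ONNX graph plus tokenizer files.
        Criteria<String, float[]> criteria = Criteria.builder()
                .setTypes(String.class, float[].class)
                .optModelPath(Paths.get("/models/converted/all-MiniLM-L6-v2"))
                .optEngine("OnnxRuntime")
                .optTranslatorFactory(new TextEmbeddingTranslatorFactory())
                .build();

        try (ZooModel<String, float[]> model = criteria.loadModel();
             Predictor<String, float[]> predictor = model.newPredictor()) {
            float[] embedding = predictor.predict("Hello, DJL");
            System.out.println("dimensions: " + embedding.length);
        }
    }
}
```

This needs the `onnxruntime-engine` and `tokenizers` DJL artifacts on the classpath; the translator factory handles tokenization and pooling so the predictor maps a `String` straight to a `float[]` embedding.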