Replies: 3 comments
-
I don't have first hand experience with ONNX, I've just had a look at how Apache OpenNLP did the same for using NER / DocCat models from Huggingface; it seems something specific for sparse retrieval can be done in a similar fashion. |
Beta Was this translation helpful? Give feedback.
0 replies
-
Another option to consider? https://github.com/deepjavalibrary/djl |
Beta Was this translation helpful? Give feedback.
0 replies
-
We've done this with ONNX! Closing. |
Beta Was this translation helpful? Give feedback.
0 replies
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
-
ONNX has been around for a while, but last time I looked at it, it wasn't very mature. Perhaps this has changed:
https://www.reddit.com/r/LanguageTechnology/comments/pcygyp/run_transformers_models_directly_in_javascript/
For sparse retrieval models, we need to perform inference on the queries at search time. Currently, we sidestep this by using pre-encoded queries (generated from pyserini). With something like the above, perhaps it is now possible to do query encoding directly in Java.
Worth taking another look?
Beta Was this translation helpful? Give feedback.
All reactions