Hey, first up, thank you for building and open-sourcing such a great piece of work. I have been using INSTRUCTOR for some time now and I absolutely love it.
I'm planning to generate embeddings for a large corpus of texts (on the order of millions), and I intend to schedule the embedding generation as an async, MQ-based job. My initial run-time estimates are on the higher side, so I was hoping certain methods could be used to speed up the generation of embeddings. Some of them include:
Inference on TensorRT
Compile the underlying PyTorch model
I see that you folks use a Sentence-transformers-like implementation, so I am unsure how torch compile would work with it
Using kernel fusion / custom kernels, etc.
Are there any generally prescribed guidelines that would help me achieve these? Is anyone here working on such optimizations?
Yeah, INSTRUCTOR is highly similar to sentence-transformer in terms of the model architecture. Therefore, any optimization that applies to sentence-transformer models may also be applicable to the INSTRUCTOR models.
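Independent of which model-level option is picked, one thing that can be sketched for a queue-based job at this scale is splitting the corpus into chunks and encoding each distinct input only once per chunk. This is only a rough sketch, not something from the INSTRUCTOR codebase: `chunked` and `embed_chunk` are hypothetical helper names, and `encode` stands for any callable that takes a list of `[instruction, text]` pairs and returns one vector per pair.

```python
from itertools import islice
from typing import Callable, Iterable, Iterator, List, Sequence, Tuple

# (instruction, text); an assumption about how the corpus is stored upstream.
Pair = Tuple[str, str]


def chunked(items: Iterable[Pair], size: int) -> Iterator[List[Pair]]:
    """Yield successive lists of at most `size` items from any iterable."""
    it = iter(items)
    while True:
        chunk = list(islice(it, size))
        if not chunk:
            return
        yield chunk


def embed_chunk(
    encode: Callable[[List[List[str]]], Sequence],
    chunk: List[Pair],
) -> List:
    """Embed one chunk, calling `encode` only on the distinct pairs in it."""
    unique = list(dict.fromkeys(chunk))  # de-duplicate, keep first-seen order
    vectors = encode([list(pair) for pair in unique])
    lookup = dict(zip(unique, vectors))
    # Return one vector per original item, in the original order.
    return [lookup[pair] for pair in chunk]


# Hypothetical wiring inside a queue worker (not executed here; the model
# name, chunk size, and batch size are illustrative assumptions):
#
#   from InstructorEmbedding import INSTRUCTOR
#   model = INSTRUCTOR("hkunlp/instructor-large")
#   for chunk in chunked(corpus_pairs, 10_000):
#       vectors = embed_chunk(lambda b: model.encode(b, batch_size=128), chunk)
```

How much this helps depends on how many repeated texts the corpus contains, and it does not replace the model-level options listed in the question. Those would be applied inside whatever `encode` wraps.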