Support for VLLM models? SmolVLM, Florence-2, etc #1576

Open
cchadowitz opened this issue Feb 7, 2025 · 2 comments
@cchadowitz
Collaborator

I reached out on Gitter but I'm not sure if that's still actively used:

With all the LLM and VLM models announced and released recently, are there thoughts or plans around supporting those types of models in DeepDetect? I'm specifically most interested in multimodal models like the new HuggingFace SmolVLM Instruct models and Microsoft Florence-2 vision models, both based on the HuggingFace Transformers library.
SmolVLM is available as a variety of ONNX models, while Florence-2 is available as PyTorch .bin weights. Each comes with its own set of configs for the model, tokenizer, preprocessor, etc., so I'm not sure how much (or which) is viable for use in DeepDetect.

@beniz
Collaborator

beniz commented Feb 7, 2025

Hi @cchadowitz, we have another software stack for LLMs and VLLMs (including SmolVLM), but it hasn't been open-sourced yet. If you'd like to have them inside DD, please PM us via email to see what can be done.

Sorry about Gitter, we've been busy elsewhere.

@beniz beniz self-assigned this Feb 7, 2025
@cchadowitz
Collaborator Author

No problem re: Gitter, I just wasn't sure if it was still the right place :)

I'll reach out about the LLMs/VLLMs stuff over email, thanks!
