Once there is support in HF transformers, it should be relatively straightforward to port to vLLM. From a search of the transformers issues and PRs, there don't appear to be any efforts underway yet.
🚀 The feature, motivation and pitch
Benchmarks for Nvidia's latest Nemotron model look great. Is there a plan, or work already underway, to support it?
Alternatives
No response
Additional context
https://huggingface.co/nvidia/Nemotron-4-340B-Instruct