v0.9.0
What's new in 0.9.0 (2024-02-22)
These are the changes in inference v0.9.0.
New features
- FEAT: Refactor device related code and add initial Intel GPU support by @notsyncing in #968
- FEAT: Support gemma series model by @aresnow1 in #1024
Enhancements
- ENH: [UI] Supports replica when launching LLM models by @ChengjieLi28 in #1011
- ENH: [UI] Show cluster resource information by @ChengjieLi28 in #1015
Bug fixes
- BUG: fix chat completion error when indexing body.messages by @fffonion in #1008
- BUG: Fix cache sd 1.5 error by @codingl2k1 in #1013
- BUG: fix typo in modelscope llama-2-13b-chat-GGUF by @qinxuye in #1026
- BUG: Fix missing qwen 1.5 7b gguf by @codingl2k1 in #1027
Documentation
- DOC: Polish model operation command doc by @onesuper in #1000
- DOC: Fix note on secret_key generation and algorithm selection for OAuth2 by @ChengjieLi28 in #1012
New Contributors
- @fffonion made their first contribution in #1008
- @notsyncing made their first contribution in #968
Full Changelog: v0.8.5...v0.9.0