Releases: xorbitsai/inference
v0.8.0
What's new in 0.8.0 (2024-01-11)
These are the changes in inference v0.8.0.
New features
- FEAT: qwen 1.8b gptq by @codingl2k1 in #869
- FEAT: docker compose support by @Minamiyama in #868
- FEAT: Simple OAuth2 system by @ChengjieLi28 in #793
- FEAT: Chat vl web UI by @codingl2k1 in #882
- FEAT: Yi chat gptq by @codingl2k1 in #876
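The docker compose support added in #868 can be sketched as a minimal compose file. This is a hypothetical sketch, not taken from the PR: the image name, port, and command are assumptions based on the project's published Docker image conventions.

```yaml
# docker-compose.yml — minimal single-node sketch (assumed values)
services:
  xinference:
    image: xprobe/xinference:latest   # assumed image name
    ports:
      - "9997:9997"                   # assumed default endpoint port
    command: xinference-local -H 0.0.0.0
```

A GPU deployment would additionally need a device reservation under `deploy.resources`; consult the project's own docker documentation for the supported options.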
Enhancements
- ENH: Stream use xoscar generator by @codingl2k1 in #859
- ENH: UI supports registering custom `gptq` models by @ChengjieLi28 in #875
- ENH: Make the size param of `*_to_image` more compatible by @liunux4odoo in #881
- BLD: Update package-lock.json by @aresnow1 in #886
- REF: Add `model_hub` property in `EmbeddingModelSpec` by @aresnow1 in #877
Bug fixes
- BUG: Fix image model b64_json output by @codingl2k1 in #874
- BUG: fix libcuda.so.1: cannot open shared object file by @superhuahua in #883
- BUG: Fix auto recover kwargs by @codingl2k1 in #885
Documentation
- DOC: docker image translation by @aresnow1 in #865
- DOC: Register model with `model_family` by @ChengjieLi28 in #863
- DOC: Add OpenAI Client API doc by @codingl2k1 in #864
- DOC: add docker instructions by @onesuper in #878
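The OpenAI Client API doc added in #864 covers Xinference's OpenAI-compatible endpoint. A minimal sketch of building such a request is below; the host, port, endpoint path, and model uid are assumptions for illustration, not details from these release notes.

```python
import json
import urllib.request

# Assumed local Xinference endpoint (OpenAI-compatible API).
BASE_URL = "http://localhost:9997/v1"

payload = {
    "model": "my-model-uid",  # hypothetical model uid
    "messages": [{"role": "user", "content": "Hello!"}],
    "stream": False,
}

# Build the POST request; urllib.request.urlopen(req) would send it
# against a running server.
req = urllib.request.Request(
    BASE_URL + "/chat/completions",
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
    method="POST",
)
```

Because the API is OpenAI-compatible, the official `openai` client can be pointed at the same base URL instead of hand-rolling requests.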
New Contributors
- @superhuahua made their first contribution in #883
Full Changelog: v0.7.5...v0.8.0
v0.7.5
What's new in 0.7.5 (2024-01-05)
These are the changes in inference v0.7.5.
New features
- FEAT: text2vec by @ChengjieLi28 in #857
Enhancements
- ENH: Offload all response serialization to ModelActor by @codingl2k1 in #837
- ENH: Custom model uses vLLM by @ChengjieLi28 in #861
- BLD: Docker image by @ChengjieLi28 in #855
Bug fixes
- BUG: Fix typing_extension version problem in notebook by @onesuper in #856
- BUG: Fix multimodal cmdline by @codingl2k1 in #850
- BUG: Fix generate of chatglm3 by @aresnow1 in #858
Documentation
- DOC: CUDA Version recommendation by @ChengjieLi28 in #841
- DOC: new doc cover by @onesuper in #843
- DOC: Autogen modelhub info by @onesuper in #845
- DOC: Add multimodal feature in README by @onesuper in #846
- DOC: Chinese doc for user guide by @aresnow1 in #847
- DOC: add notebook for quickstart by @onesuper in #854
- DOC: Add docs about environments by @aresnow1 in #853
- DOC: Add jupyter notebook quick start tutorial by @onesuper in #851
Others
- CHORE: Add docker image with `latest` tag by @ChengjieLi28 in #862
Full Changelog: v0.7.4.1...v0.7.5
v0.7.4.1
What's new in 0.7.4.1 (2023-12-29)
These are the changes in inference v0.7.4.1.
Documentation
- DOC: Multimodal example by @codingl2k1 in #842
Full Changelog: v0.7.4...v0.7.4.1
v0.7.4
What's new in 0.7.4 (2023-12-29)
These are the changes in inference v0.7.4.
New features
- FEAT: Support sd-turbo by @codingl2k1 in #797
- FEAT: Support Skywork models by @Minamiyama in #809
- FEAT: Support sdxl-turbo by @codingl2k1 in #816
- FEAT: Supports registering rerank models by @ChengjieLi28 in #825
- FEAT: Support Phi-2 by @Bojun-Feng in #828
- FEAT: Support vllm gptq by @codingl2k1 in #832
- FEAT: Support qwen vl chat by @codingl2k1 in #829
Enhancements
- ENH: Custom model can use tool calls by @codingl2k1 in #818
- ENH: Replace uuid with model name for `model_uid` by @ChengjieLi28 in #831
Bug fixes
- BUG: Error when checking `model_uid` & `model_name` in restful_api by @liunux4odoo in #803
- BUG: launch method exception (#807) by @auxpd in #808
- BUG: Model description does not support Chinese from UI registering by @ChengjieLi28 in #812
- BUG: Find correct class for customized model by @sarsmlee in #835
Documentation
- DOC: add function calling in read me by @onesuper in #804
- DOC: Chinese documents for `Logging` and `Models` parts by @ChengjieLi28 in #650
- DOC: Remove version limit of sphinx by @qinxuye in #820
- DOC: polish function call description by @onesuper in #821
- DOC: add switcher.json by @qinxuye in #822
- DOC: add document language switcher by @qinxuye in #823
- DOC: use wechat and zhihu for zh doc by @qinxuye in #824
- DOC: Chinese doc for getting started by @aresnow1 in #833
- DOC: simplify entry doc by @onesuper in #826
- DOC: Chinese doc for `example` part by @ChengjieLi28 in #838
New Contributors
- @liunux4odoo made their first contribution in #803
- @auxpd made their first contribution in #808
- @sarsmlee made their first contribution in #835
Full Changelog: v0.7.3.1...v0.7.4
v0.7.3.1
What's new in 0.7.3.1 (2023-12-22)
These are the changes in inference v0.7.3.1.
Bug fixes
- BUG: Worker fails to start on Windows by @ChengjieLi28 in #800
Full Changelog: v0.7.3...v0.7.3.1
v0.7.3
What's new in 0.7.3 (2023-12-22)
These are the changes in inference v0.7.3.
New features
- FEAT: Support OpenHermes 2.5 by @Bojun-Feng in #776
- FEAT: Support deepseek models by @aresnow1 in #786
- FEAT: Support tool message by @codingl2k1 in #794
- FEAT: Support Mixtral-8x7B-v0.1 models by @Bojun-Feng in #782
- FEAT: Support mistral instruct v0.2 by @aresnow1 in #796
Enhancements
- ENH: Enable streaming on Ctransformer by @Bojun-Feng in #784
- ENH: vllm backend support tool calls by @codingl2k1 in #785
- ENH: qwen switch to llama cpp by @codingl2k1 in #778
- ENH: [UI] register custom embedding model by @ChengjieLi28 in #791
Bug fixes
- BUG: UI crash on search when `model_format` and `model_size` have been selected by @Bojun-Feng in #772
- BUG: When changing the `XINFERENCE_HOME` env, model files are still stored in the old location by @ChengjieLi28 in #777
- BUG: Remove the modelscope import by @aresnow1 in #788
- BUG: When terminating a worker with `Ctrl+C`, the supervisor does not remove the worker's information by @ChengjieLi28 in #779
- BUG: Xinference does not release the custom model name when registration fails by @ChengjieLi28 in #790
Documentation
- DOC: Update readme by @aresnow1 in #743
- DOC: Update FunctionCall.ipynb by @codingl2k1 in #773
Full Changelog: v0.7.2...v0.7.3
v0.7.2
What's new in 0.7.2 (2023-12-15)
These are the changes in inference v0.7.2.
New features
- FEAT: Supports `qwen-chat` 1.8B by @ChengjieLi28 in #757
- FEAT: Support gorilla openfunctions v1 by @codingl2k1 in #760
- FEAT: qwen function call by @codingl2k1 in #763
Enhancements
- ENH: Handle tool call failed by @codingl2k1 in #767
Bug fixes
- BUG: [UI] Fix model size selection crash issue by @ChengjieLi28 in #764
Documentation
- DOC: Fix missing `model_uri` in `Custom Models` by @ChengjieLi28 in #759
Full Changelog: v0.7.1...v0.7.2
v0.7.1
What's new in 0.7.1 (2023-12-12)
These are the changes in inference v0.7.1.
Enhancements
- ENH: [UI] Supports `model_uid` input when launching models by @ChengjieLi28 in #746
- ENH: Add more vllm supported models by @aresnow1 in #756
Bug fixes
- BUG: Fix `cached` tag on UI by @ChengjieLi28 in #748
- BUG: Fix stream arg for vllm backend by @aresnow1 in #758
Others
- BUG: Fix emote encoding in streaming chat, fix missing `pad_token` for pytorch tokenizers, and allow a system message as the latest message in chat by @AndiMajore in #747
New Contributors
- @AndiMajore made their first contribution in #747
Full Changelog: v0.7.0...v0.7.1
v0.7.0
What's new in 0.7.0 (2023-12-08)
These are the changes in inference v0.7.0.
Enhancements
- ENH: upgrade insecure requests when necessary by @waltcow in #712
- ENH: [UI] Using tab in running models by @ChengjieLi28 in #714
- ENH: [UI] supports launching rerank models by @ChengjieLi28 in #711
- ENH: [UI] Error can be shown on web UI directly via Snackbar by @ChengjieLi28 in #721
- ENH: [UI] Supports `n_gpu` config when launching LLM models on web UI by @ChengjieLi28 in #730
- ENH: [UI] `n_gpu` default value `auto` by @ChengjieLi28 in #738
- ENH: [UI] Support unregistering custom model on web UI by @ChengjieLi28 in #735
- ENH: Auto recover model actor by @codingl2k1 in #694
- ENH: Allow rerank models to run with LLM models on the same device by @aresnow1 in #741
Bug fixes
- BUG: Auto patch trust remote code for embedding model by @codingl2k1 in #710
- BUG: Fix vLLM backend by @codingl2k1 in #728
Others
- Update builtin model list by @onesuper in #709
- Revert "ENH: upgrade insecure requests when necessary" by @qinxuye in #716
- CHORE: Format js file and check js code style by @ChengjieLi28 in #727
Full Changelog: v0.6.5...v0.7.0
v0.6.5
What's new in 0.6.5 (2023-12-01)
These are the changes in inference v0.6.5.
New features
- FEAT: Support jina embedding models by @aresnow1 in #704
- FEAT: Support Yi-chat by @aresnow1 in #700
- FEAT: Support qwen 72b by @aresnow1 in #705
- FEAT: ChatGLM3 tool calls by @codingl2k1 in #701
Enhancements
- ENH: Specify actor pool port for distributed deployment by @ChengjieLi28 in #688
- ENH: Remove `xorbits` dependency by @ChengjieLi28 in #699
- ENH: User can just specify a string for prompt style when registering custom LLM models by @ChengjieLi28 in #682
- ENH: Add more models supported by vllm by @aresnow1 in #706
Bug fixes
- BUG: Fix xinference startup failure when an invalid custom model is found by @codingl2k1 in #690
Documentation
- DOC: Fix some incorrect links in documentation by @aresnow1 in #684
- DOC: Update readme by @aresnow1 in #687
- DOC: documentation for docker and k8s by @lynnleelhl in #661
New Contributors
- @lynnleelhl made their first contribution in #661
Full Changelog: v0.6.4...v0.6.5