From 4d87ba1568ce7a126ea9dba94b6000ee212d8d2a Mon Sep 17 00:00:00 2001
From: UranusSeven <109661872+UranusSeven@users.noreply.github.com>
Date: Fri, 15 Sep 2023 16:23:38 +0800
Subject: [PATCH 1/3] DOC: update README

---
 README.md       |  8 ++++----
 README_zh_CN.md | 10 +++++-----
 2 files changed, 9 insertions(+), 9 deletions(-)

diff --git a/README.md b/README.md
index 749db8a690..13ac38e691 100644
--- a/README.md
+++ b/README.md
@@ -33,8 +33,9 @@ potential of cutting-edge AI models.
 - Xinference dashboard: [#93](https://github.com/xorbitsai/inference/issues/93)
 ### New Models
 - Built-in support for [CodeLLama](https://github.com/facebookresearch/codellama): [#414](https://github.com/xorbitsai/inference/pull/414) [#402](https://github.com/xorbitsai/inference/pull/402)
-### Tools
-- LlamaIndex plugin: [#7151](https://github.com/jerryjliu/llama_index/pull/7151)
+### Integrations
+- [Dify](https://docs.dify.ai/advanced/model-configuration/xinference): an LLMOps platform that enables developers (and even non-developers) to quickly build useful applications based on large language models, ensuring they are visual, operable, and improvable.
+- [Chatbox](https://chatboxai.app/): a desktop client for multiple cutting-edge LLM models, available on Windows, Mac and Linux.
 
 ## Key Features
 
@@ -57,8 +58,7 @@ for seamless management and monitoring.
 allowing the seamless distribution of model inference across multiple devices or machines.
 
 🔌 **Built-in Integration with Third-Party Libraries**: Xorbits Inference seamlessly integrates
-with popular third-party libraries like [LangChain](https://python.langchain.com/docs/integrations/providers/xinference)
-and [LlamaIndex](https://gpt-index.readthedocs.io/en/stable/examples/llm/XinferenceLocalDeployment.html#i-run-pip-install-xinference-all-in-a-terminal-window).
+with popular third-party libraries including [LangChain](https://python.langchain.com/docs/integrations/providers/xinference), [LlamaIndex](https://gpt-index.readthedocs.io/en/stable/examples/llm/XinferenceLocalDeployment.html#i-run-pip-install-xinference-all-in-a-terminal-window), [Dify](https://docs.dify.ai/advanced/model-configuration/xinference), and [Chatbox](https://chatboxai.app/).
 
 ## Getting Started
 
 Xinference can be installed via pip from PyPI. It is highly recommended to create a new virtual
diff --git a/README_zh_CN.md b/README_zh_CN.md
index 274e88028a..34992a361a 100644
--- a/README_zh_CN.md
+++ b/README_zh_CN.md
@@ -29,8 +29,10 @@ Xorbits Inference(Xinference)是一个性能强大且功能全面的分布
 - Xinference 仪表盘: [#93](https://github.com/xorbitsai/inference/issues/93)
 ### 新模型
 - 内置 [CodeLLama](https://github.com/facebookresearch/codellama): [#414](https://github.com/xorbitsai/inference/pull/414) [#402](https://github.com/xorbitsai/inference/pull/402)
-### 工具
-- LlamaIndex 插件: [#7151](https://github.com/jerryjliu/llama_index/pull/7151)
+### 集成
+- [Dify](https://docs.dify.ai/advanced/model-configuration/xinference): 一个涵盖了大型语言模型开发、部署、维护和优化的 LLMOps 平台。
+- [Chatbox](https://chatboxai.app/): 一个支持前沿大语言模型的桌面客户端,支持 Windows,Mac,以及 Linux。
+
 
 
@@ -46,9 +48,7 @@ Xorbits Inference(Xinference)是一个性能强大且功能全面的分布
 🌐 **集群计算,分布协同**: 支持分布式部署,通过内置的资源调度器,让不同大小的模型按需调度到不同机器,充分使用集群资源。
 
-🔌 **开放生态,无缝对接**: 与流行的三方库无缝对接,包括 [LangChain](https://python.langchain.com/docs/integrations/providers/xinference)
-and [LlamaIndex](https://gpt-index.readthedocs.io/en/stable/examples/llm/XinferenceLocalDeployment.html#i-run-pip-install-xinference-all-in-a-terminal-window)。
-让开发者能够快速构建基于 AI 的应用。
+🔌 **开放生态,无缝对接**: 与流行的三方库无缝对接,包括 [LangChain](https://python.langchain.com/docs/integrations/providers/xinference),[LlamaIndex](https://gpt-index.readthedocs.io/en/stable/examples/llm/XinferenceLocalDeployment.html#i-run-pip-install-xinference-all-in-a-terminal-window),[Dify](https://docs.dify.ai/advanced/model-configuration/xinference),以及 [Chatbox](https://chatboxai.app/)。
 
 ## 快速入门
 
 Xinference 可以通过 pip 从 PyPI 安装。我们非常推荐在安装前创建一个新的虚拟环境以避免依赖冲突。

From 5f9d40e80aed00786a81114b7ddd4ea0d9b633fa Mon Sep 17 00:00:00 2001
From: UranusSeven <109661872+UranusSeven@users.noreply.github.com>
Date: Fri, 15 Sep 2023 16:33:15 +0800
Subject: [PATCH 2/3] DOC: update hot topics

---
 README.md                                  |  1 -
 README_zh_CN.md                            |  2 +-
 doc/source/index.rst                       | 18 ++++++++++--------
 doc/source/models/builtin/llama-2-chat.rst |  6 +++---
 doc/source/models/builtin/llama-2.rst      |  6 +++---
 5 files changed, 17 insertions(+), 16 deletions(-)

diff --git a/README.md b/README.md
index 13ac38e691..4f9e1e696c 100644
--- a/README.md
+++ b/README.md
@@ -27,7 +27,6 @@ potential of cutting-edge AI models.
 ## 🔥 Hot Topics
 ### Framework Enhancements
 - Embedding model support: [#418](https://github.com/xorbitsai/inference/pull/418)
-- Custom model support: [#325](https://github.com/xorbitsai/inference/pull/325)
 - LoRA support: [#271](https://github.com/xorbitsai/inference/issues/271)
 - Multi-GPU support for PyTorch models: [#226](https://github.com/xorbitsai/inference/issues/226)
 - Xinference dashboard: [#93](https://github.com/xorbitsai/inference/issues/93)
diff --git a/README_zh_CN.md b/README_zh_CN.md
index 34992a361a..6fb1364018 100644
--- a/README_zh_CN.md
+++ b/README_zh_CN.md
@@ -23,7 +23,7 @@ Xorbits Inference(Xinference)是一个性能强大且功能全面的分布
 
 ## 🔥 近期热点
 ### 框架增强
-- 自定义模型: [#325](https://github.com/xorbitsai/inference/pull/325)
+- Embedding 模型支持: [#418](https://github.com/xorbitsai/inference/pull/418)
 - LoRA 支持: [#271](https://github.com/xorbitsai/inference/issues/271)
 - PyTorch 模型多 GPU 支持: [#226](https://github.com/xorbitsai/inference/issues/226)
 - Xinference 仪表盘: [#93](https://github.com/xorbitsai/inference/issues/93)
diff --git a/doc/source/index.rst b/doc/source/index.rst
index f18cdf5eec..d1abd475ec 100644
--- a/doc/source/index.rst
+++ b/doc/source/index.rst
@@ -33,7 +33,9 @@ allowing the seamless distribution of model inference across multiple devices or
 
 🔌 **Built-in Integration with Third-Party Libraries**: Xorbits Inference seamlessly integrates
 with popular third-party libraries like `LangChain <https://python.langchain.com/docs/integrations/providers/xinference>`_
-and `LlamaIndex <https://gpt-index.readthedocs.io/en/stable/examples/llm/XinferenceLocalDeployment.html#i-run-pip-install-xinference-all-in-a-terminal-window>`_.
+, `LlamaIndex <https://gpt-index.readthedocs.io/en/stable/examples/llm/XinferenceLocalDeployment.html#i-run-pip-install-xinference-all-in-a-terminal-window>`_
+, `Dify <https://docs.dify.ai/advanced/model-configuration/xinference>`_
+, and `Chatbox <https://chatboxai.app/>`_.
 
 🔥 Hot Topics
@@ -41,20 +43,20 @@ and `LlamaIndex <https://gpt-index.readthedocs.io/en/stable/examples/llm/XinferenceLocalDeployment.html#i-run-pip-install-xinference-all-in-a-terminal-window>`_.
 
 Framework Enhancements
 ~~~~~~~~~~~~~~~~~~~~~~
-- Custom model support: `#325 <https://github.com/xorbitsai/inference/pull/325>`_
+- Embedding model support: `#418 <https://github.com/xorbitsai/inference/pull/418>`_
 - LoRA support: `#271 <https://github.com/xorbitsai/inference/issues/271>`_
 - Multi-GPU support for PyTorch models: `#226 <https://github.com/xorbitsai/inference/issues/226>`_
 - Xinference dashboard: `#93 <https://github.com/xorbitsai/inference/issues/93>`_
 
 New Models
 ~~~~~~~~~~
-- Built-in support for `Starcoder` in GGML: `#289 <https://github.com/xorbitsai/inference/pull/289>`_
-- Built-in support for `MusicGen `_: `#313 <https://github.com/xorbitsai/inference/pull/313>`_
-- Built-in support for `SD-XL `_: `318 <https://github.com/xorbitsai/inference/pull/318>`_
+- Built-in support for `CodeLLama <https://github.com/facebookresearch/codellama>`_: `#414 <https://github.com/xorbitsai/inference/pull/414>`_ `#402 <https://github.com/xorbitsai/inference/pull/402>`_
 
-Tools
-~~~~~
-- LlamaIndex plugin: `7151 <https://github.com/jerryjliu/llama_index/pull/7151>`_
+
+Integrations
+~~~~~~~~~~~~
+- `Dify <https://docs.dify.ai/advanced/model-configuration/xinference>`_: an LLMOps platform that enables developers (and even non-developers) to quickly build useful applications based on large language models, ensuring they are visual, operable, and improvable.
+- `Chatbox <https://chatboxai.app/>`_: a desktop client for multiple cutting-edge LLM models, available on Windows, Mac and Linux.
 
 
 License
diff --git a/doc/source/models/builtin/llama-2-chat.rst b/doc/source/models/builtin/llama-2-chat.rst
index 2eca8d85ef..ded31d4a41 100644
--- a/doc/source/models/builtin/llama-2-chat.rst
+++ b/doc/source/models/builtin/llama-2-chat.rst
@@ -66,7 +66,7 @@ chosen quantization method from the options listed above::
 
 .. note::
 
-4-bit quantization is not supported on macOS.
+   4-bit quantization is not supported on macOS.
 
 
 Model Spec 5 (pytorch, 13 Billion)
@@ -84,7 +84,7 @@ chosen quantization method from the options listed above::
 
 .. note::
 
-4-bit quantization is not supported on macOS.
+   4-bit quantization is not supported on macOS.
 
 Model Spec 6 (pytorch, 70 Billion)
 ++++++++++++++++++++++++++++++++++
@@ -101,4 +101,4 @@ chosen quantization method from the options listed above::
 
 .. note::
 
-4-bit quantization is not supported on macOS.
\ No newline at end of file
+   4-bit quantization is not supported on macOS.
\ No newline at end of file
diff --git a/doc/source/models/builtin/llama-2.rst b/doc/source/models/builtin/llama-2.rst
index 19614bbba4..a42090890a 100644
--- a/doc/source/models/builtin/llama-2.rst
+++ b/doc/source/models/builtin/llama-2.rst
@@ -65,7 +65,7 @@ chosen quantization method from the options listed above::
 
 .. note::
 
-4-bit quantization is not supported on macOS.
+   4-bit quantization is not supported on macOS.
 
 Model Spec 5 (pytorch, 13 Billion)
 ++++++++++++++++++++++++++++++++++
@@ -82,7 +82,7 @@ chosen quantization method from the options listed above::
 
 .. note::
 
-4-bit quantization is not supported on macOS.
+   4-bit quantization is not supported on macOS.
 
 Model Spec 6 (pytorch, 70 Billion)
 ++++++++++++++++++++++++++++++++++
@@ -99,4 +99,4 @@ chosen quantization method from the options listed above::
 
 .. note::
 
-4-bit quantization is not supported on macOS.
+   4-bit quantization is not supported on macOS.

From e2d4529f22e902e208aa56dfc8e3cf5cbd686381 Mon Sep 17 00:00:00 2001
From: UranusSeven <109661872+UranusSeven@users.noreply.github.com>
Date: Fri, 15 Sep 2023 16:36:39 +0800
Subject: [PATCH 3/3] Add vLLM

---
 README.md            | 1 +
 README_zh_CN.md      | 2 +-
 doc/source/index.rst | 1 +
 3 files changed, 3 insertions(+), 1 deletion(-)

diff --git a/README.md b/README.md
index 4f9e1e696c..47dbd5ac3f 100644
--- a/README.md
+++ b/README.md
@@ -26,6 +26,7 @@ potential of cutting-edge AI models.
 
 ## 🔥 Hot Topics
 ### Framework Enhancements
+- Incorporate vLLM: [#445](https://github.com/xorbitsai/inference/pull/445)
 - Embedding model support: [#418](https://github.com/xorbitsai/inference/pull/418)
 - LoRA support: [#271](https://github.com/xorbitsai/inference/issues/271)
 - Multi-GPU support for PyTorch models: [#226](https://github.com/xorbitsai/inference/issues/226)
diff --git a/README_zh_CN.md b/README_zh_CN.md
index 6fb1364018..20f09c8007 100644
--- a/README_zh_CN.md
+++ b/README_zh_CN.md
@@ -23,6 +23,7 @@ Xorbits Inference(Xinference)是一个性能强大且功能全面的分布
 
 ## 🔥 近期热点
 ### 框架增强
+- 引入 vLLM: [#445](https://github.com/xorbitsai/inference/pull/445)
 - Embedding 模型支持: [#418](https://github.com/xorbitsai/inference/pull/418)
 - LoRA 支持: [#271](https://github.com/xorbitsai/inference/issues/271)
 - PyTorch 模型多 GPU 支持: [#226](https://github.com/xorbitsai/inference/issues/226)
@@ -41,7 +42,6 @@ Xorbits Inference(Xinference)是一个性能强大且功能全面的分布
 
 ⚡️ **前沿模型,应有尽有**:框架内置众多中英文的前沿大语言模型,包括 baichuan,chatglm2 等,一键即可体验!内置模型列表还在快速更新中!
-
 🖥 **异构硬件,快如闪电**:通过 [ggml](https://github.com/ggerganov/ggml),同时使用你的 GPU 与 CPU 进行推理,降低延迟,提高吞吐!
 
 ⚙️ **接口调用,灵活多样**:提供多种使用模型的接口,包括 RPC,RESTful API,命令行,web UI 等等。方便模型的管理与监控。
diff --git a/doc/source/index.rst b/doc/source/index.rst
index d1abd475ec..56177bae04 100644
--- a/doc/source/index.rst
+++ b/doc/source/index.rst
@@ -43,6 +43,7 @@ with popular third-party libraries like `LangChain <https://python.langchain.com/docs/integrations/providers/xinference>`_,
 🔥 Hot Topics
 ~~~~~~~~~~~~~
 
 Framework Enhancements
 ~~~~~~~~~~~~~~~~~~~~~~
+- Incorporate vLLM: `#445 <https://github.com/xorbitsai/inference/pull/445>`_
 - Embedding model support: `#418 <https://github.com/xorbitsai/inference/pull/418>`_
 - LoRA support: `#271 <https://github.com/xorbitsai/inference/issues/271>`_
 - Multi-GPU support for PyTorch models: `#226 <https://github.com/xorbitsai/inference/issues/226>`_
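
The "Getting Started" sections touched by these patches tell users to install Xinference from PyPI inside a fresh virtual environment. As a rough sketch of that flow (the launch command shown is an assumption based on the project's documentation of this era and may differ in newer releases):

```shell
# Create and activate an isolated virtual environment, as the README
# recommends, to avoid dependency conflicts.
python -m venv xinference-env
source xinference-env/bin/activate

# Install Xinference from PyPI with all optional model backends.
pip install "xinference[all]"

# Start a local Xinference instance (serves the web UI and RESTful API).
# Assumed entry point; newer releases may use a different command.
xinference
```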