Dev2049/openllm #6578

Merged · 2 commits · Jun 22, 2023
70 changes: 70 additions & 0 deletions docs/extras/ecosystem/integrations/openllm.mdx
@@ -0,0 +1,70 @@
# OpenLLM

This page demonstrates how to use [OpenLLM](https://github.com/bentoml/OpenLLM)
with LangChain.

`OpenLLM` is an open platform for operating large language models (LLMs) in
production. It enables developers to easily run inference with any open-source
LLM, deploy to the cloud or on-premises, and build powerful AI apps.

## Installation and Setup

Install the OpenLLM package from PyPI:

```bash
pip install openllm
```

## LLM

OpenLLM supports a wide range of open-source LLMs and can also serve users' own
fine-tuned LLMs. Use the `openllm model` command to see all available models that
are pre-optimized for OpenLLM.
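
For example, running the command from a terminal prints the supported model types (the exact subcommand name may differ between OpenLLM releases):

```bash
# List the open-source model architectures OpenLLM can serve out of the box
openllm model
```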

## Wrappers

There is an OpenLLM Wrapper which supports loading an LLM in-process or accessing a
remote OpenLLM server:

```python
from langchain.llms import OpenLLM
```

### Wrapper for OpenLLM server

This wrapper supports connecting to an OpenLLM server via HTTP or gRPC. The
OpenLLM server can run either locally or on the cloud.

To try it out locally, start an OpenLLM server:

```bash
openllm start flan-t5
```

Wrapper usage:

```python
from langchain.llms import OpenLLM

llm = OpenLLM(server_url='http://localhost:3000')

llm("What is the difference between a duck and a goose? And why there are so many Goose in Canada?")
```

### Wrapper for Local Inference

You can also use the OpenLLM wrapper to load an LLM in the current Python process
and run inference locally.

```python
from langchain.llms import OpenLLM

llm = OpenLLM(model_name="dolly-v2", model_id='databricks/dolly-v2-7b')

llm("What is the difference between a duck and a goose? And why there are so many Goose in Canada?")
```

### Usage

For a more detailed walkthrough of the OpenLLM Wrapper, see the
[example notebook](../modules/models/llms/integrations/openllm.ipynb).
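
As a quick preview, the example notebook combines the wrapper with an `LLMChain`; a condensed sketch, assuming an OpenLLM server is already running locally on port 3000:

```python
from langchain import LLMChain, PromptTemplate
from langchain.llms import OpenLLM

# Connect to a server started with `openllm start ...` (see above)
llm = OpenLLM(server_url="http://localhost:3000")

template = "What is a good name for a company that makes {product}?"
prompt = PromptTemplate(template=template, input_variables=["product"])

llm_chain = LLMChain(prompt=prompt, llm=llm)
print(llm_chain.run(product="mechanical keyboard"))
```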
5 changes: 3 additions & 2 deletions docs/extras/guides/deployments/index.mdx
@@ -21,7 +21,8 @@ This guide aims to provide a comprehensive overview of the requirements for depl
Understanding these components is crucial when assessing serving systems. LangChain integrates with several open-source projects designed to tackle these issues, providing a robust framework for productionizing your LLM applications. Some notable frameworks include:

- [Ray Serve](/docs/ecosystem/integrations/ray_serve.html)
- [BentoML](https://github.com/ssheng/BentoChain)
- [BentoML](https://github.com/bentoml/BentoML)
- [OpenLLM](/docs/ecosystem/integrations/openllm.html)
- [Modal](/docs/ecosystem/integrations/modal.html)

These links will provide further information on each ecosystem, assisting you in finding the best fit for your LLM deployment needs.
@@ -110,4 +111,4 @@ Rapid iteration also involves the ability to recreate your infrastructure quickl

## CI/CD

In a fast-paced environment, implementing CI/CD pipelines can significantly speed up the iteration process. They help automate the testing and deployment of your LLM applications, reducing the risk of errors and enabling faster feedback and iteration.
5 changes: 5 additions & 0 deletions docs/extras/guides/deployments/template_repos.mdx
@@ -67,6 +67,11 @@ This repository allows users to serve local chains and agents as RESTful, gRPC,

This repository provides an example of how to deploy a LangChain application with [BentoML](https://github.com/bentoml/BentoML). BentoML is a framework that enables the containerization of machine learning applications as standard OCI images. BentoML also allows for the automatic generation of OpenAPI and gRPC endpoints. With BentoML, you can integrate models from all popular ML frameworks and deploy them as microservices running on the most optimal hardware and scaling independently.

## [OpenLLM](https://github.com/bentoml/OpenLLM)

OpenLLM is a platform for operating large language models (LLMs) in production. With OpenLLM, you can run inference with any open-source LLM, deploy to the cloud or on-premises, and build powerful AI apps. It supports a wide range of open-source LLMs, offers flexible APIs, and provides first-class support for LangChain and BentoML.
See OpenLLM's [integration doc](https://github.com/bentoml/OpenLLM#%EF%B8%8F-integrations) for usage with LangChain.

## [Databutton](https://databutton.com/home?new-data-app=true)

These templates serve as examples of how to build, deploy, and share LangChain applications using Databutton. You can create user interfaces with Streamlit, automate tasks by scheduling Python code, and store files and data in the built-in store. Examples include a Chatbot interface with conversational memory, a Personal search engine, and a starter template for LangChain apps. Deploying and sharing is just one click away.
159 changes: 159 additions & 0 deletions docs/extras/modules/model_io/models/llms/integrations/openllm.ipynb
@@ -0,0 +1,159 @@
{
"cells": [
{
"cell_type": "markdown",
"id": "026cc336",
"metadata": {},
"source": [
"# OpenLLM\n",
"\n",
"[🦾 OpenLLM](https://github.com/bentoml/OpenLLM) is an open platform for operating large language models (LLMs) in production. It enables developers to easily run inference with any open-source LLMs, deploy to the cloud or on-premises, and build powerful AI apps."
]
},
{
"cell_type": "markdown",
"id": "da0ddca1",
"metadata": {},
"source": [
"## Installation\n",
"\n",
"Install `openllm` through [PyPI](https://pypi.org/project/openllm/)"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "6601c03b",
"metadata": {},
"outputs": [],
"source": [
"!pip install openllm"
]
},
{
"cell_type": "markdown",
"id": "90174fe3",
"metadata": {},
"source": [
"## Launch OpenLLM server locally\n",
"\n",
"To start an LLM server, use `openllm start` command. For example, to start a dolly-v2 server, run the following command from a terminal:\n",
"\n",
"```bash\n",
"openllm start dolly-v2\n",
"```\n",
"\n",
"\n",
"## Wrapper"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "35b6bf60",
"metadata": {},
"outputs": [],
"source": [
"from langchain.llms import OpenLLM\n",
"\n",
"server_url = \"http://localhost:3000\" # Replace with remote host if you are running on a remote server \n",
"llm = OpenLLM(server_url=server_url)"
]
},
{
"cell_type": "markdown",
"id": "4f830f9d",
"metadata": {},
"source": [
"### Optional: Local LLM Inference\n",
"\n",
"You may also choose to initialize an LLM managed by OpenLLM locally from current process. This is useful for development purpose and allows developers to quickly try out different types of LLMs.\n",
"\n",
"When moving LLM applications to production, we recommend deploying the OpenLLM server separately and access via the `server_url` option demonstrated above.\n",
"\n",
"To load an LLM locally via the LangChain wrapper:"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "82c392b6",
"metadata": {},
"outputs": [],
"source": [
"from langchain.llms import OpenLLM\n",
"\n",
"llm = OpenLLM(\n",
" model_name=\"dolly-v2\",\n",
" model_id=\"databricks/dolly-v2-3b\",\n",
" temperature=0.94,\n",
" repetition_penalty=1.2,\n",
")"
]
},
{
"cell_type": "markdown",
"id": "f15ebe0d",
"metadata": {},
"source": [
"### Integrate with a LLMChain"
]
},
{
"cell_type": "code",
"execution_count": 11,
"id": "8b02a97a",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"iLkb\n"
]
}
],
"source": [
"from langchain import PromptTemplate, LLMChain\n",
"\n",
"template = \"What is a good name for a company that makes {product}?\"\n",
"\n",
"prompt = PromptTemplate(template=template, input_variables=[\"product\"])\n",
"\n",
"llm_chain = LLMChain(prompt=prompt, llm=llm)\n",
"\n",
"generated = llm_chain.run(product=\"mechanical keyboard\")\n",
"print(generated)"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "56cb4bc0",
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.10"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
4 changes: 4 additions & 0 deletions langchain/llms/__init__.py
@@ -32,6 +32,7 @@
from langchain.llms.mosaicml import MosaicML
from langchain.llms.nlpcloud import NLPCloud
from langchain.llms.openai import AzureOpenAI, OpenAI, OpenAIChat
from langchain.llms.openllm import OpenLLM
from langchain.llms.openlm import OpenLM
from langchain.llms.petals import Petals
from langchain.llms.pipelineai import PipelineAI
@@ -81,6 +82,7 @@
"NLPCloud",
"OpenAI",
"OpenAIChat",
"OpenLLM",
"OpenLM",
"Petals",
"PipelineAI",
@@ -138,5 +140,7 @@
"self_hosted_hugging_face": SelfHostedHuggingFaceLLM,
"stochasticai": StochasticAI,
"vertexai": VertexAI,
"openllm": OpenLLM,
"openllm_client": OpenLLM,
"writer": Writer,
}
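
These registry entries are what let LangChain's generic LLM loader resolve a serialized config back to the `OpenLLM` class. A hedged sketch of how that lookup is typically used, assuming the standard `load_llm_from_config` helper and a local OpenLLM server on port 3000:

```python
from langchain.llms.loading import load_llm_from_config

# "_type" selects the class from the registry above; the remaining keys become constructor kwargs
config = {"_type": "openllm", "server_url": "http://localhost:3000"}
llm = load_llm_from_config(config)  # an OpenLLM instance pointing at the local server
```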