
<!-- vim-markdown-toc GFM -->

* [Preface](#preface)
* [Deployment methods](#deployment-methods)
* [Integration with Llama Stack framework](#integration-with-llama-stack-framework)
    * [Llama Stack as a library](#llama-stack-as-a-library)
    * [Llama Stack as a server](#llama-stack-as-a-server)
* [Local deployment](#local-deployment)
    * [Llama Stack used as a library](#llama-stack-used-as-a-library)
    * [Llama Stack used as a separate process](#llama-stack-used-as-a-separate-process)
* [Running from container](#running-from-container)
    * [Llama Stack used as a library](#llama-stack-used-as-a-library-1)
    * [Llama Stack used as a separate process in container](#llama-stack-used-as-a-separate-process-in-container)
    * [Llama Stack configuration](#llama-stack-configuration)

<!-- vim-markdown-toc -->

## Preface

In this document, you will learn how to install and run a service called *Lightspeed Core Stack (LCS)*. The service allows users to communicate with large language models (LLMs), access RAG databases, call so-called agents, process conversation history, ensure that conversations stay within permitted topics, and more.


## Deployment methods

The *Lightspeed Core Stack (LCS)* service is built on top of the Llama Stack framework, which can be run in several modes. Additionally, *LCS* itself can be run locally (as a regular application written in Python) or from within a container. This means that several deployment methods can be leveraged:

* Local deployment
    - Llama Stack framework is used as a library
    - Llama Stack framework is used as a separate process (deployed locally)
* Running from a container
    - Llama Stack framework is used as a library
    - Llama Stack framework is used as a separate process in a container

All these deployment methods will be covered later in this document.


## Integration with Llama Stack framework

The Llama Stack framework can be run as a standalone server and accessed via its REST API. However, instead of communicating directly through the REST API (and the JSON format), there is an even better alternative. It is based on the so-called Llama Stack Client: a library available for Python, Swift, Node.js, and Kotlin that "wraps" the REST API in a way that is easier to use for many applications.

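For illustration, here is a minimal sketch of how an application could talk to a running Llama Stack server through the Python client instead of crafting REST calls by hand. It assumes the `llama-stack-client` package is installed and a server is listening on `localhost:8321`; the exact client API surface may differ between versions.

```python
# Minimal sketch: use the Llama Stack Client library instead of
# hand-written REST/JSON calls. Assumes the `llama-stack-client`
# package is installed and a Llama Stack server is reachable on
# the default port 8321.
from llama_stack_client import LlamaStackClient

client = LlamaStackClient(base_url="http://localhost:8321")

# List the models registered in the Llama Stack instance.
for model in client.models.list():
    print(model)
```
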
### Llama Stack as a library

When this mode is selected, Llama Stack is used as a regular Python library. This means that the library has to be installed into the system Python environment, the user Python environment, or a virtual Python environment. All calls to Llama Stack are performed via standard function or method calls:

11 | 51 |  |
| 52 | + |
> [!NOTE]
> Even when Llama Stack is used as a library, it still requires the configuration file `run.yaml` to be present. This configuration file is loaded during the initialization phase.

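The following sketch illustrates the idea of library mode. The import path and the `initialize()` call are assumptions that vary between Llama Stack versions, so treat this purely as an illustration, not as the actual LCS initialization code.

```python
# Illustration only: the import path and the initialize() call are
# assumptions and differ between Llama Stack versions. Even in
# library mode the `run.yaml` configuration file must be present,
# because it is loaded during this initialization phase.
from llama_stack.distribution.library_client import LlamaStackAsLibraryClient

client = LlamaStackAsLibraryClient("./run.yaml")
client.initialize()

# From here on, the client offers the same interface as a remote
# server, but everything runs inside the current Python process.
for model in client.models.list():
    print(model)
```
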
### Llama Stack as a server

When this mode is selected, Llama Stack is started as a separate REST API service. All communication with Llama Stack is thus done via REST API calls, which in turn means that Llama Stack can run on a separate machine if needed.

12 | 62 |  |
13 | 63 |
|
> [!NOTE]
> The REST API schema, and also its semantics, can change at any time, especially before the official version 1.0.0 is released. By using the *Lightspeed Core Stack* service, developers, users, and customers are isolated from these incompatibilities.

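To illustrate what direct REST API access looks like (and what LCS shields its users from), here is a small sketch using the third-party `requests` package. The `/v1/models` endpoint path is an assumption and may change between Llama Stack releases.

```python
# Sketch of calling a standalone Llama Stack server directly over
# its REST API. The /v1/models path is an assumption and may change
# between releases -- exactly the kind of instability that LCS
# isolates its users from.
import requests

response = requests.get("http://localhost:8321/v1/models", timeout=10)
response.raise_for_status()
print(response.json())
```
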
## Local deployment

This chapter shows how to run LCS locally. This mode is especially useful for developers, as it makes it possible to work with the latest version of the source code, including locally made changes and improvements. Last but not least, it is possible to trace, monitor, and debug the entire system from within an integrated development environment.


### Llama Stack used as a separate process

The easiest option is to run Llama Stack in a separate process. This means that at least two running processes are involved (a simple connectivity check is sketched after the list below):

1. Llama Stack framework with open port 8321 (can be easily changed if needed)
1. LCS with open port 8080 (can be easily changed if needed)

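A small sketch (assuming the default ports mentioned above and the third-party `requests` package) that checks whether both processes are reachable could look like this:

```python
# Check that both processes are up. The ports are the defaults
# mentioned in this document and can differ in your configuration.
import requests

for name, url in (
    ("Llama Stack", "http://localhost:8321"),
    ("Lightspeed Core Stack", "http://localhost:8080"),
):
    try:
        requests.get(url, timeout=5)
        print(f"{name} is reachable at {url}")
    except requests.RequestException:
        print(f"{name} is NOT reachable at {url}")
```
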
### Llama Stack used as a library

## Running from container

### Llama Stack used as a separate process in container

### Llama Stack used as a library



```toml