
Commit 426d704

address issue and some rephrases (#1251)
1 parent 2c99ff7 commit 426d704

1 file changed: +7 −7 lines changed


docs/user-guides/advanced/nemoguard-topiccontrol-deployment.md

Lines changed: 7 additions & 7 deletions
```diff
@@ -1,11 +1,13 @@
 # Llama 3.1 NemoGuard 8B Topic Control Deployment
 
-The TopicControl model will be available to download as a LoRA adapter module through HuggingFace, and as an [NVIDIA NIM](https://docs.nvidia.com/nim/#nemoguard) for low latency optimized inference with [NVIDIA TensorRT-LLM](https://docs.nvidia.com/tensorrt-llm/index.html).
+The TopicControl model is available to download as a LoRA adapter module through Hugging Face or as an [NVIDIA TopicControl NIM microservice](https://docs.nvidia.com/nim/llama-3-1-nemoguard-8b-topiccontrol/latest/index.html) for low-latency optimized inference with [NVIDIA TensorRT-LLM](https://docs.nvidia.com/tensorrt-llm/index.html).
 
-This guide covers how to deploy the TopicControl model as a NIM, and how to then use the deployed NIM in a NeMo Guardrails configuration.
+This guide covers how to deploy the TopicControl model as a NIM microservice and use it in a NeMo Guardrails configuration.
 
 ## NIM Deployment
 
+Follow the instructions below to deploy the TopicControl NIM microservice and configure it in a NeMo Guardrails application.
+
 ### Access
 
 The first step is to ensure access to NVIDIA NIM assets through NGC using an NVAIE license.
```
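For readers reproducing that access step, it typically comes down to logging Docker into NGC with an API key. A minimal sketch, assuming the standard `nvcr.io` login flow; none of this appears in the commit itself:

```bash
# Sketch (assumption, not part of this commit): authenticate Docker with NGC
# so the NIM container image can be pulled from nvcr.io.
# NGC uses the literal username $oauthtoken with an API key as the password.
export NGC_API_KEY="<paste your NGC API key>"
echo "$NGC_API_KEY" | docker login nvcr.io --username '$oauthtoken' --password-stdin
```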
````diff
@@ -37,11 +39,9 @@ docker run -it --name=$MODEL_NAME \
   $NIM_IMAGE
 ```
 
-#### Use the running NIM in your Guardrails App
-
-Any locally running NIM exposes the standard OpenAI interface on the `v1/completions` and `v1/chat/completions` endpoints. NeMo Guardrails provides out of the box support engines that support the standard LLM interfaces. For locally deployed NIMs, you need to use the engine `nim`.
+### Use TopicControl NIM Microservice in NeMo Guardrails App
 
-Thus, your Guardrails configuration file can look like:
+A locally running TopicControl NIM microservice exposes the standard OpenAI interface on the `v1/chat/completions` endpoint. NeMo Guardrails provides out-of-the-box support for engines that support the standard LLM interfaces. In Guardrails configuration, use the engine `nim` for the TopicControl NIM microservice as follows.
 
 ```yaml
 models:
````
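The `models:` block above is cut off at the hunk boundary. For orientation, a full Guardrails `config.yml` in this shape could look roughly like the sketch below; `engine: nim` and the `model_name`/`$MODEL_NAME` pairing follow the notes in the next hunk, while the main model, `base_url`, port, and flow name are illustrative assumptions:

```yaml
# Rough sketch of a config.yml using the locally deployed TopicControl NIM.
# Only engine "nim", the topic_control model type, and the requirement that
# model_name match $MODEL_NAME are stated in the docs this commit touches;
# everything else here is an assumed placeholder.
models:
  - type: main
    engine: openai          # assumed main-model provider
    model: gpt-4o           # placeholder, not from the diff
  - type: topic_control
    engine: nim             # locally deployed NIMs use the `nim` engine
    parameters:
      base_url: "http://localhost:8000/v1"   # assumed local NIM endpoint
      model_name: "llama-3.1-nemoguard-8b-topic-control"  # must match $MODEL_NAME

rails:
  input:
    flows:
      - topic safety check input $model=topic_control   # assumed flow name
```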
```diff
@@ -67,7 +67,7 @@ A few things to note:
 - `parameters.model_name` in the Guardrails configuration needs to match the `$MODEL_NAME` used when running the NIM container.
 - The `rails` definitions should list `topic_control` as the model.
 
-#### Bonus: Caching the optimized TRTLLM inference engines
+### Bonus: Caching the optimized TRTLLM inference engines
 
 If you'd like to not build TRTLLM engines from scratch every time you run the NIM container, you can cache it in the first run by just adding a flag to mount a local directory inside the docker to store the model cache.
```
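The caching flag itself sits outside the hunk. As a rough sketch of what that mount can look like, reusing `$MODEL_NAME` and `$NIM_IMAGE` from the `docker run` snippet above; the cache path `/opt/nim/.cache` and the GPU/auth/port flags are assumptions based on typical NIM deployments, not content from this commit:

```bash
# Hedged sketch: the first run builds the TRTLLM engines into the mounted
# directory; later runs reuse them instead of rebuilding from scratch.
# The cache path and the --gpus/-e/-p flags are assumptions.
export LOCAL_NIM_CACHE=~/.cache/nim
mkdir -p "$LOCAL_NIM_CACHE"

docker run -it --name=$MODEL_NAME \
  --gpus all \
  -e NGC_API_KEY \
  -p 8000:8000 \
  -v "$LOCAL_NIM_CACHE:/opt/nim/.cache" \
  $NIM_IMAGE
```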
