Merge pull request #1853 from deep-diver/example/nim
Adding an example of NIM
Bihan authored Nov 12, 2024
2 parents 537b43b + 746a37c commit f5aa2eb
Showing 3 changed files with 126 additions and 0 deletions.
88 changes: 88 additions & 0 deletions examples/deployment/nim/README.md
@@ -0,0 +1,88 @@
# NIM

This example shows how to deploy `Meta/Llama3-8b-instruct` with `dstack` using [NIM :material-arrow-top-right-thin:{ .external }](https://docs.nvidia.com/nim/large-language-models/latest/getting-started.html).

??? info "Prerequisites"
Once `dstack` is [installed](https://dstack.ai/docs/installation), go ahead clone the repo, and run `dstack init`.

<div class="termy">

```shell
$ git clone https://github.com/dstackai/dstack
$ cd dstack
$ dstack init
```

</div>

## Deployment

### Running as a task

If you'd like to run `Meta/Llama3-8b-instruct` for development purposes, consider using `dstack` [tasks](https://dstack.ai/docs/tasks/).

<div editor-title="examples/deployment/nim/task.dstack.yml">

```yaml
type: task

name: llama3-nim-task
image: nvcr.io/nim/meta/llama3-8b-instruct:latest

env:
- NGC_API_KEY
registry_auth:
username: $oauthtoken
password: ${{ env.NGC_API_KEY }}

ports:
- 8000

spot_policy: auto

resources:
gpu: 24GB

backends: ["aws", "azure", "cudo", "datacrunch", "gcp", "lambda", "oci", "tensordock"]
```
</div>

Note that NIM is currently supported on every backend except RunPod and Vast.ai.

### Deploying as a service

If you'd like to deploy the model as an auto-scalable and secure endpoint,
use the [service](https://dstack.ai/docs/services) configuration. You can find it in [`examples/deployment/nim/service.dstack.yml` :material-arrow-top-right-thin:{ .external }](https://github.com/dstackai/dstack/blob/master/examples/deployment/nim/service.dstack.yml).
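
Once the service is deployed and running behind a gateway, it exposes NIM's OpenAI-compatible API. Below is a minimal sketch of how you might query it; the run name `llama3-nim-service`, the gateway domain `example.com`, and the `DSTACK_TOKEN` variable are assumptions to replace with your own values.

```shell
# Hypothetical endpoint: https://<run name>.<gateway domain>
# DSTACK_TOKEN is assumed to hold your dstack user token.
curl https://llama3-nim-service.example.com/v1/chat/completions \
    -H 'Content-Type: application/json' \
    -H "Authorization: Bearer $DSTACK_TOKEN" \
    -d '{
        "model": "meta/llama3-8b-instruct",
        "messages": [{"role": "user", "content": "Say hello in one sentence."}],
        "max_tokens": 128
    }'
```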

### Running a configuration

To run a configuration, use the [`dstack apply`](https://dstack.ai/docs/reference/cli/index.md#dstack-apply) command.

<div class="termy">

```shell
$ NGC_API_KEY=...
$ dstack apply -f examples/deployment/nim/task.dstack.yml
# BACKEND REGION RESOURCES SPOT PRICE
1 gcp asia-northeast3 4xCPU, 16GB, 1xL4 (24GB) yes $0.17
2 gcp asia-east1 4xCPU, 16GB, 1xL4 (24GB) yes $0.21
3 gcp asia-northeast3 8xCPU, 32GB, 1xL4 (24GB) yes $0.21
Submit the run llama3-nim-task? [y/n]: y
Provisioning...
---> 100%
```
</div>
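
If you applied the task configuration, `dstack` forwards the configured port to `localhost` while the run is attached, so you can call NIM's OpenAI-compatible API locally. A minimal sketch (the prompt and `max_tokens` value are arbitrary):

```shell
# Query the locally forwarded NIM endpoint of the running task
curl http://localhost:8000/v1/chat/completions \
    -H 'Content-Type: application/json' \
    -d '{
        "model": "meta/llama3-8b-instruct",
        "messages": [{"role": "user", "content": "What is NVIDIA NIM?"}],
        "max_tokens": 128
    }'
```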


## Source code

The source code for this example can be found in
[`examples/deployment/nim` :material-arrow-top-right-thin:{ .external }](https://github.com/dstackai/dstack/blob/master/examples/deployment/nim).

## What's next?

1. Check [dev environments](https://dstack.ai/docs/dev-environments), [tasks](https://dstack.ai/docs/tasks),
[services](https://dstack.ai/docs/services), and [protips](https://dstack.ai/docs/protips).
2. Browse the [available models in the NGC Catalog :material-arrow-top-right-thin:{ .external }](https://catalog.ngc.nvidia.com/containers?filters=nvidia_nim%7CNVIDIA+NIM%7Cnimmcro_nvidia_nim&orderBy=scoreDESC&query=&page=&pageSize=).
18 changes: 18 additions & 0 deletions examples/deployment/nim/service.dstack.yml
@@ -0,0 +1,18 @@
type: service

image: nvcr.io/nim/meta/llama3-8b-instruct:latest

env:
- NGC_API_KEY
registry_auth:
username: $oauthtoken
password: ${{ env.NGC_API_KEY }}

port: 8000

spot_policy: auto

resources:
gpu: 24GB

backends: ["aws", "azure", "cudo", "datacrunch", "gcp", "lambda", "oci", "tensordock"]
20 changes: 20 additions & 0 deletions examples/deployment/nim/task.dstack.yml
@@ -0,0 +1,20 @@
type: task

name: llama3-nim-task
image: nvcr.io/nim/meta/llama3-8b-instruct:latest

env:
- NGC_API_KEY
registry_auth:
username: $oauthtoken
password: ${{ env.NGC_API_KEY }}

ports:
- 8000

spot_policy: auto

resources:
gpu: 24GB

backends: ["aws", "azure", "cudo", "datacrunch", "gcp", "lambda", "oci", "tensordock"]
