diff --git a/README.md b/README.md
index 792203a6a4..a36a92dfde 100644
--- a/README.md
+++ b/README.md
@@ -68,12 +68,13 @@ pip install "ai-dynamo[all]"
 
 ### Building the Dynamo Base Image
 
-Although not needed for local development, deploying your Dynamo pipelines to Kubernetes will require you to build and push a Dynamo base image to your container registry. You can use any container registry of your choice, such as:
+Although not needed for local development, deploying your Dynamo pipelines to Kubernetes will require a Dynamo base image available in a container registry. You can use any container registry of your choice, such as:
 
 - Docker Hub (docker.io)
 - NVIDIA NGC Container Registry (nvcr.io)
 - Any private registry
-Here's how to build it:
+We publish ready-to-use images on [nvcr.io](https://catalog.ngc.nvidia.com/orgs/nvidia/teams/ai-dynamo/containers/vllm-runtime) that you can use directly.
+Alternatively, you can build and push an image from source:
 
 ```bash
 ./container/build.sh
@@ -83,8 +84,8 @@ docker push /dynamo-base:latest-vllm
 ```
 
 Notes about builds for specific frameworks:
-- For specific details on the `--framework vllm` build, see [here](examples/vllm/README.md).
-- For specific details on the `--framework tensorrtllm` build, see [here](examples/tensorrt_llm/README.md).
+- For specific details on the `--framework vllm` build, read about the [vLLM backend](components/backends/vllm/README.md).
+- For specific details on the `--framework tensorrtllm` build, read about the [TensorRT-LLM backend](components/backends/trtllm/README.md).
 
 Note about AWS environments:
 - If deploying Dynamo in AWS, make sure to build the container with EFA support using the `--make-efa` flag.
@@ -197,8 +198,6 @@ pip install .
 cd ../../../
 pip install ".[all]"
 
-# To test
-docker compose -f deploy/metrics/docker-compose.yml up -d
-cd examples/llm
-dynamo serve graphs.agg:Frontend -f configs/agg.yaml
-```
\ No newline at end of file
+```
+
+To get started, follow the [Quickstart Guide](docs/guides/dynamo_deploy/quickstart.md).
\ No newline at end of file
diff --git a/deploy/helm/README.md b/deploy/helm/README.md
index 8f847e2cdd..7c1b404108 100644
--- a/deploy/helm/README.md
+++ b/deploy/helm/README.md
@@ -29,8 +29,10 @@ This approach allows you to install Dynamo directly using a DynamoGraphDeploymen
 
 ### Basic Installation
 
+Here is how you would install a vLLM inference backend example:
+
 ```bash
-helm upgrade --install dynamo-graph ./deploy/helm/chart -n dynamo-cloud -f ./examples/vllm/deploy/agg.yaml
+helm upgrade --install dynamo-graph ./deploy/helm/chart -n dynamo-cloud -f ./components/backends/vllm/deploy/agg.yaml
 ```
 
 ### Customizable Properties
@@ -39,7 +41,7 @@ You can override the default configuration by setting the following properties:
 
 ```bash
 helm upgrade --install dynamo-graph ./deploy/helm/chart -n dynamo-cloud \
-  -f ./examples/vllm/deploy/agg.yaml \
+  -f ./components/backends/vllm/deploy/agg.yaml \
   --set "imagePullSecrets[0].name=docker-secret-1" \
   --set etcdAddr="my-etcd-service:2379" \
   --set natsAddr="nats://my-nats-service:4222"
diff --git a/deploy/inference-gateway/example/README.md b/deploy/inference-gateway/example/README.md
index 6395790a32..5bf1615e9a 100644
--- a/deploy/inference-gateway/example/README.md
+++ b/deploy/inference-gateway/example/README.md
@@ -16,11 +16,7 @@ This guide provides instructions for setting up the Inference Gateway with Dynam
 
 [See Quickstart Guide](../../../docs/guides/dynamo_deploy/quickstart.md) to install Dynamo Cloud.
 
-2. **Launch Dynamo Deployments**
-
-[See VLLM Example](../../../examples/vllm/README.md)
-
-3. **Deploy Inference Gateway**
+2. **Deploy Inference Gateway**
 
 First, deploy an inference gateway service. In this example, we'll install `kgateway` based gateway implementation.
 
@@ -54,7 +50,7 @@ kubectl get gateway inference-gateway
 # inference-gateway kgateway True 1m
 ```
 
-4. **Apply Dynamo-specific manifests**
+3. **Apply Dynamo-specific manifests**
 
 The Inference Gateway is configured through the `inference-gateway-resources.yaml` file.
 
diff --git a/docs/examples/README.md b/docs/examples/README.md
index 977277e2b4..7c2825c7f7 100644
--- a/docs/examples/README.md
+++ b/docs/examples/README.md
@@ -2,7 +2,7 @@
 
 ## Serving examples locally
 
-TODO: Follow individual examples to serve models locally.
+Follow individual examples under `components/backends/` to serve models locally.
 
 ## Deploying Examples to Kubernetes
 
@@ -38,7 +38,7 @@ export NAMESPACE= # the namespace you used to deploy Dynamo clou
 Deploying an example consists of the simple `kubectl apply -f ... -n ${NAMESPACE}` command. For example:
 
 ```bash
-kubectl apply -f examples/vllm/deploy/agg.yaml -n ${NAMESPACE}
+kubectl apply -f components/backends/vllm/deploy/agg.yaml -n ${NAMESPACE}
 ```
 
 You can use `kubectl get dynamoGraphDeployment -n ${NAMESPACE}` to view your deployment.
diff --git a/docs/get_started.md b/docs/get_started.md
index 87fdcfbfbc..b0a321a4ff 100644
--- a/docs/get_started.md
+++ b/docs/get_started.md
@@ -167,7 +167,7 @@ docker compose -f deploy/docker-compose.yml up -d
 
 ### Start Dynamo LLM Serving Components
 
-[Explore the VLLM Example](../examples/vllm/README.md)
+[Explore the vLLM Example](../components/backends/vllm/README.md)
 
 ## Local Development
 
diff --git a/docs/guides/dynamo_deploy/quickstart.md b/docs/guides/dynamo_deploy/quickstart.md
index fc2881d98a..9f999926cc 100644
--- a/docs/guides/dynamo_deploy/quickstart.md
+++ b/docs/guides/dynamo_deploy/quickstart.md
@@ -187,19 +187,6 @@ We provide a script to uninstall CRDs should you need a clean start.
 
 ## Explore Examples
 
-Pick your deployment destination.
-
-If local
-
-```bash
-export DYNAMO_CLOUD=http://localhost:8080
-```
-
-If kubernetes
-```bash
-export DYNAMO_CLOUD=https://dynamo-cloud.nvidia.com
-```
-
 If deploying to Kubernetes, create a Kubernetes secret containing your sensitive values if needed:
 
 ```bash