Merged
4 changes: 1 addition & 3 deletions components/planner/README.md
@@ -124,6 +124,4 @@ For manual testing, you can use the controller_test.py file to add/remove compon

The Kubernetes backend works by updating the replicas count of the DynamoGraphDeployment custom resource. When the planner determines that workers need to be scaled up or down based on workload metrics, it uses the Kubernetes API to patch the DynamoGraphDeployment resource specification, changing the replicas count for the appropriate worker component. The Kubernetes operator then reconciles this change by creating or terminating the necessary pods. This provides a seamless autoscaling experience in Kubernetes environments without requiring manual intervention.

-The Kubernetes backend will automatically be used by Planner when your pipeline is deployed with `dynamo deployment create`. By default, the planner will run in no-op mode, which means it will monitor metrics but not take scaling actions. To enable actual scaling, you should also specify `--Planner.no-operation=false`.
-
-
+The Kubernetes backend will automatically be used by Planner when your pipeline is deployed using a DynamoGraphDeployment CR. By default, the planner will run in no-op mode, which means it will monitor metrics but not take scaling actions. To enable actual scaling, you should also specify `--Planner.no-operation=false`.
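
Reviewer note: the scaling behaviour this README paragraph describes (patch the replicas count on the DynamoGraphDeployment resource, unless the planner is in no-op mode) can be illustrated with a minimal sketch. It is not the planner's actual implementation. The function name `build_replicas_patch`, the service name `VllmWorker`, and the field path `spec.services.<name>.replicas` are all assumptions made for illustration; check them against the real CRD schema before relying on them.

```python
import json
from typing import Optional


def build_replicas_patch(
    service_name: str, replicas: int, no_operation: bool = True
) -> Optional[dict]:
    """Build a merge-patch body that sets the replica count for one service.

    Returns None in no-op mode, mirroring the documented default in which
    the planner monitors metrics but takes no scaling action.
    """
    if replicas < 0:
        raise ValueError("replicas must be non-negative")
    if no_operation:
        return None
    # Hypothetical field path; the real CRD layout may differ.
    return {"spec": {"services": {service_name: {"replicas": replicas}}}}


if __name__ == "__main__":
    # "VllmWorker" is a placeholder service name, not taken from the PR.
    patch = build_replicas_patch("VllmWorker", 3, no_operation=False)
    print(json.dumps(patch))
```

A body like this could then be sent with the Kubernetes Python client's `CustomObjectsApi.patch_namespaced_custom_object`, after which the operator would reconcile the pods as the paragraph describes. The group, version, and plural for the custom resource are not stated in this diff and would need to be looked up.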
4 changes: 2 additions & 2 deletions deploy/inference-gateway/example/README.md
@@ -32,11 +32,11 @@ export DYNAMO_TAG=$(dynamo build graphs.agg:Frontend | grep "Successfully built"
```bash
# Deploy first graph
export DEPLOYMENT_NAME=llm-agg1
-dynamo deployment create $DYNAMO_TAG -n $DEPLOYMENT_NAME -f ./configs/agg.yaml
+# TODO: Deploy your service using a DynamoGraphDeployment CR.

# Deploy second graph
export DEPLOYMENT_NAME=llm-agg2
-dynamo deployment create $DYNAMO_TAG -n $DEPLOYMENT_NAME -f ./configs/agg.yaml
+# TODO: Deploy your service using a DynamoGraphDeployment CR.
```

3. **Deploy Inference Gateway**
4 changes: 0 additions & 4 deletions deploy/sdk/src/dynamo/sdk/cli/cli.py
@@ -23,8 +23,6 @@
from rich.console import Console

from dynamo.sdk.cli.build import build
-from dynamo.sdk.cli.deployment import app as deployment_app
-from dynamo.sdk.cli.deployment import deploy
from dynamo.sdk.cli.env import env
from dynamo.sdk.cli.run import run
from dynamo.sdk.cli.serve import serve
@@ -76,8 +74,6 @@ def main(
context_settings={"allow_extra_args": True, "ignore_unknown_options": True},
add_help_option=False,
)(run)
-cli.add_typer(deployment_app, name="deployment")
-cli.command()(deploy)
cli.command()(build)

if __name__ == "__main__":