Skip to content

Commit 0d9f959

Browse files
committed
feat: Python frontend / ingress node
Before: `dynamo-run in=http out=dyn [flags]` After: `python -m dynamo.ingress [flags]` No need to build or install the dynamo-run Rust binary. This will have the same performance as dynamo-run, it uses the Rust library. Note we want to use `dynamo.frontend` but the older examples still use that namespace. As soon as they are removed we can rename this.
1 parent 6cdda03 commit 0d9f959

File tree

5 files changed

+80
-1
lines changed

5 files changed

+80
-1
lines changed

components/ingress/README

Lines changed: 9 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,9 @@
1+
# Dynamo ingress / frontend node.
2+
3+
Usage: `python -m dynamo.ingress [--http-port <port>]`. Port defaults to 8080.
4+
5+
This runs an OpenAI compliant HTTP server, a pre-processor, and a router in a single process. Engines / workers are auto-discovered when they call `register_llm`.
6+
7+
Requires `etcd` and `nats-server -js`.
8+
9+
This is the same as `dynamo-run in=http out=dyn`.

components/ingress/src/dynamo/ingress/__init__.py

Whitespace-only changes.
Lines changed: 7 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,7 @@
1+
# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
2+
# SPDX-License-Identifier: Apache-2.0
3+
4+
from dynamo.ingress.main import main
5+
6+
if __name__ == "__main__":
7+
main()
Lines changed: 63 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,63 @@
1+
# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
2+
# SPDX-License-Identifier: Apache-2.0
3+
4+
# Usage: `python -m dynamo.ingress [args]`
5+
#
6+
# Start a frontend node. This runs:
7+
# - OpenAI HTTP server.
8+
# - Auto-discovery: Watches etcd for engine/worker registration (via `register_llm`).
9+
# - Pre-processor: Prompt templating and tokenization.
10+
# - Router, defaulting to round-robin (TODO: Add flags to enable KV routing).
11+
12+
import argparse
13+
import asyncio
14+
15+
import uvloop
16+
17+
from dynamo.llm import EngineType, EntrypointArgs, make_engine, run_input
18+
from dynamo.runtime import DistributedRuntime
19+
20+
21+
def parse_args():
22+
parser = argparse.ArgumentParser(
23+
description="Dynamo Frontend: HTTP+Pre-processor+Router",
24+
formatter_class=argparse.RawTextHelpFormatter, # To preserve multi-line help formatting
25+
)
26+
parser.add_argument(
27+
"--kv-cache-block-size", type=int, help="KV cache block size (u32)."
28+
)
29+
parser.add_argument(
30+
"--http-port", type=int, default=8080, help="HTTP port for the engine (u16)."
31+
)
32+
flags = parser.parse_args()
33+
34+
kwargs = {}
35+
if flags.http_port is not None:
36+
kwargs["http_port"] = flags.http_port
37+
if flags.kv_cache_block_size is not None:
38+
kwargs["kv_cache_block_size"] = flags.kv_cache_block_size
39+
40+
return kwargs
41+
42+
43+
async def async_main():
44+
runtime = DistributedRuntime(asyncio.get_running_loop(), False)
45+
flags = parse_args()
46+
47+
# out=dyn
48+
e = EntrypointArgs(EngineType.Dynamic, **flags)
49+
engine = await make_engine(runtime, e)
50+
51+
# in=http
52+
try:
53+
await run_input(runtime, "http", engine)
54+
except asyncio.exceptions.CancelledError:
55+
pass
56+
57+
58+
def main():
59+
uvloop.run(async_main())
60+
61+
62+
if __name__ == "__main__":
63+
main()

pyproject.toml

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -81,7 +81,7 @@ requires = ["hatchling"]
8181
build-backend = "hatchling.build"
8282

8383
[tool.hatch.build.targets.wheel]
84-
packages = ["deploy/sdk/src/dynamo", "components/planner/src/dynamo"]
84+
packages = ["deploy/sdk/src/dynamo", "components/planner/src/dynamo", "components/ingress/src/dynamo"]
8585

8686
# This section is for including the binaries in the wheel package
8787
# but doesn't make them executable scripts in the venv bin directory

0 commit comments

Comments
 (0)