
Commit 30e26fc

first commit
Signed-off-by: PeaBrane <yanrpei@gmail.com>
1 parent 3c7c1d6 commit 30e26fc

File tree: 16 files changed (+29 / -29 lines)


README.md

Lines changed: 2 additions & 2 deletions

@@ -120,7 +120,7 @@ Dynamo provides a simple way to spin up a local set of inference components incl
 ```
 # Start an OpenAI compatible HTTP server, a pre-processor (prompt templating and tokenization) and a router.
 # Pass the TLS certificate and key paths to use HTTPS instead of HTTP.
-python -m dynamo.frontend --http-port 8080 [--tls-cert-path cert.pem] [--tls-key-path key.pem]
+python -m dynamo.frontend --http-port 8000 [--tls-cert-path cert.pem] [--tls-key-path key.pem]
 
 # Start the SGLang engine, connecting to NATS and etcd to receive requests. You can run several of these,
 # both for the same model and for multiple models. The frontend node will discover them.

@@ -130,7 +130,7 @@ python -m dynamo.sglang.worker --model deepseek-ai/DeepSeek-R1-Distill-Llama-8B
 #### Send a Request
 
 ```bash
-curl localhost:8080/v1/chat/completions -H "Content-Type: application/json" -d '{
+curl localhost:8000/v1/chat/completions -H "Content-Type: application/json" -d '{
 "model": "deepseek-ai/DeepSeek-R1-Distill-Llama-8B",
 "messages": [
 {
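The `curl` request in the hunk above can also be issued from Python against the new default port 8000. A minimal sketch, assuming a frontend is already listening locally; the helper names (`build_chat_payload`, `send_chat`) are illustrative, not part of Dynamo:

```python
import json
from urllib import request

def build_chat_payload(model: str, prompt: str) -> bytes:
    """Build an OpenAI-style chat completion request body."""
    return json.dumps({
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }).encode()

def send_chat(payload: bytes, port: int = 8000) -> dict:
    """POST the payload to a locally running frontend (requires a live server)."""
    req = request.Request(
        f"http://localhost:{port}/v1/chat/completions",
        data=payload,
        headers={"Content-Type": "application/json"},
    )
    with request.urlopen(req) as resp:
        return json.load(resp)

# Building the payload needs no server; sending it does.
payload = build_chat_payload("deepseek-ai/DeepSeek-R1-Distill-Llama-8B", "Hello")
```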

components/backends/mocker/README.md

Lines changed: 1 addition & 1 deletion

@@ -37,7 +37,7 @@ python -m dynamo.mocker \
   --enable-prefix-caching
 
 # Start frontend server
-python -m dynamo.frontend --http-port 8080
+python -m dynamo.frontend --http-port 8000
 ```
 
 ### Legacy JSON file support:

components/backends/vllm/deepseek-r1.md

Lines changed: 1 addition & 1 deletion

@@ -26,7 +26,7 @@ node 1
 On node 0 (where the frontend was started) send a test request to verify your deployment:
 
 ```bash
-curl localhost:8080/v1/chat/completions \
+curl localhost:8000/v1/chat/completions \
   -H "Content-Type: application/json" \
   -d '{
     "model": "deepseek-ai/DeepSeek-R1",

components/backends/vllm/deploy/README.md

Lines changed: 1 addition & 1 deletion

@@ -197,7 +197,7 @@ See the [vLLM CLI documentation](https://docs.vllm.ai/en/v0.9.2/configuration/se
 Send a test request to verify your deployment:
 
 ```bash
-curl localhost:8080/v1/chat/completions \
+curl localhost:8000/v1/chat/completions \
   -H "Content-Type: application/json" \
   -d '{
     "model": "Qwen/Qwen3-0.6B",

components/frontend/README.md

Lines changed: 1 addition & 1 deletion

@@ -1,6 +1,6 @@
 # Dynamo frontend node.
 
-Usage: `python -m dynamo.frontend [--http-port 8080]`.
+Usage: `python -m dynamo.frontend [--http-port 8000]`.
 
 This runs an OpenAI compliant HTTP server, a pre-processor, and a router in a single process. Engines / workers are auto-discovered when they call `register_llm`.
 

deploy/metrics/README.md

Lines changed: 1 addition & 1 deletion

@@ -23,7 +23,7 @@ graph TD
     PROMETHEUS[Prometheus server :9090] -->|:2379/metrics| ETCD_SERVER[etcd-server :2379, :2380]
     PROMETHEUS -->|:9401/metrics| DCGM_EXPORTER[dcgm-exporter :9401]
     PROMETHEUS -->|:7777/metrics| NATS_PROM_EXP
-    PROMETHEUS -->|:8080/metrics| DYNAMOFE[Dynamo HTTP FE :8080]
+    PROMETHEUS -->|:8000/metrics| DYNAMOFE[Dynamo HTTP FE :8000]
     PROMETHEUS -->|:8081/metrics| DYNAMOBACKEND[Dynamo backend :8081]
     DYNAMOFE --> DYNAMOBACKEND
     GRAFANA -->|:9090/query API| PROMETHEUS

docs/architecture/dynamo_flow.md

Lines changed: 2 additions & 2 deletions

@@ -23,7 +23,7 @@ This diagram shows the NVIDIA Dynamo disaggregated inference system as implement
 The primary user journey through the system:
 
 1. **Discovery (S1)**: Client discovers the service endpoint
-2. **Request (S2)**: HTTP client sends API request to Frontend (OpenAI-compatible server on port 8080)
+2. **Request (S2)**: HTTP client sends API request to Frontend (OpenAI-compatible server on port 8000)
 3. **Validate (S3)**: Frontend forwards request to Processor for validation and routing
 4. **Route (S3)**: Processor routes the validated request to appropriate Decode Worker
 

@@ -84,7 +84,7 @@ graph TD
     %% Top Layer - Client & Frontend
     Client["<b>HTTP Client</b>"]
     S1[["<b>1 DISCOVERY</b>"]]
-    Frontend["<b>Frontend</b><br/><i>OpenAI Compatible Server<br/>Port 8080</i>"]
+    Frontend["<b>Frontend</b><br/><i>OpenAI Compatible Server<br/>Port 8000</i>"]
     S2[["<b>2 REQUEST</b>"]]
 
     %% Processing Layer

docs/components/router/README.md

Lines changed: 2 additions & 2 deletions

@@ -14,12 +14,12 @@ The Dynamo KV Router intelligently routes requests by evaluating their computati
 To launch the Dynamo frontend with the KV Router:
 
 ```bash
-python -m dynamo.frontend --router-mode kv --http-port 8080
+python -m dynamo.frontend --router-mode kv --http-port 8000
 ```
 
 This command:
 - Launches the Dynamo frontend service with KV routing enabled
-- Exposes the service on port 8080 (configurable)
+- Exposes the service on port 8000 (configurable)
 - Automatically handles all backend workers registered to the Dynamo endpoint
 
 Backend workers register themselves using the `register_llm` API, after which the KV Router automatically:

docs/guides/dynamo_deploy/create_deployment.md

Lines changed: 1 addition & 1 deletion

@@ -88,7 +88,7 @@ Here's a template structure based on the examples:
 Consult the corresponding sh file. Each of the python commands to launch a component will go into your yaml spec under the
 `extraPodSpec: -> mainContainer: -> args:`
 
-The front end is launched with "python3 -m dynamo.frontend [--http-port 8080] [--router-mode kv]"
+The front end is launched with "python3 -m dynamo.frontend [--http-port 8000] [--router-mode kv]"
 Each worker will launch `python -m dynamo.YOUR_INFERENCE_BACKEND --model YOUR_MODEL --your-flags `command.
 If you are a Dynamo contributor the [dynamo run guide](../dynamo_run.md) for details on how to run this command.
 

docs/guides/dynamo_run.md

Lines changed: 2 additions & 2 deletions

@@ -72,12 +72,12 @@ You can also list models or send a request:
 
 *List the models*
 ```
-curl localhost:8080/v1/models
+curl localhost:8000/v1/models
 ```
 
 *Send a request*
 ```
-curl -d '{"model": "Llama-3.2-3B-Instruct-Q4_K_M", "max_completion_tokens": 2049, "messages":[{"role":"user", "content": "What is the capital of South Africa?" }]}' -H 'Content-Type: application/json' http://localhost:8080/v1/chat/completions
+curl -d '{"model": "Llama-3.2-3B-Instruct-Q4_K_M", "max_completion_tokens": 2049, "messages":[{"role":"user", "content": "What is the capital of South Africa?" }]}' -H 'Content-Type: application/json' http://localhost:8000/v1/chat/completions
 ```
 
 ## Distributed System
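The `/v1/models` endpoint touched by this hunk returns an OpenAI-style model list. A small sketch of checking the served models on the updated port 8000; the helper names (`list_model_ids`, `fetch_models`) are illustrative, not part of Dynamo:

```python
import json
from urllib import request

def list_model_ids(models_response: dict) -> list:
    """Extract model ids from an OpenAI-style /v1/models response body."""
    return [m["id"] for m in models_response.get("data", [])]

def fetch_models(port: int = 8000) -> dict:
    """GET the model list from a locally running frontend (requires a live server)."""
    with request.urlopen(f"http://localhost:{port}/v1/models") as resp:
        return json.load(resp)

# Shape of the response the helper expects (sample data, not a real server reply):
sample = {"object": "list", "data": [{"id": "Llama-3.2-3B-Instruct-Q4_K_M", "object": "model"}]}
print(list_model_ids(sample))  # -> ['Llama-3.2-3B-Instruct-Q4_K_M']
```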
