From d83b2430974113fb604e3d2d9f9d02d780ba174a Mon Sep 17 00:00:00 2001
From: Elieser Pereira
Date: Fri, 3 Oct 2025 12:27:10 -0300
Subject: [PATCH] Update production-stack.md

update api versions used to test

Signed-off-by: Elieser Pereira
---
 docs/deployment/integrations/production-stack.md | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/docs/deployment/integrations/production-stack.md b/docs/deployment/integrations/production-stack.md
index fae392589c06..2f1894ccf002 100644
--- a/docs/deployment/integrations/production-stack.md
+++ b/docs/deployment/integrations/production-stack.md
@@ -55,7 +55,7 @@ sudo kubectl port-forward svc/vllm-router-service 30080:80
 And then you can send out a query to the OpenAI-compatible API to check the available models:
 
 ```bash
-curl -o- http://localhost:30080/models
+curl -o- http://localhost:30080/v1/models
 ```
 
 ??? console "Output"
@@ -78,7 +78,7 @@ curl -o- http://localhost:30080/models
 To send an actual chatting request, you can issue a curl request to the OpenAI `/completion` endpoint:
 
 ```bash
-curl -X POST http://localhost:30080/completions \
+curl -X POST http://localhost:30080/v1/completions \
   -H "Content-Type: application/json" \
   -d '{
     "model": "facebook/opt-125m",