From d83b2430974113fb604e3d2d9f9d02d780ba174a Mon Sep 17 00:00:00 2001
From: Elieser Pereira
Date: Fri, 3 Oct 2025 12:27:10 -0300
Subject: [PATCH] Update production-stack.md

update api versions used to test

Signed-off-by: Elieser Pereira
---
 docs/deployment/integrations/production-stack.md | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/docs/deployment/integrations/production-stack.md b/docs/deployment/integrations/production-stack.md
index fae392589c06..2f1894ccf002 100644
--- a/docs/deployment/integrations/production-stack.md
+++ b/docs/deployment/integrations/production-stack.md
@@ -55,7 +55,7 @@ sudo kubectl port-forward svc/vllm-router-service 30080:80
 And then you can send out a query to the OpenAI-compatible API to check the available models:
 
 ```bash
-curl -o- http://localhost:30080/models
+curl -o- http://localhost:30080/v1/models
 ```
 
 ??? console "Output"
@@ -78,7 +78,7 @@ curl -o- http://localhost:30080/models
 To send an actual chatting request, you can issue a curl request to the OpenAI `/completion` endpoint:
 
 ```bash
-curl -X POST http://localhost:30080/completions \
+curl -X POST http://localhost:30080/v1/completions \
   -H "Content-Type: application/json" \
   -d '{
     "model": "facebook/opt-125m",