Commit e894e36
feat: add OpenAI-compatible Bedrock provider (#3748)
Implements an AWS Bedrock inference provider using the OpenAI-compatible
endpoint for Llama models available through Bedrock.
Closes: #3410
## What does this PR do?
Adds AWS Bedrock as an inference provider using the OpenAI-compatible
endpoint. This lets us use Bedrock models (GPT-OSS, Llama) through the
standard llama-stack inference API.
The implementation uses LiteLLM's OpenAI client under the hood, so it
gets all the OpenAI compatibility features. The provider handles
per-request API key overrides via headers.
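The per-request override can be sketched from the client side as follows. The header name and the `aws_bedrock_api_key` field are taken from Test 9 below; the helper name `provider_data_header` and the example key are hypothetical and only illustrate the shape of the header.

```python
import json


def provider_data_header(api_key: str) -> dict[str, str]:
    """Build the per-request override header (hypothetical helper).

    The header value is a JSON object serialized to a string, matching
    the requests shown in Test 9.
    """
    return {
        "x-llamastack-provider-data": json.dumps({"aws_bedrock_api_key": api_key})
    }


# Placeholder key for illustration only; not a real Bedrock key.
headers = provider_data_header("example-key")
```

The resulting dictionary can be merged into the headers of any HTTP client call that targets the inference API.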
## Test Plan
**Tested the following scenarios:**
- Non-streaming completion - basic request/response flow
- Streaming completion - SSE streaming with chunked responses
- Multi-turn conversations - context retention across turns
- Tool calling - function calling with proper tool_calls format
# Bedrock OpenAI-Compatible Provider - Test Results
**Model:** `bedrock-inference/openai.gpt-oss-20b-1:0`
---
## Test 1: Model Listing
**Request:**
```http
GET /v1/models HTTP/1.1
```
**Response:**
```http
HTTP/1.1 200 OK
Content-Type: application/json

{
  "data": [
    {"identifier": "bedrock-inference/openai.gpt-oss-20b-1:0", ...},
    {"identifier": "bedrock-inference/openai.gpt-oss-40b-1:0", ...}
  ]
}
```
---
## Test 2: Non-Streaming Completion
**Request:**
```http
POST /v1/chat/completions HTTP/1.1
Content-Type: application/json

{
  "model": "bedrock-inference/openai.gpt-oss-20b-1:0",
  "messages": [{"role": "user", "content": "Say 'Hello from Bedrock' and nothing else"}],
  "stream": false
}
```
**Response:**
```http
HTTP/1.1 200 OK
Content-Type: application/json

{
  "choices": [{
    "finish_reason": "stop",
    "message": {"content": "...Hello from Bedrock"}
  }],
  "usage": {"prompt_tokens": 79, "completion_tokens": 50, "total_tokens": 129}
}
```
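The request above could be built from Python roughly as follows. This is a minimal sketch using only the standard library; the base URL is an assumption (substitute the address of your own llama-stack server), and `build_chat_request` is a hypothetical helper, not part of this PR.

```python
import json
import urllib.request

# Assumption: address of a running llama-stack server; adjust as needed.
BASE_URL = "http://localhost:8321"


def build_chat_request(model: str, user_text: str, stream: bool = False) -> urllib.request.Request:
    """Build (but do not send) a chat completion request like the one in Test 2."""
    body = {
        "model": model,
        "messages": [{"role": "user", "content": user_text}],
        "stream": stream,
    }
    return urllib.request.Request(
        f"{BASE_URL}/v1/chat/completions",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_chat_request(
    "bedrock-inference/openai.gpt-oss-20b-1:0",
    "Say 'Hello from Bedrock' and nothing else",
)
# To send it, pass `request` to urllib.request.urlopen() and decode the JSON body.
```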
---
## Test 3: Streaming Completion
**Request:**
```http
POST /v1/chat/completions HTTP/1.1
Content-Type: application/json

{
  "model": "bedrock-inference/openai.gpt-oss-20b-1:0",
  "messages": [{"role": "user", "content": "Count from 1 to 5"}],
  "stream": true
}
```
**Response:**
```http
HTTP/1.1 200 OK
Content-Type: text/event-stream

[6 SSE chunks received]
Final content: "1, 2, 3, 4, 5"
```
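A client can reassemble the final content from the SSE chunks along these lines. This sketch assumes the usual OpenAI-style stream, where each event is a `data: <json>` line carrying `choices[].delta.content` and the stream ends with `data: [DONE]`. The sample lines are made up for illustration and are not the six chunks captured in the test run.

```python
import json


def assemble_stream(lines):
    """Concatenate delta.content pieces from OpenAI-style SSE lines."""
    parts = []
    for line in lines:
        if not line.startswith("data:"):
            continue  # skip blank keep-alive lines and other SSE fields
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            break  # end-of-stream sentinel
        chunk = json.loads(payload)
        for choice in chunk.get("choices", []):
            piece = choice.get("delta", {}).get("content")
            if piece:
                parts.append(piece)
    return "".join(parts)


# Illustrative chunks only (assumed shape, not the captured output).
sample_lines = [
    'data: {"choices": [{"delta": {"role": "assistant"}}]}',
    'data: {"choices": [{"delta": {"content": "1, 2, "}}]}',
    "",
    'data: {"choices": [{"delta": {"content": "3, 4, 5"}}]}',
    "data: [DONE]",
]
assembled = assemble_stream(sample_lines)
```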
---
## Test 4: Error Handling - Invalid Model
**Request:**
```http
POST /v1/chat/completions HTTP/1.1
Content-Type: application/json

{
  "model": "invalid-model-id",
  "messages": [{"role": "user", "content": "Hello"}],
  "stream": false
}
```
**Response:**
```http
HTTP/1.1 404 Not Found
Content-Type: application/json

{
  "detail": "Model 'invalid-model-id' not found. Use 'client.models.list()' to list available Models."
}
```
---
## Test 5: Multi-Turn Conversation
**Request 1:**
```http
POST /v1/chat/completions HTTP/1.1

{
  "messages": [{"role": "user", "content": "My name is Alice"}]
}
```
**Response 1:**
```http
HTTP/1.1 200 OK

{
  "choices": [{
    "message": {"content": "...Nice to meet you, Alice! How can I help you today?"}
  }]
}
```
**Request 2 (with history):**
```http
POST /v1/chat/completions HTTP/1.1

{
  "messages": [
    {"role": "user", "content": "My name is Alice"},
    {"role": "assistant", "content": "...Nice to meet you, Alice!..."},
    {"role": "user", "content": "What is my name?"}
  ]
}
```
**Response 2:**
```http
HTTP/1.1 200 OK

{
  "choices": [{
    "message": {"content": "...Your name is Alice."}
  }],
  "usage": {"prompt_tokens": 183, "completion_tokens": 42}
}
```
**Context retained across turns**
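Because the chat completions endpoint is stateless, context is retained only when the client resends the earlier turns, as Request 2 does. A minimal sketch of that bookkeeping, with `append_turn` as a hypothetical helper and the assistant text copied as elided in the log above:

```python
def append_turn(history, role, content):
    """Return a new history list with one more message appended."""
    return history + [{"role": role, "content": content}]


history = []
history = append_turn(history, "user", "My name is Alice")
# The assistant reply from Response 1 is fed back verbatim (elided here as in the log).
history = append_turn(history, "assistant", "...Nice to meet you, Alice!...")
history = append_turn(history, "user", "What is my name?")

payload = {
    "model": "bedrock-inference/openai.gpt-oss-20b-1:0",
    "messages": history,
}
```

Returning a new list instead of mutating in place keeps earlier snapshots of the conversation usable, which can help when replaying a request.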
---
## Test 6: System Messages
**Request:**
```http
POST /v1/chat/completions HTTP/1.1

{
  "messages": [
    {"role": "system", "content": "You are Shakespeare. Respond only in Shakespearean English."},
    {"role": "user", "content": "Tell me about the weather"}
  ]
}
```
**Response:**
```http
HTTP/1.1 200 OK

{
  "choices": [{
    "message": {"content": "Lo! I heed thy request..."}
  }],
  "usage": {"completion_tokens": 813}
}
```
---
## Test 7: Tool Calling
**Request:**
```http
POST /v1/chat/completions HTTP/1.1

{
  "messages": [{"role": "user", "content": "What's the weather in San Francisco?"}],
  "tools": [{
    "type": "function",
    "function": {
      "name": "get_weather",
      "parameters": {"type": "object", "properties": {"location": {"type": "string"}}}
    }
  }]
}
```
**Response:**
```http
HTTP/1.1 200 OK

{
  "choices": [{
    "finish_reason": "tool_calls",
    "message": {
      "tool_calls": [{
        "function": {"name": "get_weather", "arguments": "{\"location\":\"San Francisco\"}"}
      }]
    }
  }]
}
```
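Note that `arguments` arrives as a JSON-encoded string, so a client has to decode it before dispatching the call. A small sketch using the response shown above; `extract_tool_calls` is a hypothetical helper name:

```python
import json

# Response body copied from the test output above.
response = {
    "choices": [{
        "finish_reason": "tool_calls",
        "message": {
            "tool_calls": [{
                "function": {
                    "name": "get_weather",
                    "arguments": "{\"location\":\"San Francisco\"}",
                }
            }]
        },
    }]
}


def extract_tool_calls(resp):
    """Return (function_name, decoded_arguments) pairs from a chat completion."""
    calls = []
    for choice in resp.get("choices", []):
        for call in choice.get("message", {}).get("tool_calls") or []:
            fn = call["function"]
            # `arguments` is a JSON string, not a nested object.
            calls.append((fn["name"], json.loads(fn["arguments"])))
    return calls


tool_calls = extract_tool_calls(response)
```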
---
## Test 8: Sampling Parameters
**Request:**
```http
POST /v1/chat/completions HTTP/1.1

{
  "messages": [{"role": "user", "content": "Say hello"}],
  "temperature": 0.7,
  "top_p": 0.9
}
```
**Response:**
```http
HTTP/1.1 200 OK

{
  "choices": [{
    "message": {"content": "...Hello! 👋 How can I help you today?"}
  }]
}
```
---
## Test 9: Authentication Error Handling
### Subtest A: Invalid API Key
**Request:**
```http
POST /v1/chat/completions HTTP/1.1
x-llamastack-provider-data: {"aws_bedrock_api_key": "invalid-fake-key-12345"}

{"model": "bedrock-inference/openai.gpt-oss-20b-1:0", ...}
```
**Response:**
```http
HTTP/1.1 400 Bad Request

{
  "detail": "Invalid value: Authentication failed: Error code: 401 - {'error': {'message': 'Invalid API Key format: Must start with pre-defined prefix', ...}}"
}
```
---
### Subtest B: Empty API Key (Fallback to Config)
**Request:**
```http
POST /v1/chat/completions HTTP/1.1
x-llamastack-provider-data: {"aws_bedrock_api_key": ""}

{"model": "bedrock-inference/openai.gpt-oss-20b-1:0", ...}
```
**Response:**
```http
HTTP/1.1 200 OK

{
  "choices": [{
    "message": {"content": "...Hello! How can I assist you today?"}
  }]
}
```
**Fell back to config key**
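The fallback observed here can be pictured as a simple precedence rule: a non-empty key from the request header wins, otherwise the configured key is used. The function below is an illustrative sketch of that rule only; the name `resolve_api_key` and its error message are assumptions, and it is not the provider's actual code.

```python
def resolve_api_key(header_key, config_key):
    """Pick the API key for a request (hypothetical precedence sketch).

    An empty or missing header key is treated as "not provided",
    matching the behavior seen in Subtest B.
    """
    if header_key:
        return header_key
    if config_key:
        return config_key
    raise ValueError("No Bedrock API key provided in request header or config")


# Placeholder values for illustration only.
from_header = resolve_api_key("key-from-header", "key-from-config")
from_config = resolve_api_key("", "key-from-config")
```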
---
### Subtest C: Malformed Token
**Request:**
```http
POST /v1/chat/completions HTTP/1.1
x-llamastack-provider-data: {"aws_bedrock_api_key": "not-a-valid-bedrock-token-format"}

{"model": "bedrock-inference/openai.gpt-oss-20b-1:0", ...}
```
**Response:**
```http
HTTP/1.1 400 Bad Request

{
  "detail": "Invalid value: Authentication failed: Error code: 401 - {'error': {'message': 'Invalid API Key format: Must start with pre-defined prefix', ...}}"
}
```
1 parent a2c4c12 commit e894e36
File tree
15 files changed: +307 -188 lines changed
- docs/docs/providers/inference
- src/llama_stack
  - core/routers
  - distributions
    - ci-tests
    - starter-gpu
    - starter
  - providers
    - registry
    - remote/inference/bedrock
- tests/unit/providers
  - inference