Cerebras provides high-speed inference with competitive pricing for Llama models. Their infrastructure is optimized for fast token generation, making them ideal for development and high-throughput automation tasks.
### Available Models

| Model | Size | Best For | Speed |
|-------|------|----------|-------|
| `cerebras-llama-3.3-70b` | 70B parameters | Complex reasoning, production | Fast |
| `cerebras-llama-3.1-8b` | 8B parameters | Development, simple tasks | Very Fast |
| `cerebras-qwen-3-32b` | 32B parameters | Balanced performance, general use | Fast |
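The trade-offs in the table can be encoded in a small helper. This is an illustrative sketch, not part of Stagehand: the `pickCerebrasModel` function and its `Task` categories are hypothetical names that simply mirror the "Best For" column above.

```typescript
// Hypothetical helper mirroring the model table above (not a Stagehand API).
type Task = "complex" | "simple" | "general";

function pickCerebrasModel(task: Task): string {
  switch (task) {
    case "complex":
      return "cerebras-llama-3.3-70b"; // complex reasoning, production
    case "simple":
      return "cerebras-llama-3.1-8b"; // development, simple tasks
    case "general":
      return "cerebras-qwen-3-32b"; // balanced performance, general use
  }
}
```

Keeping the mapping in one place makes it easy to downgrade to the faster 8B model in development and switch to 70B in production.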
4. **Configure Client**: Use the `CerebrasClient` for optimal performance
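As a rough configuration sketch, wiring a custom client into Stagehand might look like the following. The exact constructor options of `CerebrasClient` are an assumption here; check the Stagehand documentation for the current shape.

```typescript
// Configuration sketch only — CerebrasClient's options are assumed, not verified.
import { Stagehand } from "@browserbasehq/stagehand";
import { CerebrasClient } from "./cerebras_client"; // hypothetical local client module

const stagehand = new Stagehand({
  env: "LOCAL",
  llmClient: new CerebrasClient({
    modelName: "cerebras-llama-3.3-70b",       // from the table above
    apiKey: process.env.CEREBRAS_API_KEY,      // never hard-code keys
  }),
});
await stagehand.init();
```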
## Custom LLM Integration
<Note>
For each provider, use their latest models that meet these requirements. Some examples:
- **OpenAI**: GPT-4 series or newer
- **Anthropic**: Claude 3 series or newer
- **Google**: Gemini 2 series or newer
- **Cerebras**: Llama 3.1+ series (both 8B and 70B models supported)
- **Other providers**: Latest models with structured output support
**Note**: Avoid base language models without structured output capabilities or fine-tuning for instruction following. When in doubt, check our [Model Evaluation](https://www.stagehand.dev/evals) page for up-to-date recommendations.
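To see why structured output matters, consider what happens downstream: the model's response must parse as JSON that matches an expected shape. The sketch below is illustrative only (the `ExtractedItem` shape and guard function are hypothetical, not Stagehand code); a base model without structured output support frequently fails this kind of check.

```typescript
// Illustrative only: a minimal type guard for a model's JSON response.
// ExtractedItem is a hypothetical schema, not part of any library.
interface ExtractedItem {
  title: string;
  price: number;
}

function isExtractedItem(value: unknown): value is ExtractedItem {
  if (typeof value !== "object" || value === null) return false;
  const v = value as Record<string, unknown>;
  return typeof v.title === "string" && typeof v.price === "number";
}

// A well-formed structured response passes the guard; free-form text would not.
const raw = '{"title": "Widget", "price": 9.99}';
const parsed: unknown = JSON.parse(raw);
console.log(isExtractedItem(parsed)); // true
```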