Cerebras provides high-speed inference with competitive pricing for Llama models. Their infrastructure is optimized for fast token generation, making them ideal for development and high-throughput automation tasks.
### Available Models

| Model | Size | Best For | Speed |
|-------|------|----------|-------|
| `cerebras-llama-3.3-70b` | 70B parameters | Complex reasoning, production | Fast |
| `cerebras-llama-3.1-8b` | 8B parameters | Development, simple tasks | Very Fast |
| `cerebras-qwen-3-32b` | 32B parameters | Balanced performance, general use | Fast |
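The table above maps naturally to a small selection helper. This is an illustrative sketch, not part of the Stagehand API — the `pickCerebrasModel` function and the workload categories are hypothetical names chosen for this example:

```typescript
// Hypothetical helper: choose a Cerebras model name based on workload,
// following the guidance in the table above.
type Workload = "development" | "general" | "production";

function pickCerebrasModel(workload: Workload): string {
  switch (workload) {
    case "development":
      return "cerebras-llama-3.1-8b"; // 8B: very fast, simple tasks
    case "general":
      return "cerebras-qwen-3-32b"; // 32B: balanced performance
    case "production":
      return "cerebras-llama-3.3-70b"; // 70B: complex reasoning
  }
}

console.log(pickCerebrasModel("production")); // "cerebras-llama-3.3-70b"
```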
4. **Configure Client**: Use the `CerebrasClient` for optimal performance
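A minimal configuration sketch for the step above, assuming the `modelName` and `modelClientOptions` constructor options — consult the Stagehand reference for the exact option names in your version:

```typescript
import { Stagehand } from "@browserbasehq/stagehand";

// Sketch only: assumes `modelName` / `modelClientOptions` are the
// supported constructor options for selecting a Cerebras model.
const stagehand = new Stagehand({
  env: "LOCAL",
  modelName: "cerebras-llama-3.3-70b",
  modelClientOptions: {
    apiKey: process.env.CEREBRAS_API_KEY,
  },
});

await stagehand.init();
```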
## Custom LLM Integration
<Note>
For each provider, use their latest models that meet these requirements. Some examples:
- **OpenAI**: GPT-4 series or newer
- **Anthropic**: Claude 3 series or newer
- **Google**: Gemini 2 series or newer
- **Cerebras**: Llama 3.1+ series (both 8B and 70B models supported)
- **Other providers**: Latest models with structured output support
**Note**: Avoid base language models without structured output capabilities or fine-tuning for instruction following. When in doubt, check our [Model Evaluation](https://www.stagehand.dev/evals) page for up-to-date recommendations.