feat: adding gpt 5.1 to stagehand #1264

Kylejeong2 · 2025-11-13T21:11:13Z

why

We want to benchmark and use the state-of-the-art (SOTA) models in Stagehand.

what changed

Added GPT-5.1 to the available models. Reasoning effort is set to "low" for GPT-5.1 specifically, since it cannot be set to "minimal".

test plan

changeset-bot · 2025-11-13T21:11:18Z

🦋 Changeset detected

Latest commit: b0b1087

The changes in this PR will be included in the next version bump.

This PR includes changesets to release 2 packages

Name	Type
@browserbasehq/stagehand	Patch
@browserbasehq/stagehand-evals	Patch


packages/core/lib/v3/llm/LLMProvider.ts

packages/core/lib/v3/types/public/model.ts

miguelg719 · 2025-11-13T21:13:54Z

packages/core/lib/v3/llm/aisdk.ts

                openai: {
                  textVerbosity: "low", // Making these the default for gpt-5 for now
-                  reasoningEffort: "minimal",
+                  reasoningEffort: isGPT51 ? "low" : "minimal",


Any reason to make 5.1 "low" vs. 5 "minimal"? Is it faster?

It's a weird syntax change: you have to set it to "low" because the model doesn't accept "minimal" as a param.

@miguelg719 apparently you can also set the reasoning to "none":

https://platform.openai.com/docs/guides/latest-model

greptile-apps · 2025-11-13T21:14:04Z

Greptile Overview

Greptile Summary

This PR adds support for GPT-5.1 (gpt-5.1-2025-11-13) to Stagehand, enabling benchmarking and usage of OpenAI's latest model.

Key Changes:

Added gpt-5.1-2025-11-13 to available model types and provider mappings
Configured GPT-5.1 to use reasoningEffort: "low" instead of "minimal" (which it doesn't support)
Updated both core AI SDK client and evaluation wrapper with consistent logic
Added model to evaluation suite for testing

Implementation:
The implementation correctly handles GPT-5.1 as a special case within the GPT-5 family by checking if the model ID contains "gpt-5.1" after confirming it's a GPT-5 model, then setting the appropriate reasoning effort parameter.
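The check described above can be sketched roughly as follows. This is a minimal illustration, not the code from `aisdk.ts`: the helper name `openAIOptionsFor` and the `OpenAIProviderOptions` interface are hypothetical, and the PR appears to inline this logic rather than wrap it in a function. Only the `isGPT5` / `isGPT51` checks and the two option values are taken from the diff and summary.

```typescript
// Hypothetical types for illustration; the real code passes these values
// through the AI SDK's providerOptions.
type ReasoningEffort = "minimal" | "low";

interface OpenAIProviderOptions {
  textVerbosity: "low";
  reasoningEffort: ReasoningEffort;
}

// Hypothetical helper: returns the OpenAI-specific options for GPT-5 family
// models, or undefined for any other model.
function openAIOptionsFor(modelId: string): OpenAIProviderOptions | undefined {
  const isGPT5 = modelId.includes("gpt-5");
  if (!isGPT5) return undefined;

  // "gpt-5.1" also contains "gpt-5", so it is checked inside the GPT-5 branch.
  const isGPT51 = modelId.includes("gpt-5.1");

  return {
    textVerbosity: "low",
    // Per the PR, GPT-5.1 does not accept "minimal", so it gets "low".
    reasoningEffort: isGPT51 ? "low" : "minimal",
  };
}
```

Because the match is a substring check, any model ID containing `gpt-5.1` would take the "low" branch, not only `gpt-5.1-2025-11-13`.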

Confidence Score: 5/5

This PR is safe to merge - straightforward model addition with appropriate configuration
The changes are minimal, focused, and consistent across all necessary files. The logic for detecting GPT-5.1 and setting the correct reasoning effort parameter is sound. All model registration points have been updated (type definitions, provider mappings, and eval configs). The implementation mirrors existing patterns for other GPT-5 models.
No files require special attention

Important Files Changed

File Analysis

Filename	Score	Overview
packages/core/lib/v3/llm/aisdk.ts	5/5	Added GPT-5.1 detection and sets `reasoningEffort` to "low" for GPT-5.1 (vs "minimal" for other GPT-5 models)
packages/core/lib/v3/types/public/model.ts	5/5	Added `gpt-5.1-2025-11-13` to the `AvailableModel` type union
packages/evals/lib/AISdkClientWrapped.ts	5/5	Mirrored the changes in aisdk.ts: added GPT-5.1 detection and reasoning effort configuration

Sequence Diagram

sequenceDiagram
    participant Client as Client Code
    participant LLMProvider as LLMProvider
    participant AISdkClient as AISdkClient/Wrapped
    participant AISDK as Vercel AI SDK

    Client->>LLMProvider: Request with model: "gpt-5.1-2025-11-13"
    LLMProvider->>LLMProvider: Check modelToProviderMap
    LLMProvider->>LLMProvider: Maps to "openai" provider
    LLMProvider->>AISdkClient: Initialize with model
    
    Client->>AISdkClient: createChatCompletion()
    AISdkClient->>AISdkClient: Check isGPT5 = includes("gpt-5") → true
    AISdkClient->>AISdkClient: Check isGPT51 = includes("gpt-5.1") → true
    
    alt isGPT5 is true
        AISdkClient->>AISdkClient: Set providerOptions
        AISdkClient->>AISdkClient: textVerbosity: "low"
        alt isGPT51 is true
            AISdkClient->>AISdkClient: reasoningEffort: "low"
        else other GPT-5 models
            AISdkClient->>AISdkClient: reasoningEffort: "minimal"
        end
    end
    
    AISdkClient->>AISDK: generateObject() with providerOptions
    AISDK->>Client: Return response
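The provider lookup in the first half of the diagram might look something like the sketch below. The name `modelToProviderMap` comes from the diagram; the `providerFor` function, the error message, and the map's exact shape are assumptions made for illustration and may not match `LLMProvider.ts`.

```typescript
// Assumed shape: a plain lookup table from model name to provider name.
// Only the entry added by this PR is shown.
const modelToProviderMap: Record<string, string> = {
  "gpt-5.1-2025-11-13": "openai",
};

// Hypothetical helper: resolves a model name to its provider, failing
// loudly for models that have not been registered.
function providerFor(modelName: string): string {
  const provider = modelToProviderMap[modelName];
  if (!provider) {
    throw new Error(`Unsupported model: ${modelName}`);
  }
  return provider;
}
```

Under this assumption, a new model needs an entry in the map as well as in the `AvailableModel` type union, which matches the files the PR touches.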

greptile-apps

5 files reviewed, no comments


packages/core/lib/v3/types/public/model.ts

packages/evals/taskConfig.ts

Co-authored-by: Miguel <36487034+miguelg719@users.noreply.github.com>
