@Kylejeong2
Member

why

We want to benchmark and use the SOTA models in Stagehand.

what changed

Added GPT-5.1 to the available models. Reasoning effort is set to "low" for GPT-5.1 specifically, since it does not accept "minimal".

test plan

@changeset-bot

changeset-bot bot commented Nov 13, 2025

🦋 Changeset detected

Latest commit: b0b1087

The changes in this PR will be included in the next version bump.

This PR includes changesets to release 2 packages
| Name | Type |
| --- | --- |
| @browserbasehq/stagehand | Patch |
| @browserbasehq/stagehand-evals | Patch |

  openai: {
    textVerbosity: "low", // Making these the default for gpt-5 for now
-   reasoningEffort: "minimal",
+   reasoningEffort: isGPT51 ? "low" : "minimal",
Collaborator

any reason to make 5.1 low vs 5 minimal? is it faster?

Member Author

it's a weird syntax change: you have to set it to low because the model doesn't accept minimal as a param

Member Author

@miguelg719

can also apparently set the reasoning to none

https://platform.openai.com/docs/guides/latest-model

@greptile-apps
Contributor

greptile-apps bot commented Nov 13, 2025

Greptile Overview

Greptile Summary

This PR adds support for GPT-5.1 (gpt-5.1-2025-11-13) to Stagehand, enabling benchmarking and usage of OpenAI's latest model.

Key Changes:

  • Added gpt-5.1-2025-11-13 to available model types and provider mappings
  • Configured GPT-5.1 to use reasoningEffort: "low" instead of "minimal" (which it doesn't support)
  • Updated both core AI SDK client and evaluation wrapper with consistent logic
  • Added model to evaluation suite for testing

Implementation:
The implementation correctly handles GPT-5.1 as a special case within the GPT-5 family by checking if the model ID contains "gpt-5.1" after confirming it's a GPT-5 model, then setting the appropriate reasoning effort parameter.
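The check described above can be sketched roughly as follows. This is a minimal illustration, not the code in aisdk.ts: the helper name `buildProviderOptions` and the `OpenAIProviderOptions` interface are hypothetical, and only the `isGPT5` / `isGPT51` checks and the two option values come from the PR.

```typescript
// Hypothetical helper for illustration; in the PR this logic sits inline
// in the client and may be structured differently.
type ReasoningEffort = "minimal" | "low";

interface OpenAIProviderOptions {
  openai: {
    textVerbosity: "low";
    reasoningEffort: ReasoningEffort;
  };
}

function buildProviderOptions(
  modelId: string,
): OpenAIProviderOptions | undefined {
  // Same substring checks as shown in the sequence diagram.
  const isGPT5 = modelId.includes("gpt-5");
  const isGPT51 = modelId.includes("gpt-5.1");

  // Non-GPT-5 models get no OpenAI-specific overrides in this sketch.
  if (!isGPT5) return undefined;

  return {
    openai: {
      textVerbosity: "low",
      // GPT-5.1 does not accept "minimal", so it falls back to "low".
      reasoningEffort: isGPT51 ? "low" : "minimal",
    },
  };
}
```

Because "gpt-5.1" contains "gpt-5", the GPT-5.1 check only needs to refine the effort value inside the GPT-5 branch rather than introduce a separate branch.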

Confidence Score: 5/5

  • This PR is safe to merge - straightforward model addition with appropriate configuration
  • The changes are minimal, focused, and consistent across all necessary files. The logic for detecting GPT-5.1 and setting the correct reasoning effort parameter is sound. All model registration points have been updated (type definitions, provider mappings, and eval configs). The implementation mirrors existing patterns for other GPT-5 models.
  • No files require special attention

Important Files Changed

File Analysis

| Filename | Score | Overview |
| --- | --- | --- |
| packages/core/lib/v3/llm/aisdk.ts | 5/5 | Added GPT-5.1 detection and sets reasoningEffort to "low" for GPT-5.1 (vs "minimal" for other GPT-5 models) |
| packages/core/lib/v3/types/public/model.ts | 5/5 | Added gpt-5.1-2025-11-13 to the AvailableModel type union |
| packages/evals/lib/AISdkClientWrapped.ts | 5/5 | Mirrored the changes in aisdk.ts: added GPT-5.1 detection and reasoning effort configuration |
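As a rough sketch of what the model registration involves, the snippet below adds the new ID to a type union and a provider map. The names `AvailableModel` and `modelToProviderMap` appear in this PR's summary, but the union members other than `gpt-5.1-2025-11-13`, the `ModelProvider` type, and the `providerFor` helper are assumptions made for illustration; the real definitions contain many more entries.

```typescript
// Illustrative subset only; the real AvailableModel union is much larger.
type AvailableModel = "gpt-5" | "gpt-5.1-2025-11-13";

// Assumed provider type, reduced to the one provider relevant here.
type ModelProvider = "openai";

// Record<AvailableModel, ...> makes the compiler flag any model ID that is
// added to the union but missing from the map.
const modelToProviderMap: Record<AvailableModel, ModelProvider> = {
  "gpt-5": "openai",
  "gpt-5.1-2025-11-13": "openai",
};

// Hypothetical lookup helper.
function providerFor(model: AvailableModel): ModelProvider {
  return modelToProviderMap[model];
}
```

Typing the map against the union is one way to keep the registration points mentioned above in sync; whether the actual code enforces this is not stated in the PR.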

Sequence Diagram

sequenceDiagram
    participant Client as Client Code
    participant LLMProvider as LLMProvider
    participant AISdkClient as AISdkClient/Wrapped
    participant AISDK as Vercel AI SDK

    Client->>LLMProvider: Request with model: "gpt-5.1-2025-11-13"
    LLMProvider->>LLMProvider: Check modelToProviderMap
    LLMProvider->>LLMProvider: Maps to "openai" provider
    LLMProvider->>AISdkClient: Initialize with model
    
    Client->>AISdkClient: createChatCompletion()
    AISdkClient->>AISdkClient: Check isGPT5 = includes("gpt-5") → true
    AISdkClient->>AISdkClient: Check isGPT51 = includes("gpt-5.1") → true
    
    alt isGPT5 is true
        AISdkClient->>AISdkClient: Set providerOptions
        AISdkClient->>AISdkClient: textVerbosity: "low"
        alt isGPT51 is true
            AISdkClient->>AISdkClient: reasoningEffort: "low"
        else other GPT-5 models
            AISdkClient->>AISdkClient: reasoningEffort: "minimal"
        end
    end
    
    AISdkClient->>AISDK: generateObject() with providerOptions
    AISDK->>Client: Return response

Contributor

@greptile-apps greptile-apps bot left a comment

5 files reviewed, no comments

@Kylejeong2 Kylejeong2 force-pushed the kylejeong/stg-972-adding-gpt-51-to-stagehand branch from dd4eb87 to b0b1087 Compare November 19, 2025 21:59
@Kylejeong2 Kylejeong2 merged commit 767d168 into main Nov 19, 2025
15 checks passed
monadoid pushed a commit that referenced this pull request Nov 20, 2025
Co-authored-by: Miguel <36487034+miguelg719@users.noreply.github.com>
