evaluation docs #2123

Merged
shagun-singh-inkeep merged 6 commits into main from evaluation-docs
Feb 18, 2026

@shagun-singh-inkeep
Collaborator

No description provided.

@vercel

vercel bot commented Feb 18, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

| Project | Deployment | Actions | Updated (UTC) |
| --- | --- | --- | --- |
| agents-api | Ready | Preview, Comment | Feb 18, 2026 4:51pm |
| agents-docs | Ready | Preview, Comment | Feb 18, 2026 4:51pm |
| agents-manage-ui | Ready | Preview, Comment | Feb 18, 2026 4:51pm |

@changeset-bot

changeset-bot bot commented Feb 18, 2026

⚠️ No Changeset found

Latest commit: f398b46

Merging this PR will not cause a version bump for any packages. If these changes should not result in a new version, you're good to go. If these changes should result in a version bump, you need to add a changeset.

This PR includes no changesets

When changesets are added to this PR, you'll see the packages that this PR includes changesets for and the associated semver types

Contributor

@claude claude bot left a comment

PR Review Summary

(4) Total Issues | Risk: Medium

🟠⚠️ Major (2) 🟠⚠️

🟠 1) agents-docs/content/typescript-sdk/evaluations.mdx:9-11 Missing EvaluationClient initialization example

Issue: The documentation references client.createEvaluator(), client.createEvaluationSuiteConfig(), and client.triggerBatchEvaluation() but never shows how to instantiate the EvaluationClient. Users copying the code examples will encounter undefined client errors.

Why: This is a blocking usability issue. Developers cannot use any of the documented examples without first knowing how to create an EvaluationClient instance, what parameters it requires (tenantId, projectId, apiUrl, apiKey), and where to import it from.

Fix: Add a Quick Start or Setup section at the beginning of the page showing how to instantiate the client:

import { EvaluationClient } from '@inkeep/agents-sdk';

const client = new EvaluationClient({
  tenantId: 'YOUR_TENANT_ID',
  projectId: 'YOUR_PROJECT_ID',
  apiUrl: 'https://api.inkeep.com', // or your self-hosted URL
  apiKey: process.env.INKEEP_API_KEY,
});

Refs:


🟠 2) agents-docs/content/typescript-sdk/evaluations.mdx:4-9 Datasets mentioned but not documented

Issue: The frontmatter description says "Manage datasets, evaluators, and run evaluations" and line 9 mentions "manage datasets... trigger batch evaluations or dataset runs" — but the page contains no documentation for dataset operations. The EvaluationClient has extensive dataset methods (createDataset, listDatasets, createDatasetItem, triggerDatasetRun) that are completely undocumented.

Why: Users searching for dataset documentation will find this page via the keywords/description but won't find what they need. This creates a mismatch between what the page promises and what it delivers.

Fix: Either:

  1. Add a Datasets section documenting createDataset, createDatasetItem, triggerDatasetRun, etc., OR
  2. Remove datasets from the description/keywords if intentionally out of scope:
    • Update description to: "Manage evaluators and run evaluations programmatically with the TypeScript SDK"
    • Remove "datasets" and "dataset runs" from keywords

Refs:

🟡 Minor (2) 🟡

🟡 1) multi-file Terminology inconsistency: "suite config" vs "continuous test"

files: agents-docs/content/typescript-sdk/evaluations.mdx, agents-docs/content/visual-builder/evaluations.mdx

Issue: The SDK docs introduce "evaluation suite configs" while the Visual Builder docs call the same concept "continuous tests". The SDK docs attempt to bridge the two with an "(evaluation run configs)" parenthetical, but this conflates two different API entities: suite configs and run configs are separate resources.

Why: Customers reading both docs will be confused about the relationship between these terms. The Visual Builder UI uses the "Continuous Tests" tab label, so users of that interface won't recognize the "evaluation suite config" terminology.

Fix: Establish clearer terminology mapping:

  • Add a brief callout in the SDK docs explaining: "In the Visual Builder UI, Evaluation Suite Configs are the underlying configuration for what the UI calls 'Continuous Tests'."
  • Or use consistent user-facing terminology across both docs.

Refs:


🟡 2) agents-docs/content/typescript-sdk/evaluations.mdx:8-9 Missing Overview section header

Issue: The page jumps directly into introductory prose without an ## Overview heading. This is inconsistent with peer TypeScript SDK documentation pages.

Why: Peer SDK files (skills.mdx, context-fetchers.mdx, memory.mdx, external-agents.mdx) all begin with an ## Overview section after frontmatter. Consistent structure aids navigation and scannability.

Fix: Add ## Overview heading before the introductory paragraph. Same applies to the Visual Builder evaluations page.

Refs:

Inline Comments:

  • 🟡 Minor: evaluations.mdx:12 Missing blank line before heading
  • 🟡 Minor: evaluations.mdx:66 Double space after period
  • 💭 Consider: evaluations.mdx:89 Remove "via suite configs" implementation detail
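
The two formatting comments above (a missing blank line before a heading, a double space after a period) are the kind of issue a small script could catch before review. The following is a minimal sketch, not part of this PR or of any Inkeep tooling: the function name `findFormattingIssues` and the `FormattingIssue` type are hypothetical, and the checks are deliberately naive (for example, they do not skip fenced code blocks or frontmatter).

```typescript
// Hypothetical helper: flags two formatting problems of the kind raised in the inline comments.
interface FormattingIssue {
  line: number; // 1-based line number in the source text
  message: string;
}

function findFormattingIssues(source: string): FormattingIssue[] {
  const lines = source.split('\n');
  const issues: FormattingIssue[] = [];

  lines.forEach((text, index) => {
    // A Markdown ATX heading: one to six '#' characters followed by a space.
    const isHeading = /^#{1,6} /.test(text);
    const previous = index > 0 ? (lines[index - 1] ?? '') : '';
    if (isHeading && index > 0 && previous.trim() !== '') {
      issues.push({ line: index + 1, message: 'Missing blank line before heading' });
    }
    // A period followed by two or more spaces and then more text.
    if (/\. {2,}\S/.test(text)) {
      issues.push({ line: index + 1, message: 'Double space after period' });
    }
  });

  return issues;
}
```

Running it over a page's text returns one entry per problem with its line number, which could be printed in a pre-commit hook or CI step.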

🚫 REQUEST CHANGES

Summary: Good documentation foundation for the Evaluations feature. The two blocking issues are: (1) users cannot run any examples without knowing how to instantiate EvaluationClient, and (2) the page promises dataset documentation that doesn't exist. Address these and the terminology clarity, and this will be ready to ship.

Discarded (4)
| Location | Issue | Reason Discarded |
| --- | --- | --- |
| evaluations.mdx:3 | Redundant sidebarTitle (matches title) | LOW confidence: developer preference, matches title is acceptable |
| evaluations.mdx:93 | Related section naming inconsistency | MEDIUM confidence: multiple conventions exist in the codebase, no clear standard |
| evaluationSuiteConfig example | Potentially misleading evaluatorIds handling | VERIFIED ACCURATE: API route does accept and process evaluatorIds |
| evaluations.mdx:88-89 | Date range behavior not clarified | Valid but not actionable: would require API investigation to document correctly |
Reviewers (3)
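| Reviewer | Returned | Main Findings | Consider | While You're Here | Inline Comments | Pending Recs | Discarded |
| --- | --- | --- | --- | --- | --- | --- | --- |
| pr-review-docs | 7 | 2 | 0 | 0 | 2 | 0 | 3 |
| pr-review-product | 5 | 2 | 1 | 0 | 0 | 0 | 2 |
| pr-review-consistency | 7 | 1 | 0 | 0 | 1 | 0 | 5 |
| Total | 19 | 5 | 1 | 0 | 3 | 0 | 10 |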

Note: Some findings were merged across reviewers (e.g., missing client initialization was raised by both docs and product reviewers).

@github-actions github-actions bot deleted a comment from claude bot Feb 18, 2026
shagun-singh-inkeep and others added 3 commits February 18, 2026 11:22
Co-authored-by: claude[bot] <209825114+claude[bot]@users.noreply.github.com>
Co-authored-by: claude[bot] <209825114+claude[bot]@users.noreply.github.com>
Co-authored-by: claude[bot] <209825114+claude[bot]@users.noreply.github.com>
Contributor

@claude claude bot left a comment

PR Review Summary

(2) Total Issues | Risk: Medium

Delta Review

This re-review is scoped to the 3 commits since the last automated review. The delta addressed all 3 inline comment suggestions from the prior review:

Fixed issues:

  • Missing blank line before ## Evaluators heading
  • Double space after period in continuous tests description
  • Removed "via suite configs" implementation detail from summary table

No new issues introduced in the delta.

🕐 Pending Recommendations (2)

These Major issues were raised in the prior review and remain unaddressed:

  • 🟠 typescript-sdk/evaluations.mdx:9-11 Missing EvaluationClient initialization example — Users cannot run any code examples without knowing how to instantiate the client (import, required params: tenantId, projectId, apiUrl, apiKey)
  • 🟠 typescript-sdk/evaluations.mdx:4-9 Datasets mentioned but not documented — The frontmatter/intro promises dataset operations but the page contains no dataset documentation. Either add dataset docs or remove the mentions.

🚫 REQUEST CHANGES

Summary: Great progress — all the inline formatting suggestions are now addressed. The two Major issues from the prior review still need attention before this is ready to ship: (1) add a Quick Start section showing how to instantiate EvaluationClient, and (2) either document the dataset operations or remove datasets from the description/keywords. Once those are resolved, this will be good to go! 🎉

Reviewers (0)
Reviewer Returned Main Findings Consider While You're Here Inline Comments Pending Recs Discarded
None dispatched

Note: Delta was purely formatting fixes — no substantive changes requiring reviewer dispatch.

Contributor

@claude claude bot left a comment

PR Review Summary

(1) Total Issues | Risk: Medium

Delta Review

This re-review is scoped to the 1 commit since the last automated review. The delta adds a "Setup: create a client" section (24 lines) to agents-docs/content/typescript-sdk/evaluations.mdx.

| Status | Issue |
| --- | --- |
| Addressed | Missing EvaluationClient initialization example: the new Setup section correctly documents all required parameters (tenantId, projectId, apiUrl) and the optional apiKey |
| Verified | The code example accurately matches the EvaluationClientConfig interface in packages/agents-sdk/src/evaluationClient.ts |
| Verified | Parameter table includes correct types and descriptions |

No new issues introduced in the delta. The Setup section follows documentation standards with a clear heading, runnable code example using environment variables, and a complete parameter reference table.
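
To illustrate the required/optional split described above, here is a minimal sketch of a config shape and an environment-based loader. It is an assumption-laden illustration, not the SDK's code: the interface name `EvalClientConfigSketch`, the function `configFromEnv`, and the variable names `INKEEP_TENANT_ID`, `INKEEP_PROJECT_ID`, and `INKEEP_API_URL` are hypothetical (only the four parameter names and `INKEEP_API_KEY` appear in this review). The actual `EvaluationClientConfig` interface lives in packages/agents-sdk/src/evaluationClient.ts and may differ.

```typescript
// Sketch only: mirrors the parameters named in this review, not the SDK's actual types.
interface EvalClientConfigSketch {
  tenantId: string;  // required
  projectId: string; // required
  apiUrl: string;    // required
  apiKey?: string;   // optional
}

// An environment-like record, e.g. process.env in Node.js.
type Env = Record<string, string | undefined>;

// Builds a config from the environment and fails fast when a required value is missing.
// All variable names except INKEEP_API_KEY are hypothetical.
function configFromEnv(env: Env): EvalClientConfigSketch {
  const required = {
    tenantId: env.INKEEP_TENANT_ID,
    projectId: env.INKEEP_PROJECT_ID,
    apiUrl: env.INKEEP_API_URL,
  };

  const missing = Object.entries(required)
    .filter(([, value]) => !value)
    .map(([name]) => name);
  if (missing.length > 0) {
    throw new Error(`Missing required config: ${missing.join(', ')}`);
  }

  return {
    tenantId: required.tenantId as string,
    projectId: required.projectId as string,
    apiUrl: required.apiUrl as string,
    // Only include apiKey when it is actually set.
    ...(env.INKEEP_API_KEY ? { apiKey: env.INKEEP_API_KEY } : {}),
  };
}
```

In a Node.js script this would be called as `configFromEnv(process.env)`, and the result could then be passed to the client constructor, assuming the constructor accepts these four fields as the review states.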

🕐 Pending Recommendations (1)

This Major issue was raised in prior reviews and remains unaddressed:

  • 🟠 typescript-sdk/evaluations.mdx:4-9 Datasets mentioned but not documented — The frontmatter description says "Manage datasets, evaluators, and run evaluations" and the intro references "manage datasets... trigger batch evaluations or dataset runs." However, the page contains no dataset operations documentation. The EvaluationClient has extensive dataset methods (createDataset, listDatasets, createDatasetItem, triggerDatasetRun) that remain undocumented. Either add a Datasets section or remove datasets from the description/keywords to accurately reflect the current scope.

🚫 REQUEST CHANGES

Summary: Excellent progress! The delta successfully addresses the client initialization issue — the new Setup section is accurate, well-structured, and follows documentation standards. The one remaining blocker is the datasets discrepancy: the page promises dataset documentation but doesn't deliver it. Once that's resolved (either by adding dataset docs or removing the dataset mentions from frontmatter/intro), this will be ready to ship. 🎉

Discarded (0)

No findings were discarded in this review.

Reviewers (2)
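| Reviewer | Returned | Main Findings | Consider | While You're Here | Inline Comments | Pending Recs | Discarded |
| --- | --- | --- | --- | --- | --- | --- | --- |
| pr-review-docs | 1 | 0 | 0 | 0 | 0 | 0 | 0 |
| pr-review-product | 1 | 0 | 0 | 0 | 0 | 1 | 0 |
| Total | 2 | 0 | 0 | 0 | 0 | 1 | 0 |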

Note: The product reviewer's finding about datasets was already raised in prior reviews and is tracked as a Pending Recommendation rather than a new Main finding.

@github-actions github-actions bot deleted a comment from claude bot Feb 18, 2026
Contributor

@claude claude bot left a comment

PR Review Summary

(0) Total Issues | Risk: Low

Delta Review

This re-review is scoped to the 1 commit since the last automated review (f398b466b). The delta updates the frontmatter and intro paragraph in agents-docs/content/typescript-sdk/evaluations.mdx (3 lines changed).

| Status | Change |
| --- | --- |
| Addressed | Removed "datasets" from description; it now accurately says "Manage evaluators programmatically" |
| Addressed | Removed "datasets" and "dataset runs" from keywords |
| Addressed | Updated intro paragraph to remove dataset mentions; it now says "manage evaluators, evaluation suite configs, trigger batch evaluations" |

No new issues introduced in the delta. The documentation now accurately reflects its scope.

Summary of All Addressed Issues

Over the course of this PR's review iterations, all feedback has been incorporated:

| Issue | Status | Commit |
| --- | --- | --- |
| Missing EvaluationClient initialization example | ✅ Fixed | 1c55dca |
| Missing blank line before heading | ✅ Fixed | 65f3b1c |
| Double space after period | ✅ Fixed | 4d88e8a |
| "via suite configs" implementation detail exposure | ✅ Fixed | 5d466d2 |
| Datasets mentioned but not documented | ✅ Fixed | f398b46 |

✅ APPROVE

Summary: Excellent work! 🎉 All prior review feedback has been addressed. The documentation is now accurate, well-structured, and ready to ship. The TypeScript SDK and Visual Builder pages provide clear, actionable guidance for users working with evaluations.

Reviewers (0)
Reviewer Returned Main Findings Consider While You're Here Inline Comments Pending Recs Discarded
None dispatched

Note: Delta was a targeted fix for prior feedback — no substantive changes requiring reviewer dispatch.

@github-actions github-actions bot deleted a comment from claude bot Feb 18, 2026
@shagun-singh-inkeep shagun-singh-inkeep merged commit 54d2295 into main Feb 18, 2026
10 of 11 checks passed
@shagun-singh-inkeep shagun-singh-inkeep deleted the evaluation-docs branch February 18, 2026 16:50
@github-actions
Contributor

🔎💬 Inkeep AI search and chat service is syncing content for source 'Inkeep Agent Framework Docs'

miles-kt-inkeep pushed a commit that referenced this pull request Feb 18, 2026
* eval docs

* Update agents-docs/content/typescript-sdk/evaluations.mdx

Co-authored-by: claude[bot] <209825114+claude[bot]@users.noreply.github.com>

* Update agents-docs/content/visual-builder/evaluations.mdx

Co-authored-by: claude[bot] <209825114+claude[bot]@users.noreply.github.com>

* Update agents-docs/content/visual-builder/evaluations.mdx

Co-authored-by: claude[bot] <209825114+claude[bot]@users.noreply.github.com>

* claude

* claude

---------

Co-authored-by: claude[bot] <209825114+claude[bot]@users.noreply.github.com>
dimaMachina pushed a commit that referenced this pull request Feb 19, 2026