-
Notifications
You must be signed in to change notification settings - Fork 181
feat: add temporal query parameters to MCP tools #625
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
feat: add temporal query parameters to MCP tools #625
Conversation
|
Important Review skippedAuto reviews are disabled on this repository. Please check the settings in the CodeRabbit UI or the You can disable this status message by setting the WalkthroughAdds temporal filtering across MCP and web APIs: new commit-time Changes
Sequence Diagram(s)sequenceDiagram
actor User as MCP Client
participant MCP as MCP Server
participant Web as Web API
participant Git as Git Operations
participant DB as Database
User->>MCP: search_commits(repoId, query, since, until, author)
MCP->>Web: POST /api/commits (validated)
Web->>DB: resolveRepoId(if string)
Web->>Web: validateDateRange(since, until)
Web->>Git: git.log --since --until --author --grep --max-count (30s timeout)
Git-->>Web: commits[]
Web->>Web: mapCommits -> SearchCommitsResponse
Web-->>MCP: searchCommitsResponse
MCP-->>User: content or ServiceError
rect rgba(200,220,255,0.7)
Note right of Git: timeout (30s) and enhanced git error mapping
end
sequenceDiagram
actor User as MCP Client
participant MCP as MCP Server
participant Web as Web API
participant DB as Database
participant Search as Code Search Service
User->>MCP: search_code(query, since, until, gitRevision)
MCP->>Web: POST /api/search
Web->>Web: parseTemporalDate(since/until)
Web->>DB: getTemporallyFilteredRepos(indexedAt range)
DB-->>Web: repos[]
Web->>Search: applyBranchFilter(if gitRevision) + repo scope
Search-->>Web: search results
Web-->>MCP: searchResponse
MCP-->>User: formatted content
rect rgba(220,255,220,0.6)
Note left of DB: temporal filtering uses index-time (lastIndexed)
end
Estimated code review effort🎯 4 (Complex) | ⏱️ ~60 minutes Areas to focus during review:
Possibly related PRs
Suggested labels
Suggested reviewers
Pre-merge checks and finishing touches✅ Passed checks (5 passed)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
|
@coderabbitai review |
✅ Actions performedReview triggered.
|
| query = `${query} branch:${gitRevision}`; | ||
| } | ||
|
|
||
| // If since or until are provided, filter repositories by indexedAt. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can see the value in being able to search the commit time, but what is the benefit of being able to filter search results by indexedAt?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My initial plan is to add temporal option in search_code, but turns out it's better implement to provide new search_commit tool specifically to get the latest code change with commit time.
For the indexedAt option of search_code, the usage could be targeting two scenarios:
- New Repos: If you add a new repository to Sourcebot today, it will have an indexedAt timestamp of today. Searching with since="today" will immediately include code from this new repository, allowing you to answer "What code was just added to our knowledge base?"
- Synced Repos: When latest changes for an existing repo and re-indexes it, the indexedAt timestamp is updated.
Scenario: A developer pushes a critical security fix and re-indexes happened on the repo.
Result: A search with since="1 hour ago" will surface matches from that repo because it was just re-processed, effectively showing you "freshly indexed code."
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 4
🧹 Nitpick comments (10)
packages/web/src/features/search/searchApi.test.ts (1)
4-31: Unused mocks add unnecessary coupling to external modulesThis file only uses
dateUtils, but still mocks@/lib/serviceError,@/actions, and@/lib/auth. That adds extra surface area that can break these tests if those modules change, without providing value here.You can either:
- Remove these mocks from this file, or
- Add actual
searchApiintegration tests that rely on them, so the mocks are justified.packages/web/src/features/search/dateUtils.test.ts (1)
288-292: ClarifytoGitDatebehavior for unrecognized formatsThe test name and comment say “convert unrecognized format to ISO”, but the assertion expects the original string to be preserved:
// For formats git doesn't natively understand, we convert to ISO const result = toGitDate('some random string'); expect(result).toBe('some random string');This is confusing for future readers. Either:
- Update the expectation to actually check for an ISO conversion, if that’s the intended behavior, or
- Rename the test and fix the comment to state that unknown formats are passed through unchanged to Git.
packages/web/src/app/api/(server)/repos/route.ts (1)
5-20: Temporal filters are correctly forwarded; consider normalizing empty query paramsThe new GET handler cleanly threads
activeAfter/activeBeforethrough togetReposand preserves the existing error handling pattern.One minor robustness tweak:
searchParams.get(...) ?? undefinedwill pass through an empty string when the URL contains?activeAfter=or?activeBefore=. If you prefer to treat “present but empty” the same as “unset”, you could normalize with:const activeAfter = searchParams.get("activeAfter") || undefined; const activeBefore = searchParams.get("activeBefore") || undefined;This avoids feeding accidental empty strings into downstream date parsing.
packages/web/src/app/api/(server)/commits/route.ts (1)
1-34: Solid validation flow; tighten type usage / schema reuseThe POST handler’s flow (JSON → zod
safeParseAsync→schemaValidationError/serviceErrorResponse→searchCommits) looks good and matches the rest of the API.Two minor nits:
SearchCommitsRequestis imported but never used. Either remove the import or use it to type the validated payload (e.g.,const response = await searchCommits(parsed.data as SearchCommitsRequest);) so this route stays coupled to the shared request type.- If you already have a shared Zod schema for
searchCommits(e.g., alongside other search schemas), consider reusing it here instead of defining a new one inline to avoid future drift between layers.packages/mcp/src/schemas.ts (1)
195-214: Consider constrainingmaxCountto avoid pathological requestsThe new
searchCommitsRequestSchema/commitSchemapair matches the documented shape and looks consistent.
maxCountis currentlyz.number().optional(), which allows negative values or extremely large limits. Given this is likely passed through togit log(and could be user‑controlled via MCP), you may want to harden it, e.g.:maxCount: z .number() .int() .positive() .max(500) .optional(),(or whatever upper bound matches your performance expectations).
packages/web/src/features/search/dateUtils.ts (1)
6-26: ClarifyparseTemporalDatecontract and clean up unused stateThe JSDoc and implementation for
parseTemporalDatedon’t currently line up:
- Doc says it “Returns an ISO 8601 date string or null if invalid”, but the function:
- Returns
undefinedwhen!dateStr.- Returns the original
dateStrwhen parsing fails (so not necessarily ISO).- Callers like
toDbDateandvalidateDateRangeimplicitly treat “unparseable” asundefined/Invalid Date, whiletoGitDateis fine with git-native strings.This makes the effective contract subtle and easy to misread from the JSDoc.
Consider tightening this up by either:
- Updating the JSDoc and return type to something like “ISO 8601 when parsed, otherwise the original string, or
undefinedwhen input is falsy”, or- Returning
undefinedfor unparseable inputs and lettingtoGitDateown the “fallback to original for git” behavior.Also,
relativePatternsis currently unused and can be removed or wired into the parsing logic to avoid dead code.Also applies to: 35-41, 99-102, 144-156
packages/web/src/features/search/gitApi.test.ts (1)
11-19: Align error-handling tests with productionsewbehaviorIn tests you mock
sewas a simple passthrough:vi.mock('@/actions', () => ({ sew: (fn: () => any) => fn(), }));and later assert that
searchCommitsrejects for “other Error instances” and non-Error exceptions. In production,searchCommitsis wrapped by the realsewfromactions.ts, which catches thrown errors and returns aServiceErrorinstead of letting the promise reject.That means the “throws” behavior in these tests doesn’t match what callers will see at runtime. To avoid surprises, consider either:
- Asserting on a
ServiceErrorshape for these cases (matching realsew), or- Removing the
sewdependency fromsearchCommitsand letting routes/actions wrap it, if you explicitly wantsearchCommitsitself to throw.Either way, having tests reflect the actual public contract of
searchCommitswill make future changes safer.Also applies to: 287-347
packages/web/src/actions.ts (1)
463-541: HandleServiceErrorresults fromsearchCommitsexplicitly in activity filteringIn the activity-based branch of
getRepos:const activityChecks = await Promise.all(repos.map(async (repo) => { try { const commits = await searchCommits({ ... }); if (Array.isArray(commits) && commits.length > 0) { return repo; } } catch (e) { // ... } return null; }));
searchCommitsis already wrapped insew(ingitApi.ts), so for git/log failures it resolves to aServiceErrorrather than throwing. As a result:
- The
catchblock here will rarely, if ever, run for git-related issues.- Cases like “repo not on disk”, “invalid date range”, or git timeouts will show up as a
ServiceErrorvalue, get treated as “no activity”, and be silently filtered out with no logging.If you want
list_reposwithactiveAfter/activeBeforeto surface date-range validation failures instead of just returning an empty list, and to log unexpected git errors, consider branching explicitly on the union:const result = await searchCommits({ repoId: repo.id, since: activeAfter, until: activeBefore, maxCount: 1, }); if (Array.isArray(result)) { return result.length > 0 ? repo : null; } // `result` is a ServiceError here const message = result.message ?? ''; if (!message.includes('does not exist')) { console.error( `Error checking activity for repo ${repo.id} (${repo.name}):`, result, ); } return null;This keeps “repo not yet cloned” behaving as “no activity” while making invalid ranges and other unexpected failures visible.
Separately, this wiring introduces a circular dependency:
actions.tsimportssearchCommitsfrom./features/search/gitApi, andgitApi.tsimportssewfrom@/actions. It works today becausesewis only used at call time, but it’d be cleaner long term to extractsewto a small shared module (or inject it) to avoid subtle initialization issues in future refactors.packages/web/src/features/search/gitApi.ts (1)
1-22:searchCommitsimplementation looks solid; consider small cleanupsOverall, this module cleanly encapsulates git-based commit search:
- Validates the repo path and date range up front.
- Uses
toGitDateto preserve git-native formats when possible.- Configures
simpleGitwith a sensible timeout and concurrency.- Provides clear, user-facing messages for common git failures (invalid repo, ambiguous args, timeout), with a generic fallback.
A couple of minor nits you might consider:
- You check
existsSync(repoPath)both before callingcreateGitClientForPathand inside that helper. One of those checks could be dropped unless you specifically want the second check to catch races and produce a different message.- In the catch block you return
unexpectedError(...)for known cases but throw for unknown ones, relying onsewto convert those into genericServiceErrors. That’s fine, but it means callers always seeServiceError(never a thrown error) in production, so other code (likegetRepos) should treat the result asCommit[] | ServiceErrorrather than depending on exceptions.These are non-blocking; the core behavior and error mapping look good.
Also applies to: 58-150
packages/mcp/src/index.ts (1)
233-347:list_reposclient‑side filtering/pagination looks good; confirmlistReposreturns the full filtered set.The flow of: call
listRepos({ activeAfter, activeBefore }), then apply name filtering, sort, computetotalCount, and paginate on the MCP side cleanly fixes the “take before filter” bug, and the new empty/out‑of‑range page messages (including the temporal‑filtering note) are clear.This does rely on
listReposreturning the complete set of repositories for the given temporal filters (i.e., not already paginated). If the underlying API still pages results,totalCount,totalPages, and the “Try pageNumber between 1 and X” guidance will all be based only on that first backend page.If
listReposis still paginated underneath, consider either:
- Fetching all pages when temporal filters are used, or
- Threading backend pagination parameters and totals through to this tool instead of recomputing them locally.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (18)
packages/backend/src/constants.ts(2 hunks)packages/mcp/CHANGELOG.md(1 hunks)packages/mcp/README.md(4 hunks)packages/mcp/src/client.ts(3 hunks)packages/mcp/src/index.ts(6 hunks)packages/mcp/src/schemas.ts(1 hunks)packages/mcp/src/types.ts(2 hunks)packages/shared/src/constants.ts(1 hunks)packages/web/src/actions.ts(2 hunks)packages/web/src/app/api/(server)/commits/route.ts(1 hunks)packages/web/src/app/api/(server)/repos/route.ts(1 hunks)packages/web/src/features/search/dateUtils.test.ts(1 hunks)packages/web/src/features/search/dateUtils.ts(1 hunks)packages/web/src/features/search/gitApi.test.ts(1 hunks)packages/web/src/features/search/gitApi.ts(1 hunks)packages/web/src/features/search/schemas.ts(1 hunks)packages/web/src/features/search/searchApi.test.ts(1 hunks)packages/web/src/features/search/searchApi.ts(1 hunks)
🧰 Additional context used
🪛 GitHub Actions: Test Web
packages/shared/src/constants.ts
[error] 37-37: TypeError: The "path" argument must be of type string. Received undefined
🔇 Additional comments (9)
packages/web/src/features/search/searchApi.test.ts (1)
33-323: Temporal coverage in this suite looks solidThe tests exercise ISO dates, relative phrases, missing parameters, repository filtering, and inverted ranges with fixed system time, which should give good confidence that temporal parameters are wired correctly into query construction and range validation (via
dateUtils).packages/web/src/features/search/dateUtils.test.ts (1)
9-378: Thorough temporal parsing and range validation testsThe suite covers a wide range of ISO/relative formats, boundary conditions (month/year/leap year, midnight/end-of-day), and integration scenarios across
parseTemporalDate,validateDateRange,toDbDate, andtoGitDate. That level of coverage should make regressions in date handling unlikely.packages/backend/src/constants.ts (1)
11-11: Re-exporting cache-dir constants from shared module looks appropriateRe-exporting
REPOS_CACHE_DIRandINDEX_CACHE_DIRfrom@sourcebot/sharedkeeps the backend public surface unchanged while centralizing the actual definitions in one place, which should reduce divergence between services.packages/mcp/src/types.ts (1)
13-15: Commit-search types are correctly wired into the public APIThe new
SearchCommitsRequest,SearchCommitsResponse, andCommitexports cleanly mirror the schemas and extend the MCP surface without breaking existing callers. Just keep this file in sync with the corresponding web types per the header note.Also applies to: 36-38
packages/mcp/README.md (1)
169-205: Temporal docs align well with the new APIsThe additions for
search_code(gitRevision,since,until,includeDeletedFiles),list_repos(activeAfter/activeBefore), and the newsearch_commitssection clearly describe index‑time vs commit‑time semantics and supported date formats. These explanations match the corresponding schemas and server logic and should make the new temporal behavior much easier to understand for MCP clients.Also applies to: 221-253, 243-253
packages/web/src/actions.ts (1)
315-321: API key creation guard for non-owner users looks correctThe new check against
EXPERIMENT_DISABLE_API_KEY_CREATION_FOR_NON_ADMIN_USERSwith an earlyFORBIDDENServiceErrorand clear log message is consistent with the rest of the file’s patterns and should be safe to ship.packages/mcp/CHANGELOG.md (1)
10-29: Changelog entries clearly capture the new temporal MCP capabilitiesThe “Unreleased” section accurately reflects the added tools/parameters, temporal format support, error handling, and the list_repos pagination fix, and clearly distinguishes index time vs commit time. This is consistent with the PR’s behavior and keeps consumers informed.
packages/mcp/src/client.ts (1)
2-4: MCP client wiring for temporal repos andsearchCommitsis consistent and safeThe updated
listRepossignature (optionalparamswithactiveAfter/activeBefore) and the newsearchCommitsclient:
- Follow the same fetch +
isServiceError+ schema-parse pattern as the existing methods.- Remain backwards compatible for existing
listRepos()callers.- Correctly propagate org domain and optional API key headers.
No issues from the client side.
Also applies to: 24-47, 67-83
packages/mcp/src/index.ts (1)
52-67: Temporal params forsearch_codeare plumbed correctly; just confirm index‑time semantics.The new
gitRevision,since,until, andincludeDeletedFilesfields are optional, preserve existing query composition, and are passed through tosearch()without breaking current callers. Please double‑check thatsearch({ since, until })actually applies the index‑time filters described here; if it instead uses commit time, consider aligning the implementation or updating the descriptions so it doesn’t conflict conceptually withsearch_commitsandlist_repos.Also applies to: 80-83, 103-106
| import escapeStringRegexp from 'escape-string-regexp'; | ||
| import { z } from 'zod'; | ||
| import { listRepos, search, getFileSource } from './client.js'; | ||
| import { listRepos, search, getFileSource, searchCommits } from './client.js'; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
search_commits.repoId is typed as a number, which likely conflicts with string‑typed repo IDs elsewhere.
Other tools appear to treat repository identifiers as strings (e.g., search_code.filterByRepoIds: string[], get_file_source.repoId: string, and list_repos printing id: ${repo.repoName}), but search_commits currently defines repoId as z.number(). That mismatch may make it awkward or error‑prone for clients to take the ID from list_repos and pass it directly into search_commits, and may not match what searchCommits expects.
I’d recommend:
- repoId: z.number().describe(`The ID of the repository to search in. Obtain this by calling 'list_repos' first.`),
+ repoId: z.string().describe(`The Sourcebot repository ID to search in. Obtain this by calling 'list_repos' first.`),and updating the searchCommits argument type to accept the same string ID shape used by the other tools.
As a small polish, if the real default for maxCount is 50, you could also encode that in the schema (e.g., .int().positive().default(50)) so validation and docs stay in sync.
Also applies to: 191-231
🤖 Prompt for AI Agents
In packages/mcp/src/index.ts around lines 8 and also 191-231, change the
search_commits schema and related types so repoId is a string (not z.number())
to match other APIs (e.g., search_code.filterByRepoIds and get_file_source),
update the searchCommits function/signature to accept the string ID shape used
elsewhere, and ensure callers pass the repo id from list_repos directly;
additionally, set the maxCount schema to .int().positive().default(50) (or the
real default) so validation/docs match the actual default behavior.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems like the humans are having a chat. I'll hop back into my burrow for now. If you need me again, just tag @coderabbitai in a new comment, and I'll come hopping out!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
resolved in b30bbd3, dual support with repo id and name
@brendan-kellam the current list_repos return id which is repo name, don't want to change that field to be numeric id and add a new repo name field which might cause inconsistent issue.
What do you think?
packages/web/src/actions.ts
Outdated
| } = {}) => sew(() => | ||
| withOptionalAuthV2(async ({ org, prisma }) => { | ||
| // When filtering by activity, we need to fetch all repos first, | ||
| // then apply pagination after filtering |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: we aren't really doing pagination here with take
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, thanks for clarify that.
My implementation with remove the limit to load as much as possible might cause server side performance issue, so I'll update it also with combine the take at end, but batching load first and set hard maximum limit on total.
36518e4 to
22bf69e
Compare
|
@coderabbitai please review again |
|
@waynesun09, I'll review the changes again for you. ✅ Actions performedReview triggered.
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (12)
packages/mcp/README.md (1)
220-252: Verify that all documented relative date formats are actually supportedThe Date Format Examples list
"last month"in addition to"30 days ago","1 week ago","last week","yesterday". Please confirmparseTemporalDate(and any schema refinements) really accept"last month"; if not, either add support or trim the docs to only list supported forms.packages/shared/src/index.server.ts (1)
29-32: Server cache directory constants re-export looks goodRe-exporting
REPOS_CACHE_DIRandINDEX_CACHE_DIRhere keeps server-only constants centralized. Just ensure these exports are only imported from server-side code paths to avoid accidental bundling into client bundles.CHANGELOG.md (1)
12-14: Temporal filtering changelog entry is accurate; consider optionally mentioning commit searchThe new “temporal filtering” bullet correctly captures branch/revision and index date filtering plus relative date support. If you want full traceability from the root changelog, you might also briefly call out the new commit-time search surface (
searchCommits), but that’s optional since MCP-level changes are already documented separately.packages/shared/src/constants.server.ts (1)
1-7: Env-aware cache directory constants are reasonableUsing
env.DATA_CACHE_DIRwith/tmp/sourcebot/...fallbacks is a pragmatic approach and the guard avoids load-time failures when the env var is unset. If you ever want to DRY this up, you could factor out a small helper for the base cache dir, but it’s not necessary.packages/web/src/features/search/searchApi.ts (3)
27-62: Search/streamSearch integration of permission, temporal, and branch filters looks coherentUsing
getAccessibleRepoNamesForUserplusgetTemporallyFilteredRepos, then intersecting viacombineRepoFilters, gives a clear repoSearchScope that respects both permissions and index-time constraints. ApplyingapplyBranchFilteron theQueryIRwhengitRevisionis present is a sensible way to express branch selection in a Zoekt-friendly IR.From a behavior standpoint, this matches the documented semantics:
- temporal filters operate at the repo (indexedAt) level, not commit time, and
- branch/revision filtering is orthogonal via
gitRevision.No correctness issues spotted here.
105-194: Temporal repo filter helpers are correct; small refinements are optional
getTemporallyFilteredReposandcombineRepoFiltersdo the right thing:
- They only apply temporal filtering when
sinceoruntilare provided.- They use
toDbDateto normalize dates and gracefully skip bounds that resolve toundefined.- They intersect permission-based and temporal repo sets when both are present, returning
undefinedwhen no filters apply (preserving prior unfiltered behavior).Two optional refinements you might consider (not blockers):
- If schema/dateUtils don’t already enforce it, ensure
since/untilare validated (includingsince <= until) before reaching this helper, so we never silently treat an invalid date string as “no temporal filtering”.- If you care about minimizing DB round trips, a future refactor could merge permission and temporal conditions into a single
findManycall instead of two separate queries.Overall, the helper logic is sound.
196-214: Branch filter wrapper aligns with a composable QueryIR design
applyBranchFilterwrapping the existingQueryIRinside anandnode with abranchconstraint is a clean, composable approach. It should play nicely even if the incoming IR already contains other filters.If there are any special cases for
gitRevisionvalues (e.g., a wildcard such as"*"meaning “all branches”), double-check that callers either normalize those before reaching this function or handle them separately, since this helper currently always applies an exact branch match.packages/web/src/features/search/gitApi.test.ts (1)
98-172: Consider centralizing and restoring dateUtils spies for test robustnessYou call
vi.spyOn(dateUtils, 'validateDateRange')/toGitDatein multiple tests and suites while only usingvi.clearAllMocks()in the top-levelbeforeEach. To avoid any subtle interactions or “already spied” issues as Vitest/Jest evolve, it’d be a bit more robust to either:
- Create the spies once in a
beforeAll/outerbeforeEachand just adjustmockReturnValue/mockImplementationper test, or- Add
vi.restoreAllMocks()in anafterEachso eachspyOnstarts from the real implementation.Not urgent, but it can make future test changes safer.
Also applies to: 239-271, 317-383
packages/mcp/src/schemas.ts (1)
200-219: Avoid future drift by sharing commit search shapes between web and MCP
searchCommitsRequestSchema,commitSchema, andsearchCommitsResponseSchemamatch the web types and backendCommitinterface today, but they’re now defined in three places. When convenient, consider moving these schemas/types into a small shared package (or exporting from a single canonical module) and importing them here and inweb/gitApi.tsto reduce the risk of subtle mismatches over time.packages/web/src/features/search/gitApi.ts (1)
61-78: Unify commit search types with shared schemas to reduce duplication
SearchCommitsRequestandCommithere duplicate the shapes already expressed inweb/src/features/search/types.tsandmcp/src/schemas.ts. To keep things consistent and automatically pick up schema changes (likemaxCountbounds), consider:
- Importing
SearchCommitsRequestandCommitfrom the shared Zod-based types, or- Exporting a single canonical TS type for these and reusing it in all three locations.
This is mainly a maintainability tweak; behavior is fine as-is.
Also applies to: 200-219
packages/mcp/src/index.ts (2)
202-202: Align schema default with documentation.The description states "default: 50" but the schema doesn't enforce this default with
.default(50). IfmaxCountis omitted, it will beundefined, requiring the backend to handle the default.Consider adding
.default(50)to the schema for consistency:- maxCount: z.number().describe(`Maximum number of commits to return (default: 50).`).optional(), + maxCount: z.number().int().positive().default(50).describe(`Maximum number of commits to return (default: 50).`),This ensures the schema validates as an integer, is positive, and matches the documented default behavior.
298-303: Consider including indexedAt in output when temporal filtering is applied.When users filter repositories by
activeAfteroractiveBefore, they might want to see when each repository was last indexed to understand the filtering results. Currently, the output only includesidandurl.Consider conditionally including the
lastIndexedorindexedAtfield when temporal parameters are provided:const content: TextContent[] = paginated.map(repo => { let text = `id: ${repo.repoName}\nurl: ${repo.webUrl}`; if ((activeAfter || activeBefore) && repo.indexedAt) { text += `\nlast_indexed: ${repo.indexedAt}`; } return { type: "text", text, } });This would help users verify that the temporal filtering is working as expected.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (18)
CHANGELOG.md(1 hunks)packages/mcp/CHANGELOG.md(1 hunks)packages/mcp/README.md(4 hunks)packages/mcp/src/client.ts(3 hunks)packages/mcp/src/index.ts(5 hunks)packages/mcp/src/schemas.ts(4 hunks)packages/mcp/src/types.ts(2 hunks)packages/shared/src/constants.server.ts(1 hunks)packages/shared/src/index.server.ts(1 hunks)packages/web/src/actions.ts(1 hunks)packages/web/src/app/api/(server)/commits/route.ts(1 hunks)packages/web/src/app/api/(server)/repos/route.ts(1 hunks)packages/web/src/features/search/dateUtils.test.ts(1 hunks)packages/web/src/features/search/dateUtils.ts(1 hunks)packages/web/src/features/search/gitApi.test.ts(1 hunks)packages/web/src/features/search/gitApi.ts(1 hunks)packages/web/src/features/search/searchApi.ts(4 hunks)packages/web/src/features/search/types.ts(3 hunks)
🚧 Files skipped from review as they are similar to previous changes (4)
- packages/mcp/src/types.ts
- packages/mcp/src/client.ts
- packages/web/src/features/search/dateUtils.ts
- packages/web/src/actions.ts
🧰 Additional context used
🧬 Code graph analysis (7)
packages/web/src/app/api/(server)/repos/route.ts (3)
packages/web/src/lib/types.ts (1)
GetReposResponse(32-32)packages/web/src/actions.ts (1)
getRepos(467-511)packages/web/src/app/api/(client)/client.ts (1)
getRepos(53-62)
packages/shared/src/constants.server.ts (1)
packages/shared/src/index.server.ts (3)
REPOS_CACHE_DIR(30-30)env(34-34)INDEX_CACHE_DIR(31-31)
packages/web/src/features/search/gitApi.test.ts (1)
packages/web/src/features/search/gitApi.ts (1)
searchCommits(98-197)
packages/web/src/features/search/searchApi.ts (3)
packages/web/src/withAuthV2.ts (1)
withOptionalAuthV2(43-67)packages/web/src/features/search/parser.ts (1)
parseQuerySyntaxIntoIR(55-108)packages/web/src/features/search/dateUtils.ts (1)
toDbDate(143-155)
packages/web/src/features/search/dateUtils.test.ts (1)
packages/web/src/features/search/dateUtils.ts (4)
parseTemporalDate(21-101)validateDateRange(110-134)toDbDate(143-155)toGitDate(164-186)
packages/mcp/src/index.ts (3)
packages/mcp/src/client.ts (2)
searchCommits(67-83)listRepos(24-47)packages/web/src/features/search/gitApi.ts (1)
searchCommits(98-197)packages/web/src/lib/utils.ts (1)
isServiceError(430-436)
packages/web/src/features/search/gitApi.ts (7)
packages/mcp/src/types.ts (2)
ServiceError(33-33)SearchCommitsRequest(35-35)packages/web/src/lib/serviceError.ts (1)
unexpectedError(75-81)packages/web/src/features/search/types.ts (1)
SearchCommitsRequest(178-178)packages/web/src/actions.ts (1)
sew(44-61)packages/web/src/withAuthV2.ts (1)
withOptionalAuthV2(43-67)packages/shared/src/constants.server.ts (1)
REPOS_CACHE_DIR(6-6)packages/web/src/features/search/dateUtils.ts (2)
validateDateRange(110-134)toGitDate(164-186)
🔇 Additional comments (15)
packages/mcp/README.md (1)
169-205: Temporal filtering docs forsearch_code/list_reposlook consistentThe distinction between index time (
search_code,list_repos) and commit time (search_commits) is clearly documented and matches the intended behavior. No issues from a tooling/UX perspective.packages/mcp/CHANGELOG.md (1)
10-25: Changelog entry is clear and matches the implementation scopeThe Unreleased “Added” bullets accurately summarize the new temporal features, commit search tool, validation, timeouts, and tests. This will be very helpful for downstream consumers of
@sourcebot/mcp.packages/web/src/app/api/(server)/repos/route.ts (1)
5-16: GET handler wiring foractiveAfter/activeBeforelooks correctReading temporal filters from
request.nextUrl.searchParamsand passing them through togetRepospreserves existing behavior when absent and enables index-time filtering when present. AssuminggetRepos/toDbDatehandle malformed values appropriately (either via validation or safe fallback), this route looks good.packages/web/src/app/api/(server)/commits/route.ts (1)
1-24: Commit search POST route is straightforward and aligned with the schemaThe handler cleanly validates via
searchCommitsRequestSchema.safeParseAsync, delegates tosearchCommits, and consistently converts both validation and service errors intoserviceErrorResponse. This matches the documentedsearch_commitsbehavior.packages/web/src/features/search/types.ts (2)
86-95: Index / branch-filter options look consistent and well-documentedThe added
gitRevision,since,until, andisBranchFilteringEnabledfields are coherent with the PR’s temporal story and clearly documented as index-time filters here. No changes needed from this file.Also applies to: 104-110
170-178: Commit search request schema is well-shapedThe
searchCommitsRequestSchema(including themaxCountbounds) matches the backend/mcp usage and gives a solid contract for the newsearch_commitstool. Looks good as-is.packages/web/src/features/search/gitApi.test.ts (1)
1-68: Thorough coverage of commit search behaviorThe mocking setup plus the success/error/integration and repo-resolution cases exercise essentially every branch in
searchCommits, including disk checks, date handling, git options, and error mapping. This gives strong confidence in the new API surface.Also applies to: 273-574
packages/web/src/features/search/dateUtils.test.ts (1)
1-379: Deterministic and comprehensive date utility testsThe combination of fake timers with broad coverage of ISO, relative phrases, edge cases, and integration scenarios gives very solid protection around
parseTemporalDate/validateDateRange/toDbDate/toGitDate. No changes needed here.packages/mcp/src/schemas.ts (1)
24-33: Search options / response schema stay in sync with web typesThe added temporal fields and
isBranchFilteringEnabledflag correctly mirrorpackages/web/src/features/search/types.ts, honoring the “keep in sync” note and keeping MCP contracts aligned with the web API.Also applies to: 120-149
packages/web/src/features/search/gitApi.ts (2)
12-21: Repo resolution and git client setup look correct
createGitClientForPathandresolveRepoIdwiresimple-gitand Prisma together cleanly: numeric IDs short-circuit, string names are scoped byorgId, and the not-found path returns a clearunexpectedErrorwith guidance to uselist_repos. No functional issues here.Also applies to: 32-59
80-197: Commit search flow and error handling are solidThe
searchCommitsimplementation cleanly composessew+withOptionalAuthV2, validates repo existence, usesvalidateDateRange/toGitDatebefore constructinggit.logoptions, and maps common git failures (not a git repository,ambiguous argument,timeout) to actionableServiceErrors while letting unexpected issues bubble tosew. Together with the tests, this looks robust.packages/mcp/src/index.ts (4)
52-63: LGTM! Clear distinction between index time and commit time filtering.The addition of
gitRevision,since, anduntilparameters is well-documented. The explicit note "(NOT by commit time)" in the descriptions ofsinceanduntilhelps prevent confusion with thesearch_commitstool.
76-78: LGTM! Parameters correctly threaded through to search function.The new temporal parameters are properly destructured from the input and passed to the
searchfunction, maintaining consistency with the existing parameter handling pattern.Also applies to: 94-96
204-212: LGTM! Clean parameter pass-through to searchCommits.The handler correctly passes all parameters directly to the
searchCommitsfunction, with appropriate error handling in the following lines.
269-269: LGTM! Temporal parameters correctly passed to backend.The
activeAfterandactiveBeforeparameters are properly passed to thelistReposclient function for server-side filtering by index time.
22bf69e to
552a4e9
Compare
|
@coderabbitai please review again |
|
@waynesun09, I'll review the changes for you. ✅ Actions performedReview triggered.
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (6)
packages/mcp/README.md (1)
169-205: Temporal docs forsearch_code/list_reposare clear; consider clarifying semanticsThe distinction that these filters operate on index time (not commit time) and accept ISO or relative formats is well documented. If you want to avoid ambiguity, consider briefly stating whether bounds are inclusive and which timezone the server uses to interpret relative phrases (e.g., “30 days ago” evaluated in server-local time vs UTC).
packages/web/src/features/search/dateUtils.ts (3)
21-75:parseTemporalDatecovers key formats; minor robustness improvements possibleThe ISO-first parsing and explicit handling of “yesterday” /
N <unit>s agolook good and match the documented relative formats. To tighten things up:
- Use an explicit radix in
parseInt(matchRelative[1], 10)to avoid historicalparseIntfootguns.- Consider documenting that “last week/month/year” and
N <unit>s agoare evaluated relative to the current timestamp (not normalized to start-of-week/month/year), so behavior is clear to callers.
110-155: Date range validation and DB conversion behave reasonably; consider surfacing invalid inputs
validateDateRangeandtoDbDatecorrectly useparseTemporalDateand gracefully handle unparsable values by returningnull/undefined. One tradeoff is that a user-supplied but invalidsince/untilsilently degrades to “no bound” instead of producing a clear error. If you want stricter UX for index-time filters (e.g.,search_code,list_repos), consider havingvalidateDateRangereturn an error string when either side is unparsable, and wiring that into the HTTP/MCP response for those call sites (similar tosearch_commits).
164-185:toGitDatepreserves git-native formats; keep patterns in sync with parserThe whitelist in
gitNativeFormatscorrectly preserves relative and ISO forms that Git already understands and falls back toparseTemporalDateotherwise. To avoid future drift, consider centralizing these regexes (or a shared enum of supported relative tokens) withparseTemporalDateso adding a new relative form only requires updating one place.packages/web/src/features/search/gitApi.test.ts (2)
5-49: Mock setup cleanly isolatessearchCommits; watch for future module-surface changesThe targeted mocks for
simple-git,fs,@sourcebot/shared,@/lib/serviceError,@/actions,@/withAuthV2, and@/lib/utilsgive you good control over dependencies and error paths. If@sourcebot/shared’s surface grows andgitApistarts importing additional exports from it, this full-module mock will need updating—at that point, consider usingvi.importActualand overriding onlyREPOS_CACHE_DIRto reduce maintenance.
71-96: Repository-path assertions are OS-specific; consider usingpath.joinin expectationsThe tests asserting
existsSyncis called with'/mock/cache/dir/456'(and similar) assume POSIX-style separators, butgitApiusespath.join, which will produce\-separated paths on Windows. To keep tests OS-agnostic, you could either:
- Import
pathand assert againstpath.join('/mock/cache/dir', '456'), or- Use
expect.stringContaining('/mock/cache/dir')+ repo ID, if exact separators aren’t important.Also applies to: 456-472
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (18)
CHANGELOG.md(1 hunks)packages/mcp/CHANGELOG.md(1 hunks)packages/mcp/README.md(4 hunks)packages/mcp/src/client.ts(3 hunks)packages/mcp/src/index.ts(6 hunks)packages/mcp/src/schemas.ts(4 hunks)packages/mcp/src/types.ts(2 hunks)packages/shared/src/constants.server.ts(1 hunks)packages/shared/src/index.server.ts(1 hunks)packages/web/src/actions.ts(1 hunks)packages/web/src/app/api/(server)/commits/route.ts(1 hunks)packages/web/src/app/api/(server)/repos/route.ts(1 hunks)packages/web/src/features/search/dateUtils.test.ts(1 hunks)packages/web/src/features/search/dateUtils.ts(1 hunks)packages/web/src/features/search/gitApi.test.ts(1 hunks)packages/web/src/features/search/gitApi.ts(1 hunks)packages/web/src/features/search/searchApi.ts(4 hunks)packages/web/src/features/search/types.ts(3 hunks)
✅ Files skipped from review due to trivial changes (1)
- packages/web/src/features/search/searchApi.ts
🚧 Files skipped from review as they are similar to previous changes (7)
- packages/web/src/actions.ts
- packages/shared/src/constants.server.ts
- packages/web/src/features/search/dateUtils.test.ts
- packages/mcp/src/types.ts
- packages/web/src/app/api/(server)/repos/route.ts
- packages/web/src/app/api/(server)/commits/route.ts
- CHANGELOG.md
🧰 Additional context used
🧬 Code graph analysis (3)
packages/mcp/src/schemas.ts (1)
packages/web/src/features/search/types.ts (1)
searchCommitsRequestSchema(170-177)
packages/mcp/src/client.ts (5)
packages/mcp/src/types.ts (4)
ListRepositoriesResponse(26-26)ServiceError(33-33)SearchCommitsRequest(35-35)SearchCommitsResponse(36-36)packages/mcp/src/env.ts (1)
env(8-25)packages/web/src/features/search/types.ts (1)
SearchCommitsRequest(178-178)packages/mcp/src/utils.ts (1)
isServiceError(4-10)packages/mcp/src/schemas.ts (1)
searchCommitsResponseSchema(219-219)
packages/web/src/features/search/gitApi.test.ts (1)
packages/web/src/features/search/gitApi.ts (1)
searchCommits(98-197)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: build
🔇 Additional comments (24)
packages/shared/src/index.server.ts (1)
29-32: Re-exporting server cache dir constants via shared index looks goodThe new
REPOS_CACHE_DIR/INDEX_CACHE_DIRre-exports are consistent with the rest of this module and give consumers a single canonical import path; no issues spotted.packages/mcp/README.md (1)
220-252:search_commitsand date-format documentation align well with the implementationThe
search_commitsdescription (commit-time filtering, repoId as numeric ID or full repo name, cloning requirement) and the Date Format Examples block match the behavior in the git API and date utilities, and clearly differentiate index-time vs commit-time semantics.packages/mcp/CHANGELOG.md (1)
10-24: Changelog entries comprehensively capture the new temporal toolingThe Unreleased “Added” section clearly lists temporal parameters,
search_commits, relative date support, git timeouts, and the associated tests/docs, which should make this feature set easy to track in release notes.packages/web/src/features/search/gitApi.test.ts (1)
98-383: Test coverage forsearchCommitsoptions, errors, and repo resolution is excellentThe suite thoroughly exercises date-range validation,
toGitDatewiring, git log options (since/until/author/query/maxCount), success paths (non-empty and empty), detailed error mapping (not-a-git-repo, ambiguous argument, timeout, generic errors), disk presence, git client config, and numeric vs string repo identifiers. This should give high confidence in the new temporal commit-search behavior.Also applies to: 418-574
packages/web/src/features/search/types.ts (3)
92-94: LGTM! Clear documentation of indexed-time vs. commit-time filtering.The comments effectively distinguish indexed date filtering from commit-time filtering, which will help users choose the right tool for their needs.
109-109: LGTM!The optional boolean field is appropriately typed and documented.
170-177: LGTM! Schema validation is robust.The maxCount validation properly constrains the value (int, positive, max 500), and the dual repoId support (number | string) aligns with the MCP schemas.
packages/mcp/src/client.ts (3)
2-3: LGTM!The imports are correctly updated to support the new searchCommits functionality.
24-31: LGTM! Clean parameter handling.The optional params object is properly typed and query parameters are safely constructed using URLSearchParams.
67-83: LGTM! Consistent implementation pattern.The searchCommits function follows the same structure and error handling as the other client functions, ensuring consistency across the API surface.
packages/mcp/src/schemas.ts (4)
1-1: LGTM!The updated sync reference correctly points to the types.ts file.
30-32: LGTM! Consistent with web layer schemas.The searchOptionsSchema extensions match the definitions in packages/web/src/features/search/types.ts, maintaining consistency across the codebase.
148-148: LGTM!The isBranchFilteringEnabled field is consistent with the web layer schema.
200-219: LGTM! Well-structured commit schemas.The searchCommitsRequestSchema is consistent with the web layer, and the commitSchema appropriately captures all essential commit metadata fields.
packages/mcp/src/index.ts (5)
8-8: LGTM!The searchCommits import is correctly added to support the new tool.
52-97: LGTM! Clear parameter documentation and proper forwarding.The new temporal parameters are well-documented, correctly emphasizing that since/until filter by indexed date (not commit time), and are properly passed through to the underlying search function.
181-225: LGTM! Comprehensive implementation addressing previous feedback.The search_commits tool properly supports dual repoId types (number | string) as discussed in previous reviews, clearly distinguishes commit-time filtering from index-time filtering, and includes helpful guidance about repository cloning requirements.
227-236: LGTM! Excellent clarification of temporal filtering semantics.The documentation clearly distinguishes indexed-time filtering (indexedAt) from commit-time filtering and directs users to the appropriate tool for each use case.
237-329: LGTM! Thoughtful conditional metadata inclusion.The temporal filtering parameters are properly integrated, and the lastIndexed field is only shown when temporal filters are active, keeping the output concise for non-temporal queries.
packages/web/src/features/search/gitApi.ts (5)
12-21: LGTM! Appropriate git client configuration.The 30-second timeout and concurrent process limit (6) provide good resource management and prevent hanging operations.
32-59: LGTM! Secure repository resolution with org scoping.The function properly handles dual identifier types and importantly scopes the lookup by orgId, ensuring users can only access repositories in their organization.
61-78: LGTM! Well-defined interfaces.The interfaces accurately represent the commit search contract and align with the schemas defined elsewhere in the codebase.
98-134: LGTM! Comprehensive validation and secure path handling.The function properly implements auth guards (withOptionalAuthV2), validates inputs, and constructs repository paths securely. The error messages provide helpful guidance about repository cloning status.
135-197: LGTM! Robust error handling with actionable messages.The git log options are correctly constructed, and the error handling covers common failure modes (invalid repo, ambiguous dates, timeout) with specific, helpful error messages. The type assertion at line 160 is reasonable given simple-git's output structure.
Add temporal filtering capabilities for searches by git branch/revision and repository index dates (since/until). Integrates with the refactored QueryIR-based search architecture. - Add gitRevision, since, until parameters to SearchOptions - Implement temporal repo filtering by indexedAt field - Add branch filtering via QueryIR wrapper - Add search_commits MCP tool for commit-based searches - Update list_repos with activeAfter/activeBefore filtering - Add 88 new tests (all passing) Signed-off-by: Wayne Sun <gsun@redhat.com>
552a4e9 to
310e74a
Compare
Summary
Add comprehensive temporal filtering capabilities to MCP tools with new
search_commitstool, enhancedlist_reposactivity filtering, and improvedsearch_codetemporal support.Changes
New Features
activeAfter/activeBeforeparameters for commit activity filteringgitRevision,since/until, andincludeDeletedFilesparametersBug Fixes
list_repospagination bug wheretakelimit was applied before activity filtering, returning fewer results than requestedTesting & Documentation
Test Results
All 106 new tests pass:
Breaking Changes
None. All changes are backward compatible with optional parameters.
Resolves
Closes #511
Checklist
Summary by CodeRabbit
New Features
Improvements
Documentation
Tests
✏️ Tip: You can customize this high-level summary in your review settings.