feat: add temporal filtering to search and repository APIs
Add temporal filtering capabilities for searches by git branch/revision
and repository index dates (since/until). Integrates with the refactored
QueryIR-based search architecture.
- Add gitRevision, since, until parameters to SearchOptions
- Implement temporal repo filtering by indexedAt field
- Add branch filtering via QueryIR wrapper
- Add search_commits MCP tool for commit-based searches
- Update list_repos with activeAfter/activeBefore filtering
- Add 88 new tests (all passing)
Signed-off-by: Wayne Sun <gsun@redhat.com>
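
The commit message mentions ISO 8601 and relative date inputs but this excerpt does not show the parser itself. The following is a minimal sketch of how such inputs could be normalized, not the implementation shipped in this commit. The name `parseTemporalDate`, the supported units, and the readings of "yesterday" (24 hours before now) and "last week" (7 days before now) are all assumptions made for illustration.

```typescript
// Hypothetical helper, NOT the parser from this commit.
// Millisecond lengths for the relative units this sketch understands.
const UNIT_MS = {
    minute: 60_000,
    hour: 3_600_000,
    day: 86_400_000,
    week: 604_800_000,
} as const;

type Unit = keyof typeof UNIT_MS;

function parseTemporalDate(input: string, now: Date = new Date()): Date | undefined {
    const text = input.trim().toLowerCase();

    // Assumption: named shortcuts are fixed offsets from `now`.
    if (text === "yesterday") return new Date(now.getTime() - UNIT_MS.day);
    if (text === "last week") return new Date(now.getTime() - UNIT_MS.week);

    // "<N> <unit>s ago", e.g. "30 days ago".
    const relative = /^(\d+)\s+(minute|hour|day|week)s?\s+ago$/.exec(text);
    if (relative) {
        const amount = Number(relative[1]);
        const unit = relative[2] as Unit;
        return new Date(now.getTime() - amount * UNIT_MS[unit]);
    }

    // Otherwise accept only strings that start like an ISO 8601 date.
    if (!/^\d{4}-\d{2}-\d{2}(?:[T ].*)?$/.test(text)) return undefined;
    const parsed = new Date(input.trim());
    return Number.isNaN(parsed.getTime()) ? undefined : parsed;
}
```

With `now` fixed at `2024-06-15T00:00:00Z`, `parseTemporalDate("30 days ago", now)` yields `2024-05-16T00:00:00.000Z`, and an unrecognized string such as `"soon"` yields `undefined`, so the caller can decide how to report the error.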
CHANGELOG.md: 1 addition, 0 deletions
@@ -10,6 +10,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
 <!-- Bump @sourcebot/mcp since there are breaking changes to the api in this release -->
 
 ### Added
+- Added temporal filtering to search and repository APIs with support for git branch/revision filtering and repository index date filtering (since/until parameters). Supports both ISO 8601 and relative date formats (e.g., "30 days ago", "last week").
 - Added support for streaming code search results. [#623](https://github.com/sourcebot-dev/sourcebot/pull/623)
 - Added buttons to toggle case sensitivity and regex patterns. [#623](https://github.com/sourcebot-dev/sourcebot/pull/623)
 - Added counts to members, requets, and invites tabs in the members settings. [#621](https://github.com/sourcebot-dev/sourcebot/pull/621)
packages/mcp/CHANGELOG.md: 16 additions, 0 deletions
@@ -7,6 +7,22 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
 
 ## [Unreleased]
 
+### Added
+- Added comprehensive relative date support for all temporal parameters (e.g., "30 days ago", "last week", "yesterday")
+- Added `search_commits` tool to search commits by actual commit time with full temporal filtering. Accepts both numeric database IDs (e.g., 123) and string repository names (e.g., "github.com/owner/repo") for the `repoId` parameter, allowing direct use of repository names from `list_repos` output
+- Added `since`/`until` parameters to `search_code` (filters by index time - when Sourcebot indexed the repo)
+- Added `gitRevision` parameter to `search_code`
+- Added `activeAfter`/`activeBefore` parameters to `list_repos` (filters by index time - when Sourcebot indexed the repo)
+- Added date range validation to prevent invalid date ranges (since > until)
+- Added 30-second timeout for git operations to handle large repositories
+- Added enhanced error messages for git operations (timeout, repository not found, invalid git repository, ambiguous arguments)
+- Added clarification that repositories must be cloned on Sourcebot server disk for `search_commits` to work
+- Added comprehensive temporal parameter documentation to README with clear distinction between index time and commit time filtering
+- Added comprehensive unit tests for date parsing utilities (90+ test cases)
+- Added unit tests for git commit search functionality with mocking
+- Added integration tests for temporal parameter validation
+- Added unit tests for repository identifier resolution (both string and number types)
+
 ### Changed
 - Updated API client to match the latest Sourcebot release. [#555](https://github.com/sourcebot-dev/sourcebot/pull/555)
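
The changelog entry on date range validation (rejecting `since > until`) can be illustrated with a short sketch. This is an assumption about the shape of the check, not the code from this commit; the function name `validateDateRange` and the error message are hypothetical.

```typescript
// Hypothetical helper, NOT the validator from this commit.
// Throws when both bounds are present and `since` is later than `until`.
function validateDateRange(since?: Date, until?: Date): void {
    if (since && until && since.getTime() > until.getTime()) {
        throw new RangeError(
            `Invalid date range: since (${since.toISOString()}) is after until (${until.toISOString()})`,
        );
    }
}
```

Running the check after both inputs have been resolved to absolute dates means a mixed pair such as an ISO `since` and a relative `until` is compared on the same footing. A missing bound is treated as open-ended in this sketch.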
packages/mcp/README.md: 47 additions, 5 deletions
@@ -166,6 +166,8 @@ For a more detailed guide, checkout [the docs](https://docs.sourcebot.dev/docs/f
 
 Fetches code that matches the provided regex pattern in `query`.
 
+**Temporal Filtering**: Use `since` and `until` to filter by repository index time (when Sourcebot last indexed the repo). This is different from commit time. See `search_commits` for commit-time filtering.
+
 <details>
 <summary>Parameters</summary>
 
@@ -176,6 +178,9 @@ Fetches code that matches the provided regex pattern in `query`.
 | `filterByLanguages` | no | Restrict search to specific languages (GitHub linguist format, e.g., Python, JavaScript). |
 | `caseSensitive` | no | Case sensitive search (default: false). |
 | `includeCodeSnippets` | no | Include code snippets in results (default: false). |
+| `gitRevision` | no | Git revision to search (e.g., 'main', 'develop', 'v1.0.0'). Defaults to HEAD. |
+| `since` | no | Only search repos indexed after this date. Supports ISO 8601 or relative (e.g., "30 days ago"). |
+| `until` | no | Only search repos indexed before this date. Supports ISO 8601 or relative (e.g., "yesterday"). |
 | `maxTokens` | no | Max tokens to return (default: env.DEFAULT_MINIMUM_TOKENS). |
 </details>
 
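
The `since`/`until` rows above filter by index time, not commit time. A minimal sketch of that idea follows, assuming each repository record carries an `indexedAt` timestamp (the commit message names this field). The `IndexedRepo` type, the function name, the inclusive bounds, and the handling of never-indexed repositories are assumptions for illustration only.

```typescript
// Hypothetical record shape; only `indexedAt` is named in the commit message.
interface IndexedRepo {
    name: string;
    indexedAt: Date | null; // null: not indexed yet (assumption)
}

// Keeps repositories whose last index time falls inside [since, until].
// Assumption: bounds are inclusive and never-indexed repos are dropped
// as soon as any bound is supplied.
function filterByIndexTime(repos: IndexedRepo[], since?: Date, until?: Date): IndexedRepo[] {
    if (!since && !until) return repos;
    return repos.filter(({ indexedAt }) => {
        if (indexedAt === null) return false;
        const t = indexedAt.getTime();
        if (since && t < since.getTime()) return false;
        if (until && t > until.getTime()) return false;
        return true;
    });
}
```

Because the filter looks only at `indexedAt`, a repository with old commits still passes `since: "30 days ago"` if Sourcebot re-indexed it recently, which is the distinction the README draws against `search_commits`.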
@@ -184,14 +189,18 @@ Fetches code that matches the provided regex pattern in `query`.
 
 Lists repositories indexed by Sourcebot with optional filtering and pagination.
 
+**Temporal Filtering**: Use `activeAfter` and `activeBefore` to filter by repository index time (when Sourcebot last indexed the repo). This is the same filtering behavior as `search_code`'s `since`/`until` parameters.
| `query` | no | Filter repositories by name (case-insensitive). |
+| `pageNumber` | no | Page number (1-indexed, default: 1). |
+| `limit` | no | Number of repositories per page (default: 50). |
+| `activeAfter` | no | Only return repos indexed after this date. Supports ISO 8601 or relative (e.g., "30 days ago"). |
+| `activeBefore` | no | Only return repos indexed before this date. Supports ISO 8601 or relative (e.g., "yesterday"). |
 
 </details>
 
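
The `pageNumber` and `limit` rows describe 1-indexed pagination with defaults of 1 and 50. A small sketch of that arithmetic is shown here; the `paginate` helper is hypothetical and the excerpt does not show how the server actually slices results.

```typescript
// Hypothetical helper illustrating 1-indexed pagination with the documented defaults.
function paginate<T>(items: T[], pageNumber: number = 1, limit: number = 50): T[] {
    if (!Number.isInteger(pageNumber) || pageNumber < 1) {
        throw new RangeError("pageNumber must be a positive integer");
    }
    if (!Number.isInteger(limit) || limit < 1) {
        throw new RangeError("limit must be a positive integer");
    }
    // Page 1 starts at offset 0, page 2 at offset `limit`, and so on.
    const start = (pageNumber - 1) * limit;
    return items.slice(start, start + limit);
}
```

In this sketch any temporal filter would be applied before slicing, so page boundaries are computed over the filtered list; whether the real tool orders the steps this way is an assumption.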
@@ -208,6 +217,39 @@ Fetches the source code for a given file.
 | `repoId` | yes | The Sourcebot repository ID. |
 </details>
 
+### search_commits
+
+Searches for commits in a specific repository based on actual commit time (NOT index time).
+
+**Requirements**: Repository must be cloned on the Sourcebot server disk. Sourcebot automatically clones repositories during indexing, but the cloning process may not be finished when this query is executed. Use `list_repos` first to get the repository ID.
+
+**Date Formats**: Supports ISO 8601 dates (e.g., "2024-01-01") and relative formats (e.g., "30 days ago", "last week", "yesterday").
+| `repoId` | yes | Repository identifier: either numeric database ID (e.g., 123) or full repository name (e.g., "github.com/owner/repo") as returned by `list_repos`. |
+| `query` | no | Search query to filter commits by message (case-insensitive). |
+| `since` | no | Show commits after this date (by commit time). Supports ISO 8601 or relative formats. |
+| `until` | no | Show commits before this date (by commit time). Supports ISO 8601 or relative formats. |
+| `author` | no | Filter by author name or email (supports partial matches). |
+| `maxCount` | no | Maximum number of commits to return (default: 50). |
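
The `search_commits` parameters map naturally onto `git log` options. The sketch below shows one way the mapping could look, together with a way to tell numeric IDs from repository names. It is an illustration under stated assumptions, not the handler from this commit: `buildGitLogArgs`, `toRepoRef`, and both type names are hypothetical, while the flags (`--since`, `--until`, `--author`, `--grep`, `--regexp-ignore-case`, `--max-count`) are standard `git log` options.

```typescript
// Hypothetical parameter shape mirroring the README table above.
interface CommitSearchParams {
    query?: string;
    since?: string;
    until?: string;
    author?: string;
    maxCount?: number;
}

// Builds an argument vector for `git log`. Returning an array (rather than a
// single command string) lets the caller pass it to a process API without a shell.
function buildGitLogArgs({ query, since, until, author, maxCount = 50 }: CommitSearchParams): string[] {
    const args = ["log", `--max-count=${maxCount}`];
    if (since) args.push(`--since=${since}`);
    if (until) args.push(`--until=${until}`);
    if (author) args.push(`--author=${author}`);
    // --regexp-ignore-case makes --grep (and --author) matching case-insensitive.
    if (query) args.push(`--grep=${query}`, "--regexp-ignore-case");
    return args;
}

// Distinguishes the two accepted forms of `repoId`.
type RepoRef = { kind: "id"; id: number } | { kind: "name"; name: string };

function toRepoRef(repoId: number | string): RepoRef {
    return typeof repoId === "number"
        ? { kind: "id", id: repoId }
        : { kind: "name", name: repoId };
}
```

For example, `buildGitLogArgs({ since: "30 days ago", author: "alice" })` returns `["log", "--max-count=50", "--since=30 days ago", "--author=alice"]`. Git's own date parser understands relative forms such as "30 days ago", so this sketch forwards the strings unchanged; the real tool may normalize them first.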
             .describe(`Whether to include the code snippets in the response (default: false). If false, only the file's URL, repository, and language will be returned. Set to false to get a more concise response.`)
             .optional(),
+        gitRevision: z
+            .string()
+            .describe(`The git revision to search in (e.g., 'main', 'HEAD', 'v1.0.0', 'a1b2c3d'). If not provided, defaults to the default branch (usually 'main' or 'master').`)
+            .optional(),
+        since: z
+            .string()
+            .describe(`Filter repositories by when they were last indexed by Sourcebot (NOT by commit time). Only searches in repos indexed after this date. Supports ISO 8601 (e.g., '2024-01-01') or relative formats (e.g., '30 days ago', 'last week', 'yesterday').`)
+            .optional(),
+        until: z
+            .string()
+            .describe(`Filter repositories by when they were last indexed by Sourcebot (NOT by commit time). Only searches in repos indexed before this date. Supports ISO 8601 (e.g., '2024-12-31') or relative formats (e.g., 'yesterday').`)
+            .optional(),
         maxTokens: numberSchema
             .describe(`The maximum number of tokens to return (default: ${env.DEFAULT_MINIMUM_TOKENS}). Higher values provide more context but consume more tokens. Values less than ${env.DEFAULT_MINIMUM_TOKENS} will be ignored.`)
query += ` ( repo:${repoIds.map(id => escapeStringRegexp(id)).join(' or repo:')} )`;
@@ -76,6 +91,9 @@ server.tool(
             contextLines: env.DEFAULT_CONTEXT_LINES,
             isRegexEnabled: true,
             isCaseSensitivityEnabled: caseSensitive,
+            gitRevision,
+            since,
+            until,
         });
 
         if (isServiceError(response)) {
@@ -160,16 +178,95 @@ server.tool(
     }
 );
 
+server.tool(
+    "search_commits",
+    `Searches for commits in a specific repository based on actual commit time (NOT index time).
+
+**Requirements**: The repository must be cloned on the Sourcebot server disk. Sourcebot automatically clones repositories during indexing, but the cloning process may not be finished when this query is executed. If the repository is not found on the server disk, an error will be returned asking you to try again later.
+
+**Date Formats**: Supports ISO 8601 (e.g., "2024-01-01") or relative formats (e.g., "30 days ago", "last week", "yesterday").
+
+**YOU MUST** call 'list_repos' first to obtain the exact repository ID.
+
+If you receive an error that indicates that you're not authenticated, please inform the user to set the SOURCEBOT_API_KEY environment variable.`,
+    {
+        repoId: z.union([z.number(), z.string()]).describe(`Repository identifier. Can be either:
+- Numeric database ID (e.g., 123)
+- Full repository name (e.g., "github.com/owner/repo") as returned by 'list_repos'
+
+**YOU MUST** call 'list_repos' first to obtain the repository identifier.`),
+        query: z.string().describe(`Search query to filter commits by message content (case-insensitive).`).optional(),
+        since: z.string().describe(`Show commits more recent than this date. Filters by actual commit time. Supports ISO 8601 (e.g., '2024-01-01') or relative formats (e.g., '30 days ago', 'last week').`).optional(),
+        until: z.string().describe(`Show commits older than this date. Filters by actual commit time. Supports ISO 8601 (e.g., '2024-12-31') or relative formats (e.g., 'yesterday').`).optional(),
+        author: z.string().describe(`Filter commits by author name or email (supports partial matches and patterns).`).optional(),
+        maxCount: z.number().describe(`Maximum number of commits to return (default: 50).`).optional(),
-    "Lists repositories in the organization with optional filtering and pagination. If you receive an error that indicates that you're not authenticated, please inform the user to set the SOURCEBOT_API_KEY environment variable.",
-    listReposRequestSchema.shape,
-    async ({ query, pageNumber = 1, limit = 50 }: {
+    `Lists repositories in the organization with optional filtering and pagination.
+
+**Temporal Filtering**: When using 'activeAfter' or 'activeBefore', only repositories indexed within the specified timeframe are returned. This filters by when Sourcebot last indexed the repository (indexedAt), NOT by git commit dates. For commit-time filtering, use 'search_commits'.
+
+**Date Formats**: Supports ISO 8601 (e.g., "2024-01-01") and relative dates (e.g., "30 days ago", "last week", "yesterday").
+
+If you receive an error that indicates that you're not authenticated, please inform the user to set the SOURCEBOT_API_KEY environment variable.`,
+    {
+        query: z
+            .string()
+            .describe("Filter repositories by name (case-insensitive).")
+            .optional(),
+        pageNumber: z
+            .number()
+            .int()
+            .positive()
+            .describe("Page number (1-indexed, default: 1)")
+            .default(1),
+        limit: z
+            .number()
+            .int()
+            .positive()
+            .describe("Number of repositories per page (default: 50)")
+            .default(50),
+        activeAfter: z
+            .string()
+            .describe("Only return repositories indexed after this date (filters by indexedAt). Supports ISO 8601 (e.g., '2024-01-01') or relative formats (e.g., '30 days ago', 'last week').")
+            .optional(),
+        activeBefore: z
+            .string()
+            .describe("Only return repositories indexed before this date (filters by indexedAt). Supports ISO 8601 (e.g., '2024-12-31') or relative formats (e.g., 'yesterday').")