6 changes: 2 additions & 4 deletions packages/backend/src/constants.ts
@@ -1,6 +1,5 @@
import { CodeHostType } from "@sourcebot/db";
import { env } from "@sourcebot/shared";
import path from "path";


export const SINGLE_TENANT_ORG_ID = 1;

@@ -9,8 +8,7 @@ export const PERMISSION_SYNC_SUPPORTED_CODE_HOST_TYPES: CodeHostType[] = [
'gitlab',
];

export const REPOS_CACHE_DIR = path.join(env.DATA_CACHE_DIR, 'repos');
export const INDEX_CACHE_DIR = path.join(env.DATA_CACHE_DIR, 'index');
export { REPOS_CACHE_DIR, INDEX_CACHE_DIR } from "@sourcebot/shared";

// Maximum time to wait for current job to finish
export const GROUPMQ_WORKER_STOP_GRACEFUL_TIMEOUT_MS = 5 * 1000; // 5 seconds
16 changes: 16 additions & 0 deletions packages/mcp/CHANGELOG.md
@@ -7,6 +7,22 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0

## [Unreleased]

### Added
- Added comprehensive relative date support for all temporal parameters (e.g., "30 days ago", "last week", "yesterday")
- Added `search_commits` tool to search commits by actual commit time with full temporal filtering. Accepts both numeric database IDs (e.g., 123) and string repository names (e.g., "github.com/owner/repo") for the `repoId` parameter, allowing direct use of repository names from `list_repos` output
- Added `since`/`until` parameters to `search_code` (filters by index time - when Sourcebot indexed the repo)
- Added `gitRevision` parameter to `search_code`
- Added `activeAfter`/`activeBefore` parameters to `list_repos` (filters by commit time - actual git commit dates)
- Added date range validation to prevent invalid date ranges (since > until)
- Added 30-second timeout for git operations to handle large repositories
- Added enhanced error messages for git operations (timeout, repository not found, invalid git repository, ambiguous arguments)
- Added clarification that repositories must be cloned on Sourcebot server disk for `search_commits` to work
- Added comprehensive temporal parameter documentation to README with clear distinction between index time and commit time filtering
- Added comprehensive unit tests for date parsing utilities (90+ test cases)
- Added unit tests for git commit search functionality with mocking
- Added integration tests for temporal parameter validation
- Added unit tests for repository identifier resolution (both string and number types)

## [1.0.9] - 2025-11-17

### Added
52 changes: 47 additions & 5 deletions packages/mcp/README.md
@@ -166,6 +166,8 @@ For a more detailed guide, checkout [the docs](https://docs.sourcebot.dev/docs/f

Fetches code that matches the provided regex pattern in `query`.

**Temporal Filtering**: Use `since` and `until` to filter by repository index time (when Sourcebot last indexed the repo). This is different from commit time. See `search_commits` for commit-time filtering.

<details>
<summary>Parameters</summary>

@@ -176,6 +178,9 @@ Fetches code that matches the provided regex pattern in `query`.
| `filterByLanguages` | no | Restrict search to specific languages (GitHub linguist format, e.g., Python, JavaScript). |
| `caseSensitive` | no | Case sensitive search (default: false). |
| `includeCodeSnippets` | no | Include code snippets in results (default: false). |
| `gitRevision` | no | Git revision to search (e.g., 'main', 'develop', 'v1.0.0'). Defaults to HEAD. |
| `since` | no | Only search repos indexed after this date. Supports ISO 8601 or relative (e.g., "30 days ago"). |
| `until` | no | Only search repos indexed before this date. Supports ISO 8601 or relative (e.g., "yesterday"). |
| `maxTokens` | no | Max tokens to return (default: env.DEFAULT_MINIMUM_TOKENS). |
</details>

@@ -184,14 +189,18 @@ Fetches code that matches the provided regex pattern in `query`.

Lists repositories indexed by Sourcebot with optional filtering and pagination.

**Temporal Filtering**: Use `activeAfter` and `activeBefore` to filter repositories by actual commit activity (requires repositories to be cloned locally).

<details>
<summary>Parameters</summary>

| Name | Required | Description |
|:-------------|:---------|:--------------------------------------------------------------------|
| `query` | no | Filter repositories by name (case-insensitive). |
| `pageNumber` | no | Page number (1-indexed, default: 1). |
| `limit` | no | Number of repositories per page (default: 50). |
| Name | Required | Description |
|:----------------|:---------|:-----------------------------------------------------------------------------------------------|
| `query` | no | Filter repositories by name (case-insensitive). |
| `pageNumber` | no | Page number (1-indexed, default: 1). |
| `limit` | no | Number of repositories per page (default: 50). |
| `activeAfter` | no | Only return repos with commits after this date. Supports ISO 8601 or relative (e.g., "30 days ago"). |
| `activeBefore` | no | Only return repos with commits before this date. Supports ISO 8601 or relative (e.g., "yesterday"). |

</details>

@@ -208,6 +217,39 @@ Fetches the source code for a given file.
| `repoId` | yes | The Sourcebot repository ID. |
</details>

### search_commits

Searches for commits in a specific repository based on actual commit time (NOT index time).

**Requirements**: Repository must be cloned on the Sourcebot server disk. Sourcebot automatically clones repositories during indexing, but the cloning process may not be finished when this query is executed. Use `list_repos` first to get the repository ID.

**Date Formats**: Supports ISO 8601 dates (e.g., "2024-01-01") and relative formats (e.g., "30 days ago", "last week", "yesterday").

<details>
<summary>Parameters</summary>

| Name | Required | Description |
|:-----------|:---------|:-----------------------------------------------------------------------------------------------|
| `repoId` | yes | Repository identifier: either numeric database ID (e.g., 123) or full repository name (e.g., "github.com/owner/repo") as returned by `list_repos`. |
| `query` | no | Search query to filter commits by message (case-insensitive). |
| `since` | no | Show commits after this date (by commit time). Supports ISO 8601 or relative formats. |
| `until` | no | Show commits before this date (by commit time). Supports ISO 8601 or relative formats. |
| `author` | no | Filter by author name or email (supports partial matches). |
| `maxCount` | no | Maximum number of commits to return (default: 50). |

</details>
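
The parameter table above can be read as a request shape. The following is a minimal sketch of that shape, assuming the field names from the table; the `SearchCommitsParams` interface and the `toRepoLookup` and `withDefaults` helpers are hypothetical and are not taken from the PR's `types.ts`. It illustrates one way the numeric-or-string `repoId` and the documented `maxCount` default of 50 might be handled.

```typescript
// Hypothetical request shape; field names mirror the README parameter table.
interface SearchCommitsParams {
    repoId: number | string;
    query?: string;
    since?: string;
    until?: string;
    author?: string;
    maxCount?: number;
}

// Hypothetical tagged union describing how a repository would be looked up.
type RepoLookup =
    | { by: 'id'; id: number }
    | { by: 'name'; name: string };

// Numbers are treated as database IDs, strings as full repository names
// such as "github.com/owner/repo".
function toRepoLookup(repoId: number | string): RepoLookup {
    return typeof repoId === 'number'
        ? { by: 'id', id: repoId }
        : { by: 'name', name: repoId };
}

// Applies the default documented in the table (maxCount: 50).
function withDefaults(params: SearchCommitsParams): SearchCommitsParams & { maxCount: number } {
    return { ...params, maxCount: params.maxCount ?? 50 };
}

const request = withDefaults({
    repoId: 'github.com/owner/repo',
    since: '30 days ago',
    author: 'alice',
});
console.log(toRepoLookup(request.repoId), request.maxCount);
```

Discriminating on `typeof repoId` keeps the caller-facing parameter simple while leaving the server free to resolve names and IDs through different code paths.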

## Date Format Examples

All temporal parameters support:
- **ISO 8601**: `"2024-01-01"`, `"2024-12-31T23:59:59Z"`
- **Relative dates**: `"30 days ago"`, `"1 week ago"`, `"last month"`, `"yesterday"`

**Important**: Temporal filters mean different things depending on the tool:
- `search_code` `since`/`until`: Filters by **index time** (when Sourcebot indexed the repo)
- `list_repos` `activeAfter`/`activeBefore`: Filters by **commit time** (actual git commit dates)
- `search_commits` `since`/`until`: Filters by **commit time** (actual git commit dates)

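The date formats and the `since > until` validation described above could be implemented roughly as follows. This is a sketch under stated assumptions, not the PR's date-parsing utility: `parseTemporal` and `assertValidRange` are hypothetical names, "yesterday" is assumed to mean exactly 24 hours before now, and calendar-dependent phrases such as "last month" are deliberately left out.

```typescript
// Hypothetical unit table; months are omitted because their length varies.
const UNIT_MS = new Map<string, number>([
    ['minute', 60_000],
    ['hour', 3_600_000],
    ['day', 86_400_000],
    ['week', 604_800_000],
]);

// Resolves a relative phrase or an ISO 8601 string to a Date.
// `now` is injectable so that results are deterministic in tests.
function parseTemporal(input: string, now: Date = new Date()): Date {
    const text = input.trim().toLowerCase();

    // Assumption: fixed offsets rather than calendar boundaries.
    if (text === 'yesterday') {
        return new Date(now.getTime() - 86_400_000);
    }
    if (text === 'last week') {
        return new Date(now.getTime() - 604_800_000);
    }

    // "<n> <unit>(s) ago", e.g. "30 days ago" or "1 week ago".
    const match = /^(\d+)\s+(minute|hour|day|week)s?\s+ago$/.exec(text);
    if (match) {
        const unitMs = UNIT_MS.get(match[2] ?? '');
        if (unitMs !== undefined) {
            return new Date(now.getTime() - Number(match[1]) * unitMs);
        }
    }

    // Fall back to the built-in parser for ISO 8601 input.
    const parsed = new Date(input);
    if (Number.isNaN(parsed.getTime())) {
        throw new Error(`Unrecognized date: ${input}`);
    }
    return parsed;
}

// Rejects ranges where `since` resolves to a later instant than `until`.
function assertValidRange(since?: string, until?: string, now: Date = new Date()): void {
    if (since === undefined || until === undefined) {
        return;
    }
    if (parseTemporal(since, now).getTime() > parseTemporal(until, now).getTime()) {
        throw new Error('Invalid date range: since must not be after until');
    }
}

const reference = new Date('2024-01-31T00:00:00Z');
console.log(parseTemporal('30 days ago', reference).toISOString());
assertValidRange('30 days ago', 'yesterday', reference);
```

Resolving both bounds against the same `now` matters: if each bound were parsed with its own clock reading, two relative phrases could drift apart by a few milliseconds and make the comparison flaky.
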

## Supported Code Hosts
Sourcebot supports the following code hosts:
34 changes: 30 additions & 4 deletions packages/mcp/src/client.ts
@@ -1,6 +1,6 @@
import { env } from './env.js';
import { listRepositoriesResponseSchema, searchResponseSchema, fileSourceResponseSchema } from './schemas.js';
import { FileSourceRequest, FileSourceResponse, ListRepositoriesResponse, SearchRequest, SearchResponse, ServiceError } from './types.js';
import { listRepositoriesResponseSchema, searchResponseSchema, fileSourceResponseSchema, searchCommitsResponseSchema } from './schemas.js';
import { FileSourceRequest, FileSourceResponse, ListRepositoriesResponse, SearchRequest, SearchResponse, ServiceError, SearchCommitsRequest, SearchCommitsResponse } from './types.js';
import { isServiceError } from './utils.js';

export const search = async (request: SearchRequest): Promise<SearchResponse | ServiceError> => {
@@ -21,8 +21,16 @@ export const search = async (request: SearchRequest): Promise<SearchResponse | S
return searchResponseSchema.parse(result);
}

export const listRepos = async (): Promise<ListRepositoriesResponse | ServiceError> => {
const result = await fetch(`${env.SOURCEBOT_HOST}/api/repos`, {
export const listRepos = async (params?: { activeAfter?: string, activeBefore?: string }): Promise<ListRepositoriesResponse | ServiceError> => {
const url = new URL(`${env.SOURCEBOT_HOST}/api/repos`);
if (params?.activeAfter) {
url.searchParams.append('activeAfter', params.activeAfter);
}
if (params?.activeBefore) {
url.searchParams.append('activeBefore', params.activeBefore);
}

const result = await fetch(url.toString(), {
method: 'GET',
headers: {
'Content-Type': 'application/json',
@@ -55,3 +63,21 @@ export const getFileSource = async (request: FileSourceRequest): Promise<FileSou

return fileSourceResponseSchema.parse(result);
}

export const searchCommits = async (request: SearchCommitsRequest): Promise<SearchCommitsResponse | ServiceError> => {
const result = await fetch(`${env.SOURCEBOT_HOST}/api/commits`, {
method: 'POST',
headers: {
'Content-Type': 'application/json',
'X-Org-Domain': '~',
...(env.SOURCEBOT_API_KEY ? { 'X-Sourcebot-Api-Key': env.SOURCEBOT_API_KEY } : {})
},
body: JSON.stringify(request)
}).then(response => response.json());

if (isServiceError(result)) {
return result;
}

return searchCommitsResponseSchema.parse(result);
}
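
The new `listRepos` builds its query string with `URL` and `searchParams.append`, so relative dates containing spaces are encoded automatically. The sketch below isolates that URL construction in a standalone function so the encoding can be seen without a running server; `buildReposUrl` and the `http://localhost:3000` host are hypothetical stand-ins for the client code and `env.SOURCEBOT_HOST`.

```typescript
// Hypothetical helper mirroring the URL construction in listRepos.
function buildReposUrl(
    host: string,
    params?: { activeAfter?: string; activeBefore?: string },
): string {
    const url = new URL(`${host}/api/repos`);
    // Only set filters the caller actually provided.
    if (params?.activeAfter) {
        url.searchParams.append('activeAfter', params.activeAfter);
    }
    if (params?.activeBefore) {
        url.searchParams.append('activeBefore', params.activeBefore);
    }
    return url.toString();
}

// Spaces in relative dates are form-encoded as "+".
console.log(buildReposUrl('http://localhost:3000', { activeAfter: '30 days ago' }));
// With no filters the URL carries no query string at all.
console.log(buildReposUrl('http://localhost:3000'));
```

Letting `URLSearchParams` do the encoding avoids hand-built query strings, which tend to break on inputs such as `"2024-12-31T23:59:59Z"` or phrases with spaces.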