6 changes: 2 additions & 4 deletions packages/backend/src/constants.ts
@@ -1,6 +1,5 @@
import { CodeHostType } from "@sourcebot/db";
import { env } from "@sourcebot/shared";
import path from "path";


export const SINGLE_TENANT_ORG_ID = 1;

@@ -9,8 +8,7 @@ export const PERMISSION_SYNC_SUPPORTED_CODE_HOST_TYPES: CodeHostType[] = [
    'gitlab',
];

export const REPOS_CACHE_DIR = path.join(env.DATA_CACHE_DIR, 'repos');
export const INDEX_CACHE_DIR = path.join(env.DATA_CACHE_DIR, 'index');
export { REPOS_CACHE_DIR, INDEX_CACHE_DIR } from "@sourcebot/shared";

// Maximum time to wait for current job to finish
export const GROUPMQ_WORKER_STOP_GRACEFUL_TIMEOUT_MS = 5 * 1000; // 5 seconds
16 changes: 16 additions & 0 deletions packages/mcp/CHANGELOG.md
@@ -7,6 +7,22 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0

## [Unreleased]

### Added
- Added comprehensive relative date support for all temporal parameters (e.g., "30 days ago", "last week", "yesterday")
- Added `search_commits` tool to search commits by actual commit time with full temporal filtering. Accepts both numeric database IDs (e.g., 123) and string repository names (e.g., "github.com/owner/repo") for the `repoId` parameter, allowing direct use of repository names from `list_repos` output
- Added `since`/`until` parameters to `search_code` (filters by index time - when Sourcebot indexed the repo)
- Added `gitRevision` parameter to `search_code`
- Added `activeAfter`/`activeBefore` parameters to `list_repos` (filters by commit time - actual git commit dates)
- Added date range validation to prevent invalid date ranges (since > until)
- Added 30-second timeout for git operations to handle large repositories
- Added enhanced error messages for git operations (timeout, repository not found, invalid git repository, ambiguous arguments)
- Added clarification that repositories must be cloned on Sourcebot server disk for `search_commits` to work
- Added comprehensive temporal parameter documentation to README with clear distinction between index time and commit time filtering
- Added comprehensive unit tests for date parsing utilities (90+ test cases)
- Added unit tests for git commit search functionality with mocking
- Added integration tests for temporal parameter validation
- Added unit tests for repository identifier resolution (both string and number types)

## [1.0.9] - 2025-11-17

### Added
52 changes: 47 additions & 5 deletions packages/mcp/README.md
@@ -166,6 +166,8 @@ For a more detailed guide, checkout [the docs](https://docs.sourcebot.dev/docs/f

Fetches code that matches the provided regex pattern in `query`.

**Temporal Filtering**: Use `since` and `until` to filter by repository index time (when Sourcebot last indexed the repo). This is different from commit time. See `search_commits` for commit-time filtering.

<details>
<summary>Parameters</summary>

@@ -176,6 +178,9 @@ Fetches code that matches the provided regex pattern in `query`.
| `filterByLanguages` | no | Restrict search to specific languages (GitHub linguist format, e.g., Python, JavaScript). |
| `caseSensitive` | no | Case sensitive search (default: false). |
| `includeCodeSnippets` | no | Include code snippets in results (default: false). |
| `gitRevision` | no | Git revision to search (e.g., 'main', 'develop', 'v1.0.0'). Defaults to HEAD. |
| `since` | no | Only search repos indexed after this date. Supports ISO 8601 or relative (e.g., "30 days ago"). |
| `until` | no | Only search repos indexed before this date. Supports ISO 8601 or relative (e.g., "yesterday"). |
| `maxTokens` | no | Max tokens to return (default: env.DEFAULT_MINIMUM_TOKENS). |
</details>

@@ -184,14 +189,18 @@ Fetches code that matches the provided regex pattern in `query`.

Lists repositories indexed by Sourcebot with optional filtering and pagination.

**Temporal Filtering**: Use `activeAfter` and `activeBefore` to filter repositories by actual commit activity (requires repositories to be cloned locally).

<details>
<summary>Parameters</summary>

| Name | Required | Description |
|:-------------|:---------|:--------------------------------------------------------------------|
| `query` | no | Filter repositories by name (case-insensitive). |
| `pageNumber` | no | Page number (1-indexed, default: 1). |
| `limit` | no | Number of repositories per page (default: 50). |
| Name | Required | Description |
|:----------------|:---------|:-----------------------------------------------------------------------------------------------|
| `query` | no | Filter repositories by name (case-insensitive). |
| `pageNumber` | no | Page number (1-indexed, default: 1). |
| `limit` | no | Number of repositories per page (default: 50). |
| `activeAfter` | no | Only return repos with commits after this date. Supports ISO 8601 or relative (e.g., "30 days ago"). |
| `activeBefore` | no | Only return repos with commits before this date. Supports ISO 8601 or relative (e.g., "yesterday"). |

</details>

@@ -208,6 +217,39 @@ Fetches the source code for a given file.
| `repoId` | yes | The Sourcebot repository ID. |
</details>

### search_commits

Searches for commits in a specific repository based on actual commit time (NOT index time).

**Requirements**: Repository must be cloned on the Sourcebot server disk. Sourcebot automatically clones repositories during indexing, but the cloning process may not be finished when this query is executed. Use `list_repos` first to get the repository ID.

**Date Formats**: Supports ISO 8601 dates (e.g., "2024-01-01") and relative formats (e.g., "30 days ago", "last week", "yesterday").

<details>
<summary>Parameters</summary>

| Name | Required | Description |
|:-----------|:---------|:-----------------------------------------------------------------------------------------------|
| `repoId` | yes | Repository identifier: either numeric database ID (e.g., 123) or full repository name (e.g., "github.com/owner/repo") as returned by `list_repos`. |
| `query` | no | Search query to filter commits by message (case-insensitive). |
| `since` | no | Show commits after this date (by commit time). Supports ISO 8601 or relative formats. |
| `until` | no | Show commits before this date (by commit time). Supports ISO 8601 or relative formats. |
| `author` | no | Filter by author name or email (supports partial matches). |
| `maxCount` | no | Maximum number of commits to return (default: 50). |

</details>
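
The parameters above line up closely with standard `git log` options. The sketch below shows one way such a request could be translated into `git log` arguments. It is an illustration only: the actual implementation is not part of the hunks shown here, and `CommitSearchOptions` and `buildGitLogArgs` are hypothetical names. The flags themselves (`--since`, `--until`, `--author`, `--grep`, `--regexp-ignore-case`, `--max-count`) are standard `git log` options, and git accepts relative dates such as "30 days ago" directly.

```typescript
// Hypothetical shape mirroring the documented search_commits parameters
// (repoId is omitted because it only selects the repository directory).
interface CommitSearchOptions {
    query?: string;
    since?: string;
    until?: string;
    author?: string;
    maxCount?: number;
}

// Assumption: commits are searched by running `git log` in the cloned repo.
// This helper only builds the argument list; it does not spawn a process.
function buildGitLogArgs(options: CommitSearchOptions): string[] {
    // The documented default for maxCount is 50.
    const args: string[] = ["log", `--max-count=${options.maxCount ?? 50}`];

    if (options.since) {
        args.push(`--since=${options.since}`);
    }
    if (options.until) {
        args.push(`--until=${options.until}`);
    }
    if (options.author) {
        // git matches --author as a pattern against "Name <email>".
        args.push(`--author=${options.author}`);
    }
    if (options.query) {
        // --regexp-ignore-case makes --grep (and --author) case-insensitive.
        args.push(`--grep=${options.query}`, "--regexp-ignore-case");
    }
    return args;
}

const example = buildGitLogArgs({ query: "fix", since: "30 days ago", author: "alice" });
console.log(example.join(" "));
```

Passing each value as a separate array element (rather than concatenating a shell string) avoids shell quoting problems when the arguments are handed to a process-spawning API.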

## Date Format Examples

All temporal parameters support:
- **ISO 8601**: `"2024-01-01"`, `"2024-12-31T23:59:59Z"`
- **Relative dates**: `"30 days ago"`, `"1 week ago"`, `"last month"`, `"yesterday"`

**Important**: The temporal filters use different timestamps depending on the tool:
- `search_code` `since`/`until`: Filters by **index time** (when Sourcebot indexed the repo)
- `list_repos` `activeAfter`/`activeBefore`: Filters by **commit time** (actual git commit dates)
- `search_commits` `since`/`until`: Filters by **commit time** (actual git commit dates)
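
To make the accepted formats concrete, here is a minimal sketch of how a relative or ISO 8601 string could be resolved to a `Date`, plus the `since > until` range check mentioned in the changelog. It is not the parser added by this PR: `parseTemporal` and `assertValidRange` are hypothetical names, only "yesterday", "N days ago", and "N weeks ago" are handled, and "yesterday" is simplified to "24 hours before now".

```typescript
const DAY_MS = 24 * 60 * 60 * 1000;

// Milliseconds per supported relative unit (assumption: only days and weeks).
const UNIT_MS: Record<string, number> = {
    day: DAY_MS,
    week: 7 * DAY_MS,
};

// Hypothetical helper: resolve a temporal parameter against a reference time.
function parseTemporal(input: string, now: Date = new Date()): Date {
    const text = input.trim().toLowerCase();

    if (text === "yesterday") {
        return new Date(now.getTime() - DAY_MS);
    }

    // Matches "<n> day(s) ago" and "<n> week(s) ago".
    const relative = /^(\d+)\s+(day|week)s?\s+ago$/.exec(text);
    if (relative) {
        const amount = Number(relative[1]);
        return new Date(now.getTime() - amount * UNIT_MS[relative[2]]);
    }

    // Fall back to the built-in Date parser, which accepts ISO 8601 strings
    // (it is more lenient than strict ISO 8601 in most engines).
    const parsed = new Date(input);
    if (Number.isNaN(parsed.getTime())) {
        throw new Error(`Unrecognized date: ${input}`);
    }
    return parsed;
}

// Hypothetical helper: reject ranges where since is later than until.
function assertValidRange(since?: Date, until?: Date): void {
    if (since && until && since.getTime() > until.getTime()) {
        throw new Error("Invalid date range: since must not be after until");
    }
}

const reference = new Date("2024-06-15T00:00:00Z");
console.log(parseTemporal("30 days ago", reference).toISOString());
```

Taking the reference time as a parameter keeps the function deterministic, which is what makes a large table of date-parsing test cases practical.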


## Supported Code Hosts
Sourcebot supports the following code hosts:
34 changes: 30 additions & 4 deletions packages/mcp/src/client.ts
@@ -1,6 +1,6 @@
import { env } from './env.js';
import { listRepositoriesResponseSchema, searchResponseSchema, fileSourceResponseSchema } from './schemas.js';
import { FileSourceRequest, FileSourceResponse, ListRepositoriesResponse, SearchRequest, SearchResponse, ServiceError } from './types.js';
import { listRepositoriesResponseSchema, searchResponseSchema, fileSourceResponseSchema, searchCommitsResponseSchema } from './schemas.js';
import { FileSourceRequest, FileSourceResponse, ListRepositoriesResponse, SearchRequest, SearchResponse, ServiceError, SearchCommitsRequest, SearchCommitsResponse } from './types.js';
import { isServiceError } from './utils.js';

export const search = async (request: SearchRequest): Promise<SearchResponse | ServiceError> => {
@@ -21,8 +21,16 @@ export const search = async (request: SearchRequest): Promise<SearchResponse | S
    return searchResponseSchema.parse(result);
}

export const listRepos = async (): Promise<ListRepositoriesResponse | ServiceError> => {
    const result = await fetch(`${env.SOURCEBOT_HOST}/api/repos`, {
export const listRepos = async (params?: { activeAfter?: string, activeBefore?: string }): Promise<ListRepositoriesResponse | ServiceError> => {
    const url = new URL(`${env.SOURCEBOT_HOST}/api/repos`);
    if (params?.activeAfter) {
        url.searchParams.append('activeAfter', params.activeAfter);
    }
    if (params?.activeBefore) {
        url.searchParams.append('activeBefore', params.activeBefore);
    }

    const result = await fetch(url.toString(), {
        method: 'GET',
        headers: {
            'Content-Type': 'application/json',
@@ -55,3 +63,21 @@ export const getFileSource = async (request: FileSourceRequest): Promise<FileSou

    return fileSourceResponseSchema.parse(result);
}

export const searchCommits = async (request: SearchCommitsRequest): Promise<SearchCommitsResponse | ServiceError> => {
    const result = await fetch(`${env.SOURCEBOT_HOST}/api/commits`, {
        method: 'POST',
        headers: {
            'Content-Type': 'application/json',
            'X-Org-Domain': '~',
            ...(env.SOURCEBOT_API_KEY ? { 'X-Sourcebot-Api-Key': env.SOURCEBOT_API_KEY } : {})
        },
        body: JSON.stringify(request)
    }).then(response => response.json());

    if (isServiceError(result)) {
        return result;
    }

    return searchCommitsResponseSchema.parse(result);
}
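
The query-string handling in the updated `listRepos` can be exercised in isolation. The sketch below extracts just the URL construction into a standalone function so the encoding is visible. `buildReposUrl` and the `http://localhost:3000` host are hypothetical and stand in for `env.SOURCEBOT_HOST`; the `URL` and `searchParams.append` calls are the same ones used in the diff.

```typescript
// Hypothetical extraction of the URL-building step from listRepos.
function buildReposUrl(
    host: string,
    params?: { activeAfter?: string; activeBefore?: string },
): string {
    const url = new URL(`${host}/api/repos`);

    // Only append filters that were actually provided, as in the diff.
    if (params?.activeAfter) {
        url.searchParams.append("activeAfter", params.activeAfter);
    }
    if (params?.activeBefore) {
        url.searchParams.append("activeBefore", params.activeBefore);
    }
    return url.toString();
}

// Spaces in relative dates are form-encoded as "+".
console.log(buildReposUrl("http://localhost:3000", { activeAfter: "30 days ago" }));
// prints "http://localhost:3000/api/repos?activeAfter=30+days+ago"

// With no filters, the URL is unchanged.
console.log(buildReposUrl("http://localhost:3000"));
// prints "http://localhost:3000/api/repos"
```

Using `searchParams.append` instead of string concatenation means relative dates containing spaces are encoded correctly without manual escaping; the receiving endpoint is assumed to decode them back before parsing.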