[GitHub] Add exclude.size property to the config #137

Merged: 3 commits, Dec 17, 2024
1 change: 1 addition & 0 deletions CHANGELOG.md
@@ -10,6 +10,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
### Added

- Added config option `settings.reindexInterval` and `settings.resyncInterval` to control how often the index should be re-indexed and re-synced. ([#134](https://github.com/sourcebot-dev/sourcebot/pull/134))
- Added `exclude.size` to the GitHub config to allow excluding repositories by size. ([#137](https://github.com/sourcebot-dev/sourcebot/pull/137))

## [2.6.2] - 2024-12-13

5 changes: 5 additions & 0 deletions demo-site-config.json
@@ -11,6 +11,11 @@
"token": {
    "env": "GITHUB_TOKEN"
},
"exclude": {
    "size": {
        "max": 1000000000 // Limit to 1GB
    }
},
"repos": [
    "torvalds/linux",
    "pytorch/pytorch",
38 changes: 38 additions & 0 deletions packages/backend/src/github.ts
@@ -22,6 +22,7 @@ type OctokitRepository = {
    forks_count?: number,
    archived?: boolean,
    topics?: string[],
    size?: number,
}

export const getGitHubReposFromConfig = async (config: GitHubConfig, signal: AbortSignal, ctx: AppContext) => {
@@ -94,6 +95,7 @@ export const getGitHubReposFromConfig = async (config: GitHubConfig, signal: Abo
        'zoekt.fork': marshalBool(repo.fork),
        'zoekt.public': marshalBool(repo.private === false)
    },
    // GitHub reports `size` in kilobytes, so convert to bytes. A missing or zero size is left undefined.
    sizeInBytes: repo.size ? repo.size * 1000 : undefined,
    branches: [],
    tags: [],
} satisfies GitRepository;
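
The conversion in the hunk above can be isolated as a small sketch. `toSizeInBytes` and `BYTES_PER_KILOBYTE` are hypothetical names that do not appear in the PR; the factor of 1000 simply mirrors the diff, and treating one kilobyte as 1000 bytes (rather than 1024) is an assumption carried over from it.

```typescript
// Hypothetical helper, not part of the PR: mirrors `repo.size ? repo.size * 1000 : undefined`.
// Assumption: one kilobyte is treated as 1000 bytes, the same factor the diff uses.
const BYTES_PER_KILOBYTE = 1000;

function toSizeInBytes(sizeInKilobytes?: number): number | undefined {
    // The truthiness check means both a missing size and a size of 0
    // produce `undefined`, so such repositories carry no size at all.
    return sizeInKilobytes ? sizeInKilobytes * BYTES_PER_KILOBYTE : undefined;
}

const twoKilobytes = toSizeInBytes(2);
const emptyRepo = toSizeInBytes(0);
const unknownSize = toSizeInBytes(undefined);
```

One consequence worth noting: because a size of 0 maps to `undefined`, empty repositories are indistinguishable from repositories with no reported size once they reach the size filter.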
@@ -121,6 +123,42 @@ export const getGitHubReposFromConfig = async (config: GitHubConfig, signal: Abo
        const topics = config.exclude.topics.map(topic => topic.toLowerCase());
        repos = excludeReposByTopic(repos, topics, logger);
    }

    if (config.exclude.size) {
        const min = config.exclude.size.min;
        const max = config.exclude.size.max;
        if (min) {
            repos = repos.filter((repo) => {
                // If we don't have a size, we can't filter by size.
                if (!repo.sizeInBytes) {
                    return true;
                }

                if (repo.sizeInBytes < min) {
                    logger.debug(`Excluding repo ${repo.name}. Reason: repo is less than \`exclude.size.min\`=${min} bytes.`);
                    return false;
                }

                return true;
            });
        }

        if (max) {
            repos = repos.filter((repo) => {
                // If we don't have a size, we can't filter by size.
                if (!repo.sizeInBytes) {
                    return true;
                }

                if (repo.sizeInBytes > max) {
                    logger.debug(`Excluding repo ${repo.name}. Reason: repo is greater than \`exclude.size.max\`=${max} bytes.`);
                    return false;
                }

                return true;
            });
        }
    }
}

logger.debug(`Found ${repos.length} total repositories.`);
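
The two filters above make the same keep/drop decisions as a single pass with both bounds. The following is a minimal sketch of that idea, not the PR's implementation: `excludeReposBySize`, `RepoLike`, and `SizeExclusion` are hypothetical names, and the types are cut-down stand-ins for the project's `GitRepository` and `GitHubConfig`.

```typescript
// Cut-down stand-ins for the project's types; only the fields used here.
interface RepoLike {
    name: string;
    sizeInBytes?: number;
}

interface SizeExclusion {
    min?: number;
    max?: number;
}

// Hypothetical helper: applies both bounds in one pass over the list.
function excludeReposBySize(
    repos: RepoLike[],
    limits: SizeExclusion,
    log: (message: string) => void = () => {},
): RepoLike[] {
    const { min, max } = limits;
    return repos.filter((repo) => {
        const size = repo.sizeInBytes;
        // Repositories without a known size (or with size 0) are always kept.
        if (!size) {
            return true;
        }
        // Both bounds are inclusive: a repo exactly at `min` or `max` is kept.
        if (min && size < min) {
            log(`Excluding repo ${repo.name}: smaller than ${min} bytes.`);
            return false;
        }
        if (max && size > max) {
            log(`Excluding repo ${repo.name}: larger than ${max} bytes.`);
            return false;
        }
        return true;
    });
}

const sample: RepoLike[] = [
    { name: "tiny", sizeInBytes: 500 },
    { name: "medium", sizeInBytes: 50_000 },
    { name: "huge", sizeInBytes: 2_000_000_000 },
    { name: "unknown" },
];
const keptNames = excludeReposBySize(sample, { min: 1_000, max: 1_000_000_000 })
    .map((repo) => repo.name);
```

With the sample data, only `medium` and `unknown` survive: `tiny` falls below `min`, `huge` exceeds `max`, and `unknown` is kept because it has no size to compare.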
13 changes: 13 additions & 0 deletions packages/backend/src/schemas/v2.ts
@@ -89,6 +89,19 @@ export interface GitHubConfig {
         * List of repository topics to exclude when syncing. Repositories that match one of the provided `topics` will be excluded from syncing. Glob patterns are supported.
         */
        topics?: string[];
        /**
         * Exclude repositories based on their disk usage. Note: the disk usage is calculated by GitHub and may not reflect the actual disk usage when cloned.
         */
        size?: {
            /**
             * Minimum repository size (in bytes) to sync (inclusive). Repositories less than this size will be excluded from syncing.
             */
            min?: number;
            /**
             * Maximum repository size (in bytes) to sync (inclusive). Repositories greater than this size will be excluded from syncing.
             */
            max?: number;
        };
    };
    revisions?: GitRevisions;
}
1 change: 1 addition & 0 deletions packages/backend/src/types.ts
@@ -9,6 +9,7 @@ interface BaseRepository {
    isArchived?: boolean;
    codeHost?: string;
    topics?: string[];
    sizeInBytes?: number;
}

export interface GitRepository extends BaseRepository {
15 changes: 15 additions & 0 deletions schemas/v2/index.json
@@ -171,6 +171,21 @@
            "examples": [
                ["tests", "ci"]
            ]
        },
        "size": {
            "type": "object",
            "description": "Exclude repositories based on their disk usage. Note: the disk usage is calculated by GitHub and may not reflect the actual disk usage when cloned.",
            "properties": {
                "min": {
                    "type": "integer",
                    "description": "Minimum repository size (in bytes) to sync (inclusive). Repositories less than this size will be excluded from syncing."
                },
                "max": {
                    "type": "integer",
                    "description": "Maximum repository size (in bytes) to sync (inclusive). Repositories greater than this size will be excluded from syncing."
                }
            },
            "additionalProperties": false
        }
    },
    "additionalProperties": false
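
The constraints the schema hunk declares for `size` (an object, integer-valued `min` and `max`, no additional properties) can be illustrated with a hand-written check. This is a sketch only: `validateSizeExclusion` is a hypothetical name that does not appear in the PR, and a project could instead run a JSON Schema validator against `schemas/v2/index.json`.

```typescript
// Hypothetical validator: hand-checks the constraints declared for `exclude.size`.
// Returns a list of human-readable problems; an empty list means the value passes.
function validateSizeExclusion(value: unknown): string[] {
    // "type": "object"
    if (typeof value !== "object" || value === null || Array.isArray(value)) {
        return ["size must be an object"];
    }

    const errors: string[] = [];
    for (const [key, entry] of Object.entries(value)) {
        if (key !== "min" && key !== "max") {
            // "additionalProperties": false
            errors.push(`unknown property "${key}"`);
        } else if (typeof entry !== "number" || !Number.isInteger(entry)) {
            // "type": "integer"
            errors.push(`"${key}" must be an integer`);
        }
    }
    return errors;
}

const validConfig = validateSizeExclusion({ max: 1000000000 });
const fractionalMax = validateSizeExclusion({ max: 1.5 });
const unknownKey = validateSizeExclusion({ limit: 1 });
```

As shown in the hunk, the schema does not require `min` to be less than or equal to `max`, and it does not reject negative values, so this sketch does not check for those either.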