Allow quick & shallow copies of Git repos to select which repos to index #1115

Merged
rsdy merged 11 commits into main from rsdy/blo-1818-includeexclude-directories-from-index
Nov 21, 2023

Conversation

@rsdy (Contributor) commented Nov 6, 2023

Initiate the clone with:

; curl -D - "localhost:7878/api/repos/sync?repo=github.com/bloopai/bloop&shallow=true"

This creates a shallow clone (depth of 1 commit) and a fast index of the repository, without reading the contents of any of the files.
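
For scripting, the shallow sync URL can also be assembled programmatically. This is a minimal sketch, assuming the server listens on localhost:7878 as in the example; `shallow_sync_url` is a hypothetical helper and not part of bloop. It percent-encodes the slashes in the repo reference, which is the form the PATCH example in this description uses.

```python
from urllib.parse import urlencode


def shallow_sync_url(repo: str, base: str = "http://localhost:7878") -> str:
    """Build the URL for a shallow sync request (hypothetical helper)."""
    # urlencode percent-encodes the slashes in the repo reference.
    query = urlencode({"repo": repo, "shallow": "true"})
    return f"{base}/api/repos/sync?{query}"


print(shallow_sync_url("github.com/bloopai/bloop"))
```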

After this process is done, the repository ends up in a Shallow state, which means it won't automatically be refreshed until the file filters are configured.

To configure file filters, use the following call:

; curl 'http://127.0.0.1:7878/api/repos/indexed?repo=github.com%2Fbloopai%2Fbloop' \
-X 'PATCH' \
-H 'Content-Type: application/json' \
-H 'Accept: application/json, text/plain, */*' \
--data-binary '{"file_filter": {"rules": [{"include_file": "flake.lock"}, {"include_regex": "^server/"}]}}'

The first call to this endpoint automatically creates a Git clone 1000 commits deep, and allows the repo to be synced fully with the configured file filters.
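
The same PATCH request can be prepared from Python's standard library. This is a sketch under the assumption that the endpoint, headers and payload shape are exactly those of the curl call; `file_filter_request` is a hypothetical helper name. It only builds the request object and does not send it.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request


def file_filter_request(repo, rules, base="http://127.0.0.1:7878"):
    """Prepare (but do not send) the PATCH that sets file filters."""
    url = f"{base}/api/repos/indexed?{urlencode({'repo': repo})}"
    # Payload shape copied from the curl example: {"file_filter": {"rules": [...]}}
    body = json.dumps({"file_filter": {"rules": rules}}).encode("utf-8")
    headers = {
        "Content-Type": "application/json",
        "Accept": "application/json, text/plain, */*",
    }
    return Request(url, data=body, headers=headers, method="PATCH")


req = file_filter_request(
    "github.com/bloopai/bloop",
    [{"include_file": "flake.lock"}, {"include_regex": "^server/"}],
)
# Passing req to urllib.request.urlopen would send it to a running server.
```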

Note that file filters are additive, and prioritise inclusion over exclusion. In other words, once something has been added to the index, it can't be marked for removal from the index without removing the repository completely!
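
To make "inclusion over exclusion" concrete, here is a toy model of how a rule list could be evaluated. It is an illustration only and not the code in bleep. The rule keys come from the examples in this thread; exact-path matching for the `*_file` rules, unanchored regex search for the `*_regex` rules, and the `default` for paths that match no rule are all assumptions.

```python
import re


def is_indexed(path: str, rules: list[dict[str, str]], default: bool = False) -> bool:
    """Toy model: any matching include rule wins over any exclude rule."""
    included = excluded = False
    for rule in rules:
        for kind, value in rule.items():
            # "include_regex" splits into action "include" and matcher "regex".
            action, _, matcher = kind.partition("_")
            if matcher == "file":
                hit = path == value  # assumption: exact path match
            else:
                hit = re.search(value, path) is not None
            if hit and action == "include":
                included = True
            elif hit and action == "exclude":
                excluded = True
    if included:
        return True
    if excluded:
        return False
    return default  # assumption: behaviour for unmatched paths


rules = [{"include_file": "flake.lock"}, {"include_regex": "^server/"}]
```

In this model, adding `{"exclude_regex": "^server/"}` to `rules` does not remove `server/` paths, because the include rule still matches them.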

@anastasiya1155 (Collaborator) commented

I tried it with your example query and also with the reverse one (exclude_file and exclude_regex). As far as I can see, nothing is indexed by default: only files matching 'include_regex' and 'include_file' were indexed, and with exclude_file and exclude_regex nothing was indexed at all. Can we change that so that everything is included by default and the user uses the 'exclude_' fields to ignore some files and folders?

@oppiliappan (Collaborator) left a comment

code changes look good to me, will give this a test.

server/bleep/src/repo.rs (resolved)
server/bleep/src/indexes/file.rs (resolved)
@rsdy force-pushed the rsdy/blo-1818-includeexclude-directories-from-index branch from 70d7895 to e2445b5 on November 17, 2023 12:31
@ggordonhall (Contributor) left a comment

Works well and code LGTM.

Before merging, can you make sure that the commits.rs flow only runs after we've done a deep clone of the repo? As is, it panics during the shallow clone step.

We don't need the other criterion, as the callee function will hit the
db to determine if there's work to be done.
@rsdy merged commit eac89c6 into main on Nov 21, 2023
5 checks passed
@rsdy deleted the rsdy/blo-1818-includeexclude-directories-from-index branch on November 21, 2023 12:28

4 participants