-
-
Notifications
You must be signed in to change notification settings - Fork 69
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Check if file is ignored according to git #98
Conversation
doesn't launch the job if the file is ignored (this should be guarded by a configuration parameter)
To test this and have more insight about the exclusion process, launch it with info level log:
|
in the job configuration
src/ignorer.rs
Outdated
let index = worktree.index()?; | ||
|
||
// there doesn't seem to be any public API for looking at "excludes" without caching | ||
// so we create a cache |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Cache is probably a misnomer here as it doesn't cache anything - it merely builds state to be able to perform these lookups and be fast with non-random input ordered according to the git index.
I think I ran out of words as it contains an ignore stack
internally, and this is adding even more information to bring everything together.
|
||
/// Tell whether the given path is excluded according to | ||
/// either the global gitignore rules or the ones of the repository | ||
pub fn excludes(&mut self, file_path: &Path) -> Result<bool> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think as it stands, each path will trigger reading all gitignore files. They are indeed held in the excludes()
data structure and ideally it is kept. I see how this isn't possible right now, and believe that the current reference is likely a premature optimization rather than a necessity. This will change for sure - and it's done in the latest main, which should greatly improve performance as the cache can actually reuse state if it's kept around.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the current reference is likely a premature optimization rather than a necessity
I feel you. I fell in the same trap in my first libs too, and it's a pain to fix.
TBH checking whether a file is excluded takes less than 1 ms right now and that's fine for bacon.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I will think about this - the number one usage of lifetimes in structs is platforms, they add some data around an operation and perform it, keeping a reference to their originating Repository
. As long as these are basically free, I think they can be created on demand with the Repository
being cloned to where it is needed - that's the intent.
If they aren't free though, like the Cache
here, I think it's good advice to rather clone the Repository
into it to make it standalone, or do whatever else it takes. I think some useful rule emerges from this experience and I will put it into words in DEVELOPMENT.md
to make it official.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
BTW @Byron I don't know if you may find this interesting, but I've also implemented a gitignoring stack (stacking the parsed gitignore files as I imagine you do): https://github.com/Canop/broot/blob/main/src/git/ignore.rs#L155
This was done for broot with very specific performance concerns (breathfirst tree diving).
The reasons I didn't take that for bacon were
- I didn't want to implement myself looking for the global gitignore rules (I use git2 in broot for that)
- I wanted to try gitoxide for other programs of mine (and maybe replacing git2 with gitoxide in my current programs)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for sharing! I love the perceived simplicity of the git-ignore implementation, it fits in 240 lines after all :)!
With the Cache
type unchained from the lifetime in main
you would now be able to reuse it for each lookup and that should yield much better performance.
From a correctness point of view, it's probably (hopefully) a good idea to use gitoxide
even if the performance is just similar, as I tried my best to validate the implementation against git with many many test cases. Of course I hope you won't take my word for it and validate it yourself, gitoxide
strives to yield the same results as git
.
Thus, I hope you will end up using gitoxide
in more of your projects and if there is anything preventing that, I'd love to know to get a chance to fix it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
gitignoring is only part of using git, and I definitely don't want to expand into building a general git crate while there's already an ambitious project, so it's my clear intention to try and use gitoxide ;)
Related to #675 and Canop/bacon#98.
Doesn't launch the job if the file is ignored
There's a configuration parameter at the job level to disable this filtering.
Fix #32