diff correctness #1106
Merged
Byron force-pushed the `gix-status` branch 9 times, most recently from f04cdb4 to 8d06b29 on November 18, 2023 20:50
Byron force-pushed the `gix-status` branch 2 times, most recently from d257099 to c4e4714 on November 24, 2023 20:11
…dHeader` trait. That way one can know its decompressed size and its kind. We also add a `FindObjectOrHeader` trait for use as `dyn` trait object that can find objects and access their headers.
Byron force-pushed the `gix-status` branch 7 times, most recently from 5ed5ce7 to 9faf3f3 on November 28, 2023 08:41
Note that this is also the minimal required version that is resolved with `cargo +nightly update -Z minimal-versions`, but it's nothing I could validate or reproduce myself just yet.
It makes it easier to manage a form of 'double buffering' for conditional alteration of a source buffer, and to implement conversion pipelines which conditionally transform an input over multiple steps.
It's required, but in practice has no effect as it's initialized at just the right time anyway, which is when it does matter. Also, re-export `gix_attributes as attributes` to allow using the types it mentions in the public API.
An attribute selection affects the initialization, hence it should be added first.
Otherwise, one cannot use `&dyn ` at all in this case as it's unsized. Additionally, rename top-level `pub use gix_glob` to `glob` to be in line with other public exports of this kind.
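The 'double buffering' mentioned in the commit messages above can be pictured roughly as follows. This is a minimal sketch under assumptions: `DoubleBuffer`, `apply`, `strip_bom` and `crlf_to_lf` are hypothetical names chosen for illustration and are not the API added by these commits.

```rust
/// Hypothetical illustration of 'double buffering': `src` always holds the
/// current content, `dest` is scratch space for the next conversion step.
#[derive(Default)]
pub struct DoubleBuffer {
    pub src: Vec<u8>,
    pub dest: Vec<u8>,
}

impl DoubleBuffer {
    /// Run one conversion step which reads `src` and may write to `dest`.
    /// If the step returns `true`, the buffers are swapped so that `src`
    /// holds the result. If it returns `false`, `src` is left untouched,
    /// which is the 'conditional alteration' case.
    pub fn apply(&mut self, step: impl FnOnce(&[u8], &mut Vec<u8>) -> bool) -> bool {
        self.dest.clear();
        let changed = step(&self.src, &mut self.dest);
        if changed {
            std::mem::swap(&mut self.src, &mut self.dest);
        }
        changed
    }
}

/// Example step (an assumption): drop a leading UTF-8 BOM if there is one.
pub fn strip_bom(input: &[u8], out: &mut Vec<u8>) -> bool {
    match input.strip_prefix(b"\xEF\xBB\xBF") {
        Some(rest) => {
            out.extend_from_slice(rest);
            true
        }
        None => false,
    }
}

/// Example step (an assumption): convert CRLF to LF, but only if CRLF occurs.
pub fn crlf_to_lf(input: &[u8], out: &mut Vec<u8>) -> bool {
    if !input.windows(2).any(|w| w == b"\r\n") {
        return false;
    }
    let mut i = 0;
    while i < input.len() {
        if input[i] == b'\r' && input.get(i + 1) == Some(&b'\n') {
            i += 1; // skip the '\r'; the '\n' is pushed by the next iteration
            continue;
        }
        out.push(input[i]);
        i += 1;
    }
    true
}
```

Chaining `buf.apply(strip_bom)` and `buf.apply(crlf_to_lf)` forms a small pipeline in which every step that has nothing to do leaves the source buffer as it is, and no step needs to allocate a fresh buffer.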
Byron force-pushed the `gix-status` branch 11 times, most recently from 5ab67b2 to 7aafd09 on December 2, 2023 16:24
…ersions and caching. The `Pipeline` provides ways to obtain content for use with a diffing algorithm, and the `Platform` is a way to cache such content to efficiently perform MxN matrix diffing, and possibly to prepare running external diff programs as well.
…timized diffing. Correctness is improved as all necessary transformations are now performed. Performance is improved by avoiding duplicate work by caching transformed diffable data for later reuse.
…memory diffing of combinations of resources. We also add the `object::tree::diff::Platform::for_each_to_obtain_tree_with_cache()` to pass a resource-cache for re-use between multiple invocations for significant savings.
That way it's conceivable that applications correctly run either a configured external diff tool, or one that is configured on a per diff-driver basis, while being allowed to fall back to a built-in implementation as needed.
It can handle it, so let's let it be a no-op.
…ersions. Previously it would just offer the git-ODB version of a blob for diffing, while it will now make it possible to apply all necessary conversion steps for you. This also moves `Event::diff()` to `Change::diff()`, adds `Repository::diff_resource_cache()` and refactors nearly everything about the `objects::blob::diff::Platform`.
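Why caching converted content pays off for MxN diffing can be sketched as below. Everything here is an assumption made for illustration: `ResourceCache`, `get_or_convert`, `normalize` and `count_differing_pairs` are hypothetical and do not mirror the `gix-diff` types, and the 'conversion' is a plain CRLF-to-LF replacement standing in for the real conversion steps.

```rust
use std::collections::HashMap;

/// Hypothetical cache of converted, diffable content, keyed by a resource id.
#[derive(Default)]
pub struct ResourceCache {
    converted: HashMap<String, String>,
    /// Counts how often a conversion actually had to run.
    pub conversions: usize,
}

impl ResourceCache {
    /// Return the converted content for `id`, running `convert` only on a cache miss.
    pub fn get_or_convert(&mut self, id: &str, convert: impl FnOnce() -> String) -> &str {
        if !self.converted.contains_key(id) {
            self.conversions += 1;
            self.converted.insert(id.to_owned(), convert());
        }
        &self.converted[id]
    }
}

/// Stand-in for the real conversion pipeline (an assumption).
fn normalize(raw: &str) -> String {
    raw.replace("\r\n", "\n")
}

/// Compare every resource in `left` with every resource in `right`.
/// Each entry is `(id, raw content)`. Returns the number of differing pairs.
pub fn count_differing_pairs(
    cache: &mut ResourceCache,
    left: &[(&str, &str)],
    right: &[(&str, &str)],
) -> usize {
    let mut differing = 0;
    for (lid, lraw) in left {
        for (rid, rraw) in right {
            // Copy the left side out so the cache can be borrowed again.
            let a = cache.get_or_convert(lid, || normalize(lraw)).to_owned();
            let b = cache.get_or_convert(rid, || normalize(rraw));
            if a != b {
                differing += 1;
            }
        }
    }
    differing
}
```

With M resources on one side and N on the other, each resource is converted at most once instead of once per pair, so the number of conversions is bounded by the number of distinct ids rather than by 2·M·N.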
Based on #1049
`diff-correctness` → `gix-status` → `gix reset`

Improve `gix status` to the point where it's suitable for use in `reset` functionality.

Leads to a proper worktree reset implementation, eventually leading to a high-level reset similar to how git supports it.
Architecture
The reason this PR deals quite a bit with `gix status` is that for a safe implementation of `reset()` we need to be sure that the files we would want to touch neither carry modifications nor are untracked files. In order to know what would need to be done, we have to diff the `current-index` with the `target-index`. The set of files to touch can then be used to look up information provided by `git-status`, like worktree modifications, index modifications, and untracked files, to know if we can proceed or not. Here is also where the reset-modes would affect the outcome, i.e. what to change and how.

This is a very modular approach which facilitates testing and understanding of what otherwise would be a very complex algorithm. Having a set of changes as output also makes it possible to parallelize applying these changes one day.
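The flow described above (diff the two indices, then consult status for the paths that would be touched) could look roughly like this. It is a sketch under assumptions only: `Index`, `Change`, `Status`, `diff_indices` and `plan_reset` are hypothetical names, an index is reduced to a map from path to blob id, and reset modes are not modelled.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// What a reset would have to do to a path (hypothetical).
#[derive(Debug, PartialEq, Eq)]
pub enum Change {
    Add,
    Remove,
    Modify,
}

/// Simplified index (an assumption): path -> blob id.
pub type Index = BTreeMap<String, String>;

/// Diff `current` against `target` to learn which paths a reset would touch.
pub fn diff_indices(current: &Index, target: &Index) -> BTreeMap<String, Change> {
    let mut changes = BTreeMap::new();
    for (path, id) in target {
        match current.get(path) {
            None => {
                changes.insert(path.clone(), Change::Add);
            }
            Some(cur) if cur != id => {
                changes.insert(path.clone(), Change::Modify);
            }
            Some(_) => {}
        }
    }
    for path in current.keys() {
        if !target.contains_key(path) {
            changes.insert(path.clone(), Change::Remove);
        }
    }
    changes
}

/// The parts of a status result that matter here (hypothetical).
#[derive(Default)]
pub struct Status {
    pub worktree_modified: BTreeSet<String>,
    pub index_modified: BTreeSet<String>,
    pub untracked: BTreeSet<String>,
}

/// Return the set of changes to apply, or the paths that block the reset.
pub fn plan_reset(
    current: &Index,
    target: &Index,
    status: &Status,
) -> Result<BTreeMap<String, Change>, Vec<String>> {
    let changes = diff_indices(current, target);
    let blocked: Vec<String> = changes
        .keys()
        .filter(|p| {
            status.worktree_modified.contains(*p)
                || status.index_modified.contains(*p)
                || status.untracked.contains(*p)
        })
        .cloned()
        .collect();
    if blocked.is_empty() {
        Ok(changes)
    } else {
        Err(blocked)
    }
}
```

Because `plan_reset` only inspects paths that appear in the diff, local modifications to files the reset would not touch never block it, and the returned map of changes is plain data that can be tested or applied independently.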
This leaves us in a situation where the current `checkout()` implementation wants to become a fastpath for situations where the reset involves an empty tree as source (i.e. create everything and overwrite local changes).

On the way to `reset()` it's a valid choice to warm up more with the matter by improving on the current `gix status` implementation and assure correctness of what's there, which currently doesn't seem to be the case in comparison. Further, implementing `gix status` similarly to `git status` should be made possible.

Tasks

Diff Correctness
- … `diff.driver` for documentation purposes (and a test).
- … `git` or to worktree + textconv - this is needed as depending on the storage location, different content is diffed or used as base.
  - rewrite-tracking uses what's stored inside of `git` (pretty sure)
  - user-diffing uses worktree + textconv, but only if textconv is specified
- … `diff.external` in config-tree, probably with key)
- … `gix` uses `gix-diff::blob::Platform` to properly do all conversions

Next PR: Gix Status
- … `git2` can do that. Needs generalization of what's available for `tree/tree` diffs, at least learn from it.
- … `gix` crate
- … `cat-file` equivalent, and possibly `textconv` conversions just like in `git cat-file`.
- diff tree with index (with reverse-diff functionality to simulate diff of index with tree), for better performance as it would avoid having to allocate a whole index even though we are only interested in a diff. Must include rename tracking.
Next PR: Reset
- … `reset()` that checks if a worktree modification is allowed, or if an entry should be skipped. That way we can postpone safety checks like `--hard`

Postponed
What follows is important for resets, but won't be needed for `cargo` worktree resets.

- `gix status` with actual submodule support - needs `status` in `gix` (crate) effectively
- `gix status` with actual conflict support

Research
- … `gix status` can deal a little better with submodules. Even though in this case a lot of submodule-related information is needed for a complete reset, probably only doable by a higher-level caller which orchestrates it.
- … `merge` and `keep`? How to control `refresh`? Maybe partial (only the files we touch), and full, to also update the files we don't touch as part of status? Maybe it's part of status if that is run before.
- … `git reset` and `git checkout` in terms of `HEAD` modifications. With the former changing `HEAD`'s referent, and the latter changing `HEAD` itself.
- … `checkout()` method as technically that's a `reset --hard` with optional overwrite check. Could it be rolled into one, with pathspec support added?
- … `reset()` performs just as well, which is unlikely as there is more overhead. But maybe it's not worth maintaining two versions for it. But if so, one should probably rename it.
- … `git status`: what about rename tracking? It's available for tree-diffs and quite complex on its own. Probably only needs HEAD-vs-index rename tracking. No, also can have worktree rename tracking, even though it's hard to imagine how this can be fast unless it's tightly integrated with untracked-files handling. This screams for a generalization of the tracking code though as the testing and implementation is complex, but should be generalisable.
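
The `HEAD` distinction between `git reset` and `git checkout` noted in the research list can be modelled with a toy reference store. This is only a sketch under assumptions: `Head`, `Refs`, `reset_to` and `checkout_branch` are hypothetical, commits are plain strings, and worktree or index updates are left out entirely.

```rust
use std::collections::BTreeMap;

/// `HEAD` either names a branch or points directly at a commit.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum Head {
    Symbolic(String),
    Detached(String),
}

/// Toy reference store (hypothetical): `HEAD` plus branch name -> commit id.
pub struct Refs {
    pub head: Head,
    pub branches: BTreeMap<String, String>,
}

impl Refs {
    /// Like a reset: `HEAD` keeps naming the same branch, but that branch
    /// (the referent of `HEAD`) is moved to `commit`.
    pub fn reset_to(&mut self, commit: &str) {
        match &mut self.head {
            Head::Symbolic(branch) => {
                self.branches.insert(branch.clone(), commit.to_owned());
            }
            Head::Detached(id) => *id = commit.to_owned(),
        }
    }

    /// Like checking out a branch: `HEAD` itself is rewritten to name
    /// `branch`, and no branch is moved. Returns `false` for unknown branches.
    pub fn checkout_branch(&mut self, branch: &str) -> bool {
        if self.branches.contains_key(branch) {
            self.head = Head::Symbolic(branch.to_owned());
            true
        } else {
            false
        }
    }
}
```

After `reset_to("c0")` on a `HEAD` that names `main`, `HEAD` still names `main` while `main` points at `c0`. After `checkout_branch("feature")`, `HEAD` names `feature` and both branches point where they did before.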