octopus-merge (part 1: tree-editing) #1566

Merged
merged 17 commits into from
Sep 5, 2024

Conversation

Byron
Owner

@Byron Byron commented Aug 30, 2024

Implement an octopus merge based on trees, and (mostly) equivalent to merge-ORT in Git.

Related to gitbutlerapp/gitbutler#4793.

Tasks

  • tree-editor plumbing
  • cursor support for editor
  • prevent writing grossly invalid trees (\0 in filename; should we validate path components?)
  • benchmark with/without cursor
  • tree-editor in gix
    • proper validation with gix_validate::path::component()
    • convenience by default by making it easier to pass relative paths?
    • cursor support
    • write in-memory

Next PR

  • git2-equivalent merge (first version essentially, hoping it's not as complex as a full merge-ORT)

Research

Everything is about MergeORT.

  • it uses an empty tree if there is no merge-base - we must allow the same.
  • it allows for multiple merge-bases, creating a virtual one by merging all merge-bases together using the same algorithm, recursively.
  • merges can have conflicts without individual files being involved, for instance when directory renames clash
  • Must make sure that the possible types of conflicts are properly communicated, so that no information is lost
  • It puts conflict-markers in the result tree, with annotations to provide additional context
  • Need resolution configuration, see git2::MergeOptions.
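The first two research points (empty tree when there is no merge-base, and a virtual merge-base built from several merge-bases) can be illustrated with a minimal sketch. Everything here is a toy assumption and not part of gitoxide or Git: `Tree` is a flat map from path to blob id, `merge3` and `virtual_base` are hypothetical names, conflicts simply keep "ours" instead of writing conflict markers, and the bases are merged against an empty ancestor rather than their own recursively computed merge-base.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Toy tree: path -> blob id. Real trees are nested and carry entry modes.
type Tree = BTreeMap<String, String>;

/// Three-way merge of flat toy trees.
/// Returns the merged tree and the list of conflicting paths.
fn merge3(base: &Tree, ours: &Tree, theirs: &Tree) -> (Tree, Vec<String>) {
    let paths: BTreeSet<&String> = base.keys().chain(ours.keys()).chain(theirs.keys()).collect();
    let mut out = Tree::new();
    let mut conflicts = Vec::new();
    for path in paths {
        let (b, o, t) = (base.get(path), ours.get(path), theirs.get(path));
        let winner = if o == t {
            o // both sides agree (including "both deleted")
        } else if o == b {
            t // only their side changed
        } else if t == b {
            o // only our side changed
        } else {
            // Both sides changed differently. Simplification: record the
            // conflict and keep our side instead of writing conflict markers.
            conflicts.push(path.clone());
            o
        };
        if let Some(id) = winner {
            out.insert(path.clone(), id.clone());
        }
    }
    (out, conflicts)
}

/// Fold any number of merge-bases into one virtual base.
/// No merge-base at all yields the empty tree.
/// Simplification: the bases are merged against an empty ancestor.
fn virtual_base(bases: &[Tree]) -> Tree {
    let empty = Tree::new();
    let mut iter = bases.iter();
    let Some(first) = iter.next() else {
        return Tree::new();
    };
    iter.fold(first.clone(), |acc, next| merge3(&empty, &acc, next).0)
}
```

A caller would then run `merge3(&virtual_base(&bases), &ours, &theirs)` and surface the returned conflict list instead of discarding it.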

Handle Special Cases

  • A file was renamed differently
  • deal with "merge.directoryRenames"

Questions

Is git2::merge_trees() a trivial merge? Does it handle all the cases of MergeORT?

How does rename-tracking affect a tree-merge?

How is an octopus merge implemented, particularly with Merge ORT?

References

Otherwise, the `GIX_VERSION` environment variable is not available at build time,
which can lead to runtime errors.
The buffer will now be provided from the free-list of the repository.
…st`.

That way, one day we can turn this type into a compatible one that produces
different kinds of hashes as well.
@Byron Byron force-pushed the merge branch 2 times, most recently from 8e66591 to 2bfe350 Compare September 3, 2024 13:56
With it, it's easy to alter existing trees or build entirely new ones,
efficiently.
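To make the idea of path-based tree editing concrete, here is a minimal in-memory sketch. It is not the `gix-object` editor: `ToyEditor`, `Node`, `upsert` and `remove` are hypothetical names chosen for illustration, ids are plain strings, and nothing is hashed or written to an object database. It only shows how slash-separated paths can create intermediate trees on insertion and prune trees that become empty on removal, since Git does not track empty directories.

```rust
use std::collections::BTreeMap;

/// A toy tree node: either a blob id or a nested tree.
#[derive(Debug, PartialEq)]
enum Node {
    Blob(String),
    Tree(BTreeMap<String, Node>),
}

/// Hypothetical editor holding the root tree in memory.
#[derive(Debug, Default, PartialEq)]
struct ToyEditor {
    root: BTreeMap<String, Node>,
}

impl ToyEditor {
    /// Insert or overwrite the blob at `path`, creating intermediate trees.
    fn upsert(&mut self, path: &str, id: &str) {
        let mut components: Vec<&str> = path.split('/').collect();
        let leaf = components.pop().expect("split yields at least one item");
        let mut dir = &mut self.root;
        for name in components {
            let entry = dir
                .entry(name.to_string())
                .or_insert_with(|| Node::Tree(BTreeMap::new()));
            // A blob in the way of a directory is replaced by a tree.
            if matches!(*entry, Node::Blob(_)) {
                *entry = Node::Tree(BTreeMap::new());
            }
            dir = match entry {
                Node::Tree(children) => children,
                Node::Blob(_) => unreachable!("replaced above"),
            };
        }
        dir.insert(leaf.to_string(), Node::Blob(id.to_string()));
    }

    /// Remove the entry at `path` and prune trees that become empty.
    fn remove(&mut self, path: &str) {
        let components: Vec<&str> = path.split('/').collect();
        remove_in(&mut self.root, &components);
    }
}

fn remove_in(dir: &mut BTreeMap<String, Node>, components: &[&str]) {
    match components {
        [] => {}
        [leaf] => {
            dir.remove(*leaf);
        }
        [first, rest @ ..] => {
            let now_empty = match dir.get_mut(*first) {
                Some(Node::Tree(children)) => {
                    remove_in(children, rest);
                    children.is_empty()
                }
                _ => false,
            };
            if now_empty {
                dir.remove(*first);
            }
        }
    }
}
```

Adding `a/b/c` and `a/d` and then removing `a/b/c` leaves only `a/d`, because the intermediate tree `a/b` is pruned once it is empty.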
@Byron Byron force-pushed the merge branch 2 times, most recently from e1c4e66 to 6af2abc Compare September 3, 2024 19:44
@Byron Byron changed the title octopus-merge octopus-merge (part 1: tree-editing) Sep 3, 2024
@Byron Byron force-pushed the merge branch 6 times, most recently from cf15fd0 to e8a9531 Compare September 4, 2024 20:55
…y of entry names.

Previously, it would allow null-bytes in the name which would corrupt the written tree.
Now this is forbidden.
For some reason, it disallowed newlines, but that is now allowed as validation should
be handled on a higher level.
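Why a null byte corrupts a tree follows from Git's tree format, where each entry is serialized as `<octal mode> <name>\0<raw object id>`: a `\0` inside the name would end the name early. The sketch below illustrates this with a hypothetical check. `is_writable_entry_name` and `write_entry` are illustrative names, not the `gix-object` API, and rejecting empty names and `/` in addition to `\0` is an assumption made here.

```rust
/// Hypothetical minimal check for a single tree-entry name.
/// Assumption: besides `\0`, empty names and `/` are rejected too,
/// because `/` separates path components. Newlines are accepted.
fn is_writable_entry_name(name: &[u8]) -> bool {
    !name.is_empty() && !name.contains(&0) && !name.contains(&b'/')
}

/// Serialize one entry as `<octal mode> <name>\0<raw 20-byte object id>`.
fn write_entry(out: &mut Vec<u8>, mode: &str, name: &[u8], id: &[u8; 20]) -> Result<(), String> {
    if !is_writable_entry_name(name) {
        return Err(format!("invalid entry name: {:?}", String::from_utf8_lossy(name)));
    }
    out.extend_from_slice(mode.as_bytes());
    out.push(b' ');
    out.extend_from_slice(name);
    out.push(0); // terminates the name, hence no `\0` inside it
    out.extend_from_slice(id);
    Ok(())
}
```

Stricter rules, such as refusing `.git` as a component, would live in a higher layer.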
It's faster across the board.

```
❯ cargo bench -p gix-object@0.44.0 --bench edit-tree
   Compiling gix-object v0.44.0 (/Users/byron/dev/github.com/Byron/gitoxide/gix-object)
   Compiling gix-pack v0.53.0 (/Users/byron/dev/github.com/Byron/gitoxide/gix-pack)
   Compiling gix-odb v0.63.0 (/Users/byron/dev/github.com/Byron/gitoxide/gix-odb)
    Finished `bench` profile [optimized] target(s) in 5.97s
     Running benches/edit_tree.rs (target/release/deps/edit_tree-6af6651a1c453a05)
Gnuplot not found, using plotters backend
editor/small tree (empty -> full -> empty)
                        time:   [2.5972 µs 2.6019 µs 2.6075 µs]
                        thrpt:  [3.8351 Melem/s 3.8434 Melem/s 3.8503 Melem/s]
                 change:
                        time:   [-32.618% -32.355% -32.038%] (p = 0.00 < 0.05)
                        thrpt:  [+47.142% +47.831% +48.409%]
                        Performance has improved.
Found 14 outliers among 100 measurements (14.00%)
  13 (13.00%) high mild
  1 (1.00%) high severe
editor/deeply nested tree (empty -> full -> empty)
                        time:   [8.2019 µs 8.2079 µs 8.2145 µs]
                        thrpt:  [5.5998 Melem/s 5.6043 Melem/s 5.6084 Melem/s]
                 change:
                        time:   [-33.517% -33.377% -33.246%] (p = 0.00 < 0.05)
                        thrpt:  [+49.804% +50.099% +50.415%]
                        Performance has improved.
Found 13 outliers among 100 measurements (13.00%)
  8 (8.00%) high mild
  5 (5.00%) high severe

cursor/small tree (empty -> full -> empty)
                        time:   [2.6911 µs 2.6935 µs 2.6961 µs]
                        thrpt:  [3.7090 Melem/s 3.7127 Melem/s 3.7160 Melem/s]
                 change:
                        time:   [-33.881% -33.546% -33.225%] (p = 0.00 < 0.05)
                        thrpt:  [+49.757% +50.480% +51.242%]
                        Performance has improved.
Found 14 outliers among 100 measurements (14.00%)
  4 (4.00%) high mild
  10 (10.00%) high severe
cursor/deeply nested tree (empty -> full -> empty)
                        time:   [1.3616 µs 1.3631 µs 1.3649 µs]
                        thrpt:  [33.703 Melem/s 33.747 Melem/s 33.783 Melem/s]
                 change:
                        time:   [-40.063% -39.675% -39.234%] (p = 0.00 < 0.05)
                        thrpt:  [+64.566% +65.769% +66.843%]
                        Performance has improved.
Found 20 outliers among 100 measurements (20.00%)
  18 (18.00%) high mild
  2 (2.00%) high severe
```
Create a tree editor using `Tree::edit()` or `Repository::edit_tree(id)`.
@Byron Byron force-pushed the merge branch 2 times, most recently from 958edef to e738acc Compare September 5, 2024 17:46
An implementation of `Header`, `Write` and `Find`, that can optionally
write everything to an in-memory store, and if enabled, also read
objects back from there.

That way it can present a consistent view of objects from two locations.
The default object database changed to a version that allows
keeping objects in memory. This needs a mutable `Repository` instance
to set up.
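The overlay idea can be sketched in a few lines. This is a toy model, not the actual implementation of `Header`, `Write` and `Find`: `OverlayStore`, `toy_id` and `enable_memory` are hypothetical names, the "persistent" side is just another map standing in for the on-disk database, and ids come from the standard library's `DefaultHasher` rather than a Git object hash.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::hash::{Hash, Hasher};

/// Stand-in for an object id. Real ids are SHA-1 hashes of typed objects.
type Id = u64;

fn toy_id(data: &[u8]) -> Id {
    let mut hasher = DefaultHasher::new();
    data.hash(&mut hasher);
    hasher.finish()
}

/// Hypothetical store that redirects writes to memory once enabled.
#[derive(Default)]
struct OverlayStore {
    /// Stand-in for the on-disk object database.
    persistent: HashMap<Id, Vec<u8>>,
    /// In-memory objects. `None` means writes go to `persistent`.
    memory: Option<HashMap<Id, Vec<u8>>>,
}

impl OverlayStore {
    /// Start keeping newly written objects in memory only.
    fn enable_memory(&mut self) {
        self.memory.get_or_insert_with(HashMap::new);
    }

    /// Write an object and return its id.
    fn write(&mut self, data: &[u8]) -> Id {
        let id = toy_id(data);
        match self.memory.as_mut() {
            Some(mem) => mem.insert(id, data.to_vec()),
            None => self.persistent.insert(id, data.to_vec()),
        };
        id
    }

    /// Look in memory first, then fall back to the persistent side,
    /// so callers see one consistent view of both locations.
    fn find(&self, id: Id) -> Option<&[u8]> {
        self.memory
            .as_ref()
            .and_then(|mem| mem.get(&id))
            .or_else(|| self.persistent.get(&id))
            .map(Vec::as_slice)
    }
}
```

With such a layout, intermediate objects produced by a merge can be read back without ever touching the persistent side.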
@Byron Byron marked this pull request as ready for review September 5, 2024 20:33
@Byron Byron enabled auto-merge September 5, 2024 20:34
@Byron Byron merged commit d69c617 into main Sep 5, 2024
15 checks passed
@Byron Byron deleted the merge branch September 5, 2024 20:48
@Byron Byron mentioned this pull request Sep 7, 2024
26 tasks