Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix mergetool p4merge #174

Closed
wants to merge 2 commits into from
Closed

Fix mergetool p4merge #174

wants to merge 2 commits into from

Conversation

calle2010
Copy link

The merge tool p4merge fails in certain cases when there is no base version for a 3-way merge. In this case create_virtual_base is called to create a base file which contains only the common content of the files to be merged. This base file is presented in p4merge.

With this script https://gist.github.com/calle2010/b4910e9b8f0dbbf58fb6 I can replicate the issues in a repository with core.autocrlf=true.

There are two issues:

  1. Since commit c536c07 git-apply fails with invalid path if the path starts with a dot.
  2. If the files have CRLF line endings these will be included in the output of diff. This leads to error messages about trailing whitespace in git-apply.

Even though the patch resolves the issues I have some doubts about this patch:

  • Is the parameter expansion with # allowed in a Git bash script?
  • create_virtual_base seems to be used by mergetools/p4merge and git-merge-one-file.sh only. The latter is described as an sample script in the documentation of git-merge-index. So I hope I'm not in the risk of breaking anything very central here.

@dscho
Copy link
Member

dscho commented Jun 9, 2015

Nice Pull Request!

Is the parameter expansion with # allowed in a Git bash script?

Yes, it is.

create_virtual_base seems to be used by mergetools/p4merge and git-merge-one-file.sh only. The latter is described as an sample script in the documentation of git-merge-index. So I hope I'm not in the risk of breaking anything very central here

This one is more tricky: git-merge-one-file is actually used in git merge-octopus and git merge-resolve. While the latter is not used widely anymore (at least that I know of), the former is used by some.

Also, the --strip-trailing-cr option seems to be specific to GNU diff. While we have the luxury in Git for Windows to expect diff.exe to be of the GNU variety, the same does not hold true everywhere. For me, though, the bigger question is why the files that are passed into create_virtual_base do have carriage returns when they should not have them according to git apply. Maybe p4-merge should make sure that the input files do not have carriage returns?

Or is this the issue where working directory files differ in line ending convention from the files in the repository? In that case, I imagine that we would want p4-merge to apply the working directory convention when writing out the file contents from the repository?

@calle2010
Copy link
Author

Thanks for your comments. I wasn't aware of the Coding Guidelines.

Indeed --strip-trailing-cr seems to be supported by less implementations than the unified format (-u), e. g. OpenBSD seems to ship with a version of diff that supports -u, but not the stripping of CR.

To give a background of this issue:
When the files in the working directory have CRLF line endings the shipped diff utility will output a mixed file: The lines generated by diff end with LF while the lines from the compared files carry over the CRLF from the input.

I'm not fond of any kind of sed scripts to get rid of the trailing CRs. Also dos2unix won't be available on all installations.

I wonder if it makes sense to ditch the diff utility (that you started to ship as solution to issue #163) and use something like git diff --no-index -- "$1" "$2" | git apply --no-add instead. In the remaining lines of the function I'd have to use $2 instead of $1 to get the common lines. If this solves the issue and if it works I have to try. I can't do this before the weekend, though.

@dscho
Copy link
Member

dscho commented Jun 10, 2015

I'm not fond of any kind of sed scripts to get rid of the trailing CRs. Also dos2unix won't be available on all installations.

My preference would have been | tr -d '\r'.

I wonder if it makes sense to ditch the diff utility (that you started to ship as solution to issue #163) and use something like git diff --no-index -- "$1" "$2" | git apply --no-add instead. In the remaining lines of the function I'd have to use $2 instead of $1 to get the common lines.

I actually had looked into that already myself because I wanted to get rid of the dependency on the diff(1) utility (because of the incompatibilities you pointed out).

The only problem is that Git's diff command does not understand the -L<name> arguments, required for git apply to patch the correct files.

Let's see, what would it take to implement that -L option... goes-looking-at-the-code 👀

Okay, I think I have a good plan to implement the -L option now. First of all, let's note that it only makes sense in the --no-index case, i.e. when Git's diff compares files to files without looking at Git's database at all. This means that the documentation patch for Documentation/diff-options.txt should mention that this only works in conjunction with --no-index.

As for the implementation: the parsing needs to be done in

https://github.com/git/git/blob/v2.4.3/diff-no-index.c#L254

I guess. As the --no-index option is not mentioned in the usage anyway, I do not think that we need to do anything about listing the -L<name> option, either.

If I were to implement the -L feature, I would initialize

const char *labels[2] = { NULL, NULL };
int label_counter = 0;

and then parse (using const char *arg; ... else if (skip_prefix(argv[i], "-L", &arg))) into that array, taking care to error out if more than 2 labels were provided.

Then the queue_diff() and noindex_filespec() functions' signatures need to be extended with two and one label parameter, respectively and noindex_filespec() needs to override the file name with the label if it is not NULL.

@calle2010 How about it, want to give it a shot?

@calle2010
Copy link
Author

Hmm, my last encounter with C was 17 years back. At least I can try first if git diff really handles this case better. Otherwise it would not make sense to even start this enhancement.

@calle2010
Copy link
Author

I verified that git diff has only LF in the output even if the input files have CRLF line endings. So the change would make sense.

@calle2010
Copy link
Author

I tried it a described, but it is not so easy. If noindex_filespec() just overrides the name with the label the diff algorithm still tries to read the file from the label path in diff_fill_sha1_info().

So I think there are potentiality two options now:

  1. enhance struct diff_filespec with a label member and change the right place in diff.c to print the label instead of the path
  2. do something similar to the stdin handling in diff-no-index and read the files upfront.

ad 2. I tried this at end of noindex_filespec

if (label) {
    diff_populate_filespec(s, 0);
    strcpy(s->path, label);
    s->sha_valid1 = 1;
}

and it nearly works. But if the two labels are the same then no diff is printed due to the checks in diff_unmodified_pair().

I feel like option 1. is too much for this. Is it really required that create_virtual_base leaves the second file untouched? If not a much easier solution could be implemented there without enhancing git diff at all.

@calle2010
Copy link
Author

So I found a solution to the last issue, but there is still some to do:

  • support --label for the sake of compatibility?
  • documentation
  • tests
  • run test suite

@calle2010
Copy link
Author

I could't spend much time on this unfortunately. I'd like to know if I'm on the right path.

My biggest concern with this change is that the first line of the generated patch contains the label instead of the actual file name. This is different to the behavior in the normal diff command. In order to change this I'd have to enhance struct diff_filespec so that it can hold label and path.

Also tests for directory mode have to be added.

@dscho
Copy link
Member

dscho commented Jul 14, 2015

The progress looks good! I have a hunch that it might be a wise course to submit this to the Git mailing list already; there are most likely people with a good idea how to hand down the label to the diff machinery such that it is not automatically prefixed by "a/" or "b/".

@dscho
Copy link
Member

dscho commented Jul 14, 2015

(Unless you are uncomfortable with the mailing list-based work flow, of course...)

@dscho
Copy link
Member

dscho commented Aug 17, 2015

@calle2010 did you get any chance to work on this a bit more? As I am preparing the release, I stumbled over this ticket; It is probably the best to not rush it out the door and take care of it properly after the release.

@calle2010
Copy link
Author

No, I haven't had time to continue with this so far. It will probably not be different for the coming three weeks.

@dscho dscho modified the milestone: Post-release cleanup Aug 24, 2015
Since commit c536c07 git apply (called in create_virtual_base)
doesn't accept path names with leading dots anymore and fails with
"invalid path" instead.

Signed-off-by: Christian Schaefer <chr.j.schaefer@web.de>
- parse -L label option for git diff --no-index
- make use of git diff for create_virtual_base
- new test t4059-diff-label.sh
- open issues:
  - test fails since implementation in git is not correct:
    label and filenames on first line can't be different
  - test dir mode
  - first line of created patch contains the label instead of the actual
    file name
@dscho
Copy link
Member

dscho commented Mar 20, 2018

@calle2010 how about now?

@dscho
Copy link
Member

dscho commented Mar 23, 2018

/remind me to close this (unless a princess kisses it awake in the meantime) in two weeks

@dscho dscho added the reminder label Mar 23, 2018
@dscho dscho closed this Nov 15, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants