Closed
Description
The field `DiffLine.content` contains a unicode line. As git does not know anything about the encoding of the files to be diffed (they are blobs), I expect this object to be of type `str` in py2 and `bytes` in py3.
Even worse, if a file is, e.g., latin-1 encoded and contains latin-1 specific characters, all of these characters are mapped to '\ufffd' (the Unicode replacement character). Thus it is impossible to diff non-ASCII encoded text files correctly.
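The loss can be reproduced without pygit2 at all. The sketch below assumes the unicode conversion behaves like a UTF-8 decode with replacement of invalid bytes; that is my guess based on the observed output, not something I have confirmed in the pygit2 source. The sample text and variable names are made up for illustration.

```python
# Bytes of a latin-1 encoded line, as they would be stored in a git blob.
# (Sample text is hypothetical.)
raw = "café über\n".encode("latin-1")

# Assumption: the unicode conversion acts like a UTF-8 decode that
# replaces undecodable bytes. Each latin-1 specific byte becomes U+FFFD.
lossy = raw.decode("utf-8", errors="replace")

# Decoding with the codec the file was actually written in keeps the text.
exact = raw.decode("latin-1")

print(repr(lossy))
print(repr(exact))

# The original bytes cannot be recovered from the lossy string.
print(lossy.encode("utf-8") == raw)
```

If the field were exposed as raw bytes, the caller could pick the codec (as in the `exact` line) instead of receiving text that has already lost information.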
I suppose this is a pygit2 bug, as the libgit2 C interface works correctly: it exposes this field as `const char *` (see https://github.com/libgit2/libgit2/blob/HEAD/include/git2/diff.h#L555).