Markdown comparison includes headings unnecessarily #452
Comments
Interestingly, if I paste the same content into http://incaseofstairs.com/jsdiff/, I cannot replicate the issue.
@kpdecker does http://incaseofstairs.com/jsdiff/ do something different than what's in my code?
Oh, it is not using sentences. It uses a word diff. I will evaluate that.
Looks like switching to the words algorithm comes with tradeoffs, though.

    it.only('recognizes different sentences', () => {
      const a = 'This is a sentence. This is another sentence.';
      const b = 'This is a sentence. And this is a different sen.';
      const abDiff = diff(a, b);
      expect(abDiff).toStrictEqual([
        {
          count: 17,
          value: 'This is a sentence. This is ',
        },
        {
          added: true,
          count: 9,
          removed: undefined,
          value: 'a different',
        },
        {
          count: 9,
          value: ' sentence.',
        },
      ]);
    });

This produces a fairly unreadable diff.

    [
      { count: 9, value: 'This is a sentence. ' },
      { count: 1, added: undefined, removed: true, value: 'This' },
      { count: 1, added: true, removed: undefined, value: 'And' },
      { count: 1, value: ' ' },
      { count: 2, added: true, removed: undefined, value: 'this ' },
      { count: 2, value: 'is ' },
      { count: 1, added: undefined, removed: true, value: 'another' },
      { count: 1, added: true, removed: undefined, value: 'a' },
      { count: 1, value: ' ' },
      { count: 1, added: undefined, removed: true, value: 'sentence' },
      { count: 3, added: true, removed: undefined, value: 'different sen' },
      { count: 1, value: '.' }
    ]
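The fragmentation above comes from token granularity: the smaller the tokens, the more small matches the diff can find inside a rewritten sentence. The following is a minimal, self-contained sketch of that effect. The tokenizers (`splitWords`, `splitSentences`) and the LCS-based `diffTokens` are illustrative assumptions, not jsdiff's actual implementation, and they omit the `count` field.

```javascript
// Illustrative tokenizers (assumptions, not jsdiff's internals).
// Words: runs of word characters, whitespace, or punctuation.
const splitWords = (s) => s.match(/\w+|\s+|[^\w\s]+/g) || [];
// Sentences: text up to terminal punctuation, plus trailing whitespace.
const splitSentences = (s) => s.match(/[^.!?]+[.!?]*\s*/g) || [];

// Minimal LCS-based diff over two arrays of tokens.
function diffTokens(a, b) {
  // lcs[i][j] = length of the longest common subsequence of a[i..] and b[j..]
  const lcs = Array.from({ length: a.length + 1 }, () =>
    new Array(b.length + 1).fill(0));
  for (let i = a.length - 1; i >= 0; i--) {
    for (let j = b.length - 1; j >= 0; j--) {
      lcs[i][j] = a[i] === b[j]
        ? lcs[i + 1][j + 1] + 1
        : Math.max(lcs[i + 1][j], lcs[i][j + 1]);
    }
  }
  // Walk the table, emitting unchanged, removed, and added tokens.
  const out = [];
  let i = 0;
  let j = 0;
  while (i < a.length && j < b.length) {
    if (a[i] === b[j]) {
      out.push({ value: a[i] });
      i++;
      j++;
    } else if (lcs[i + 1][j] >= lcs[i][j + 1]) {
      out.push({ removed: true, value: a[i++] });
    } else {
      out.push({ added: true, value: b[j++] });
    }
  }
  while (i < a.length) out.push({ removed: true, value: a[i++] });
  while (j < b.length) out.push({ added: true, value: b[j++] });
  return out;
}

const a = 'This is a sentence. This is another sentence.';
const b = 'This is a sentence. And this is a different sen.';

// Sentence tokens: one unchanged sentence, one removed, one added.
console.log(diffTokens(splitSentences(a), splitSentences(b)));
// Word tokens: many small added/removed parts inside the second sentence.
console.log(diffTokens(splitWords(a), splitWords(b)));
```

With sentence tokens the rewritten sentence is reported as a single removal and a single addition, which is coarser but easier to read; with word tokens the same change is split into many small parts.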
Based on my tests, https://github.com/google/diff-match-patch is producing the best results when used with
Hm. That won't work as easily as I thought it would. It self-references the constructor and a bunch of prototype functions. It would be nice if jsdiff supported equivalent logic, as that package does not appear to be actively maintained.
We could make newlines be treated as a sentence break, but that wouldn't actually give you what you want here, because Markdown lets you hard-wrap paragraphs with newlines that have no effect on the rendered result. Nor would treating only double-newlines as sentence breaks work, since a heading can be terminated with a single newline. The only solution is a tokenizer that recognises the start of a heading (or perhaps also a bullet point or numbered list item, if you want to treat list items as always being distinct sentences). I thus don't really see a possible change to jsdiff that would give you what you want without also baking an assumption into
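A minimal sketch of the kind of tokenizer described in this comment, under stated assumptions: only ATX-style `#` headings and simple `-`, `*`, `+`, or numbered list markers are recognised, hard-wrapped paragraph lines are joined into one token, and blank lines are dropped (so the tokens are lossy). `tokenizeMarkdown` is a hypothetical name and is not part of jsdiff.

```javascript
// Hypothetical helper: split Markdown into tokens so that heading lines and
// list items are always separate from the prose around them.
function tokenizeMarkdown(text) {
  const tokens = [];
  let paragraph = [];

  // Emit the hard-wrapped paragraph collected so far as a single token.
  const flush = () => {
    if (paragraph.length > 0) {
      tokens.push(paragraph.join('\n'));
      paragraph = [];
    }
  };

  for (const line of text.split('\n')) {
    const isHeading = /^#{1,6}\s/.test(line);
    const isListItem = /^\s*(?:[-*+]|\d+[.)])\s/.test(line);
    if (isHeading || isListItem) {
      // A heading or list item ends the current paragraph and stands alone.
      flush();
      tokens.push(line);
    } else if (line.trim() === '') {
      // A blank line ends the current paragraph.
      flush();
    } else {
      // A single newline inside a paragraph is only a hard wrap.
      paragraph.push(line);
    }
  }
  flush();
  return tokens;
}

console.log(tokenizeMarkdown(
  '## Title\nfirst line\nsecond line\n\n- item one\n- item two'
));
```

The resulting array could then be compared with an array-based diff such as jsdiff's `diffArrays`. Setext headings, fenced code blocks, and list-item continuation lines are deliberately not handled here.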
Produces:
It is not clear why
## AIMD generated content\n
is marked as removed.