Possible diff algorithm improvement #46

Closed
maliayas opened this issue Apr 10, 2016 · 3 comments

@maliayas

I'm using the "Override Demo 5" in the demo and I get this result:

[Screenshot, 2016-04-10 23:34:34]

Another diff app I tried gives this result for the same HTML:

[Screenshot, 2016-04-10 23:34:46]

Note how it handles the first paragraph. I don't know how complex this would be to implement, but it's a better algorithm for this example HTML. Thanks in advance.

@jschroed91
Member

@maliayas Thanks for opening this issue!

This is actually an undesired side-effect of one of the newer features we implemented - isolated tag diffing. I had not actually thought about this until now, so I'm very glad you opened this issue.

Basically, the isolated tag diffing is comparing the contents of the italic tag ("emergency escape and rescue openings") separately from the rest of the content. This is to fix a lot of the issues we had with the diff output not protecting the HTML structure.
The issue here is that in order to diff them in isolation, we actually replace the entire tag with a placeholder "word" before we diff the content. The diffing algorithm is not aware of the length of the string that the placeholder represents, and therefore sees it as 1 word. In this case it finds a longer match in "shall be" than in the placeholder.
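
To make the placeholder effect concrete, here is a minimal Python sketch. It is not the library's code: the sample sentences, the `[[ITALIC]]` token, and the helper names are all made up for illustration, and Python's `difflib` stands in for the real word matcher. It only shows why a one-word placeholder can lose out to a short run of ordinary words.

```python
import re
from difflib import SequenceMatcher

PLACEHOLDER = "[[ITALIC]]"  # hypothetical token, not the library's actual placeholder


def tokenize(html):
    """Swap each <i>...</i> element for one placeholder word, then split on whitespace.

    Returns the word list and the inner text of every replaced element.
    """
    inner = re.findall(r"<i>(.*?)</i>", html)
    words = re.sub(r"<i>.*?</i>", PLACEHOLDER, html).split()
    return words, inner


def longest_match(a, b):
    """Longest run of identical words shared by a and b, counted in words."""
    m = SequenceMatcher(None, a, b, autojunk=False).find_longest_match(0, len(a), 0, len(b))
    return a[m.a:m.a + m.size]


def weighted_size(tokens, placeholder_words):
    """Size of a match if the placeholder counted as the words it stands for."""
    return sum(placeholder_words if t == PLACEHOLDER else 1 for t in tokens)


# Made-up sample text in which the italic phrase moves across "shall be provided".
old = "Basements shall be provided with <i>emergency escape and rescue openings</i>"
new = "<i>emergency escape and rescue openings</i> shall be provided in basements"

old_words, old_inner = tokenize(old)
new_words, _ = tokenize(new)

# The matcher anchors on the three ordinary words, not on the placeholder.
print(longest_match(old_words, new_words))  # ['shall', 'be', 'provided']

# A length-aware comparison would rank the placeholder higher.
hidden = len(old_inner[0].split())
print(weighted_size([PLACEHOLDER], hidden))                     # 5
print(weighted_size(longest_match(old_words, new_words), hidden))  # 3
```

Once the matcher anchors on "shall be provided", the placeholder sits on opposite sides of that anchor in the two versions, so it can no longer be paired up and shows as a delete plus an insert. Weighting the placeholder by the length of what it represents is one possible way to change which match wins. Whether that fits the real algorithm is an open question here.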

So, it will take a little bit of work, but it is certainly possible. This will be one of the higher priorities to tackle.

@maliayas
Author

I see. Great explanation. If fixing this will break other stuff, don't worry about this issue. I understand that perfecting a diff library may be quite complex.

Btw, the demo tool is awesome.

@jschroed91
Member

Looping back around here - our highest priority for this library was the accuracy of the diff, so unfortunately performance took a back seat to it. However, we do like to leave that decision up to the end users when we can - the config option setIsolatedDiffTags is used to define which tags are diffed in isolation, and the current defaults include the i and em tags.

I'll see if I can update the documentation to highlight the reasoning behind choosing this as the default option this weekend.

Closing this issue.
