Whitespace removed after <a>...</a> #20
This patch fixes it: https://github.com/thieso2/tidy-html5/commit/0ef710fe90119e204f01adc6a9fc3635b5ab15cb It doesn't seem to break which introduces the regression. If only I knew how to add test cases to this project, I'd write a test and send a pull request!
Patch working fine. Thanks!
I cherry-picked the change and pushed it to w3c/tidy-html5. I would like to add test cases too, but this whole thing is experimental at this point. If somebody feels really strongly that we shouldn't be making code changes without adding test cases, there's nothing stopping anybody from volunteering to go through the changelog and write up tests.
See a3d49a7#commitcomment-1143626. I need to back out the change. The markup fix for this is to put the element inside something other than or or whatever.
Note that you will see this same behavior with It's a consequence of the fact that in HTML5, the We'll have to figure out a better fix for this later, but for the time being, the workaround is to not have the
The root cause of this is in the part of the parser code that parses the contents of the body element: https://github.com/w3c/tidy-html5/blob/master/src/parser.c#L3291
So the specific cause for this is at line 3528 of parser.c: https://github.com/w3c/tidy-html5/blob/master/src/parser.c#L3528
Removing the I don't really understand why that condition is there, but it seems it might relate to the comment just after it:
...which is true: in HTML4 strict, I could try removing the Will need to think more about what to do for this. I guess introducing a new option might be one choice.
I tried a few more cases and I still can't see what the original intent was in having the code ignore whitespace after these CM_MIXED elements. So I'm going ahead and checking in this change. If this breaks anything, the breakage will be restricted to the ins, del, a, script, and noscript elements.
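For readers following along, here is a rough, self-contained sketch of the kind of decision being discussed. It is not the code in parser.c: the flag values, the `Elem` struct, and the `mode_after` and `text_after` helpers are all hypothetical names invented for illustration, and the exact shape of the real condition is an assumption, since it is not quoted in this thread. The sketch only models the idea that an extra "not CM_MIXED" test sends the parser into whitespace-ignoring mode after elements such as a, ins, and del.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical content-model flags, loosely modeled on Tidy's CM_* bits. */
enum { CM_INLINE = 1u << 0, CM_MIXED = 1u << 1 };

/* Simplified parser modes: drop or keep whitespace in the next text node. */
typedef enum { IgnoreWhitespace, MixedContent } Mode;

/* Hypothetical stand-in for an element and its content-model bits. */
typedef struct {
    const char *name;
    unsigned model;
} Elem;

/* Pick the mode used for text that follows element e inside body.
 * with_mixed_check = true models the behavior before the change
 * (assumed: inline elements flagged CM_MIXED fall through to
 * IgnoreWhitespace); false models the behavior after the change. */
static Mode mode_after(const Elem *e, bool with_mixed_check)
{
    bool is_inline = (e->model & CM_INLINE) != 0;
    bool is_mixed = (e->model & CM_MIXED) != 0;

    if (is_inline && !(with_mixed_check && is_mixed))
        return MixedContent;
    return IgnoreWhitespace;
}

/* Return the text as it would be kept: leading whitespace is skipped
 * only when the parser is in IgnoreWhitespace mode. */
static const char *text_after(const char *text, Mode mode)
{
    if (mode == IgnoreWhitespace)
        while (*text == ' ' || *text == '\t' || *text == '\n')
            text++;
    return text;
}
```

In this model, an element flagged `CM_INLINE | CM_MIXED` loses the space before the following text when the extra check is present and keeps it when the check is removed, while a plain `CM_INLINE` element keeps it either way.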
Current master (4ff3234) removes whitespace following <a>...</a>. This HTML
is cleaned up to
Notice that the single space before "two" is missing in the output.
This is a regression from the original Tidy from SourceForge.