-
Notifications
You must be signed in to change notification settings - Fork 429
Regression in 5.1.32 #342
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Comments
@serjepatoff, I'll track it down when I can. Thanks for the report. |
@serjepatoff, yes thanks for the report... took me some time to set up a repeatable test using a 5.1.14 exe... compare the outputs... etc... And can confirm there is a change, even a loss, in output, that happened some time after 5.1.14... and is still there in the current 5.1.33Exp... will try to narrow that down... maybe by building successives versions, commit by commit, to see when this specifically occurred... It would be really good if you could trim down the contents of say This would really assist in debugging... helps narrow down the what exactly is going wrong... About that time there was quite a change, see #65, but may not be related. Just to be sure could you try with Will try to set up the same test, and other tests... and get back soonest... |
@serjepatoff, have narrowed it down to a change after 5.1.24, commit 496c81c, Date: Wed Nov 18 20:02:54 2015 +0100 That is to say, it appears Tidy 5.1.24 outputs the same as 5.1.14... Tidy version 5.1.25, commit 2388fb0, starts to be different... and it looks like there were two, or more, stages in the changes... You can checkout a source for each commit, like this -
This gets the source at that moment in time, 5.1.24 in this case, and you can check the version.txt file, and build a tidy for testing... I am certainly developing a collect of versions - 5.1.14, 5.1.17, 5.1.19, 5.1.22, 5.1.23, 5.1.24, 5.1.25, 5.1.27, 5.1.33Exp ;=)) Maybe something to do with Issue #307, #167, #169 - regression of nestd anchors - your sample 1 document certainly appears to have some anchors not closed... but the hrefs are large and hard to follow... Will look deeper as time permits, but maybe this will assist in tracking this down... and putting together a simple sample for debug... |
@serjepatoff, yes, it had everything to do with the above patches, which began allowing mixed content in an anchor, as per the HTML5 spec... So your first sample does not have a doctype, so tidy would stay in html5 mode... and treat the page as html5... But your config does set the doctype - So it is a simple case, if no doctype, and the TidyDoctypeMode is not 'auto', the default, nor 'html5', then flip tidy back into html4-- mode... In this mode, tidy will again force an anchor closed as soon as it meets another element, not allowed in an anchor... and we are back to 5.1.14 anchor like parsing... The simple patch so far is -
If you are testing with command line tidy, then you need to add say the command I have still to test it on your second sample, but expect it to work... I will get around to adding that patch to the repo, but that may be a few days, unless someone beats me to it with a PR ;=)) It would be appreciated if you could add this patch, compile from source, test and report... thanks... |
Oops, that patch is against the To get it -
|
@serjepatoff eventually got around to pushing this to Would be appreciated if you could pull master 5.1.35 plus, and test... And if this fixes this issue close it... thanks... |
No further comment for over a week, so closing this for now... Please fell free to reopen, or raise another issue... thanks... |
Thank you very much Geoff. And sorry for absence. We had no time to test the fixes because of tight schedule of our product release, so we lived with stable 4.9.10 until now. Now we have a little time and I'm starting to test new Tidy. |
I've just integrated 5.1.36issue-373(9cf97d5), checked up many samples and everything became OK, except for one html where tidying breaks table layout. |
@serjepatoff no problem, and thank you for retesting... I have now updated the Concerning your zip, I assume there was some mistake in the Then to match your config used a config file below. Note commented out the tidy-mark since I wanted to be sure I was using the correct
And yes something is wrong with the second? table width, and the final text is right aligned, so have re-opened this issue... Will see if I can find what is wrong... but it would be appreciated if you could update to this latest and re-check your samples are all ok, except the case you posted... thanks... |
Geoff, there is no mistake in doc1_before.html It is real-world case found "in the wild". As far as I can guess, it is 1x1 pseudo image used to track html email opening. |
@serjepatoff, did you mean to close this? This means you agree Tidy is behaving correctly? |
Hello,
We have found a regression that shows up when processing some htmls. When using version 5.1.32 (commit 9942856) content is corrupted. In ealier version 5.1.14 (commit 7e69ceb) everything is OK.
Options used are as follows:
There are 2 html content samples in attachment. For each sample there are
(1) original html,
(2) html after tidying with 5.1.14,
(3) html after tidying with 5.1.32,
(4) screenshot of content after tidying with 5.1.14 in iOS UIWebView,
(5) screenshot of content after tidying with 5.1.32 in iOS UIWebView.
Here is tree listing of files in attachment, hope that naming would be self-explanatory to you.
Attachment.zip
Thank you very much for your attention.
The text was updated successfully, but these errors were encountered: