Infinite loop parsing in tidy #380
Comments
@gaa-cifasis thanks for your continued support... tidy needs this... keep 'em coming ;=)) tidy loves your support! Unfortunately, at this stage we cannot go back and fix the 5.1.25 release! But I can see the problem with release 5.1.25... it does get locked in an infinite loop! Very BAD!! Hopefully this will be addressed in future releases, where there should be a branch for each release... then we can retest, and later push important fixes back to such a release, bumping the release number... read README/VERSION.md for the general idea. But 5.1.25 was the odd man out! No branch was created for it! But testing your fragment with the latest Test case input: input5\in_380.html
Config: Output:
Message output:
Even adding Please re-test with the latest, always Again, thanks for your support... tidy needs this type of testing... At this stage I can only mark this as Technical Support, with an indefinite future... Find a repeatable problem case with the latest, and this will quickly change... thanks...
Hi, you can find a test case that reproduces an infinite loop in the latest revision here.
@gaa-cifasis wow, thanks, I think ;=)) It is certainly a weird document, with SUB and NUL chars, but yes, it seems to repeat forever... Will look at it soonest... unless someone beats me to it with a patch or PR... thanks...
Added more debug code to try to track this bug!
@gaa-cifasis, first I have added a lot of MSVC debug code to But unfortunately this debug output is only available with a Windows MSVC compiler, and does not exist in the Release. I must do something about that one day... make it available with other compilers... it is very helpful in tracking down where, and when...

I have narrowed down where the problem starts... As suspected, it has nothing to do with the odd character values in your file. As previously mentioned, tidy just drops all of these... It is the sequence of events, and I have found a relatively simple html sample, my in_380-3.html, that can trigger it, and am still trying to reduce that sample even more...

I have found a patch, but it chops some code added in 2004, so I am not sure this is the full answer... or what all the consequences of that are...

The problem does not seem to be present in the last CVS release in 2009, nor in tidy-4.9.13, circa 08/02/2015, so something was changed, added, or modified since then, and I am trying to work backward to see what that was... a slow painful process...

This is just an update... moving forward slowly... thanks for this interesting bug ;=))
Added more debug code to try to track this bug!
@gaa-cifasis I have found that this first occurred between 4.9.17 and 4.9.18. Here commit 86f626c reverted the anchor tag to just So now I have identified exactly when a certain set of events can cause this infinite loop, but that does not provide a simple solution. As mentioned, I have found a patch, created an Will now explore that further... moving forward... It would be much appreciated if you get a chance to check out and test that patch.
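The working-backward search described above amounts to a bisection over releases. A minimal sketch, assuming the releases are listed oldest to newest and that every release after the first looping one also loops; `first_bad` and the `loops` predicate are hypothetical names, and the predicate here is a stand-in rather than a real tidy run:

```python
def first_bad(versions, is_bad):
    """Return the earliest version for which is_bad(version) is true.

    Assumes `versions` is ordered oldest to newest and that once a version
    is bad, all later ones are bad too. Returns None if the newest is good.
    """
    lo, hi = 0, len(versions) - 1
    if not is_bad(versions[hi]):
        return None
    while lo < hi:
        mid = (lo + hi) // 2
        if is_bad(versions[mid]):
            hi = mid          # the culprit is at mid or earlier
        else:
            lo = mid + 1      # the culprit is strictly after mid
    return versions[lo]


# Release numbers taken from this thread; the predicate is a placeholder
# standing in for "build this release and see whether it hangs".
releases = ["4.9.13", "4.9.17", "4.9.18", "5.1.25"]
loops = lambda v: v in ("4.9.18", "5.1.25")
print(first_bad(releases, loops))
```

In practice `git bisect` does the same thing at commit granularity, which is how a single commit can be singled out.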
@gaa-cifasis after adding in the patch for #379, I have bumped the version to 5.1.45-Exp2 in the Everyone's help in fully testing this branch would be most appreciated... thanks...
@gaa-cifasis this is now in Feel free to re-open, or file a new issue... thanks...
If you try to parse a small fragment of HTML code, tidy enters an infinite loop. This was tested with release 5.1.25. For instance:
A similar example was discovered in revision 03a643f.
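For anyone re-testing a fragment like this, a hang can be caught automatically rather than by watching the terminal. The following is a minimal sketch and not part of tidy: `run_with_timeout` is a hypothetical helper name, the 5 second limit is arbitrary, and the usage comment assumes a `tidy` executable on the PATH that reads the document from standard input.

```python
import subprocess


def run_with_timeout(cmd, data=b"", timeout=5.0):
    """Run cmd with `data` on stdin.

    Returns the string "hang" if the process is still running after
    `timeout` seconds (it is killed in that case), otherwise the
    process exit code.
    """
    try:
        proc = subprocess.run(
            cmd,
            input=data,
            stdout=subprocess.DEVNULL,   # output is not needed for this check
            stderr=subprocess.DEVNULL,
            timeout=timeout,
        )
    except subprocess.TimeoutExpired:
        return "hang"
    return proc.returncode


# Hypothetical usage, assuming the test file has been saved locally:
#
#     with open("in_380.html", "rb") as fh:
#         print(run_with_timeout(["tidy"], fh.read()))
```

A result of "hang" only says the run exceeded the time limit; a very large but valid document could also trip it, so the limit should be generous relative to the input size.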