Skip to content

Fuzz testing tidy-html5 #591

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
mthorpe7 opened this issue Aug 29, 2017 · 6 comments
Closed

Fuzz testing tidy-html5 #591

mthorpe7 opened this issue Aug 29, 2017 · 6 comments
Labels
Milestone

Comments

@mthorpe7
Copy link
Contributor

I've been fuzz testing tidy with AFL recently, and come across a plethora of crashes/hangs (mostly revolving around invalid UTF8). I intend to fix these crashes, but I suspect each fix will be a different code change.

How would you like these fixes to be upstreamed? A single PR with a single commit, a single PR with many commits or many PRs with single commits?

@mthorpe7
Copy link
Contributor Author

sample crash:

$ cat out/fuzzer03/crashes/id:000052,sig:11,src:000642,op:havoc,rep:2
<html xmlns:o="urn:schemas-minrosoft-com:office:office"X<?ody><table><img class="" /></table></body></html>
$ ./tidy-html5/tidy --force-output yes --show-warnings yes --tidy-mark no --word-2000 yes out/fuzzer03/crashes/id:000052,sig:11,src:000642,op:havoc,rep:2
line 1 column 1 - Warning: missing <!DOCTYPE> declaration
line 2 column 1 - Warning: inserting missing 'title' element
line 1 column 1 - Warning: <html> proprietary attribute "xmlns:o"
line 1 column 1 - Warning: <html> proprietary attribute "x"
[1]    41695 segmentation fault  ./tidy-html5/tidy --force-output yes --show-warnings yes --tidy-mark no  yes

@balthisar
Copy link
Member

Thanks for taking this on, @mthorpe7!

How would you like these fixes to be upstreamed? A single PR with a single commit, a single PR with many commits or many PRs with single commits?

It would be best to tie commits to specific, fixed test cases. If you have one large test case, then a single commit might be okay, but it depends on how "invasive" the fixes become. A single PR is okay, and commits that each resolve a specific issue are appreciated. If you prefer separate PR's, then that's fine, too.

@geoffmcl generally likes smaller PR's, and although he's on holiday now, he's generally much more consistent in attending PR's than I am (I often disappear for months at a time due to other commitments).

@geoffmcl
Copy link
Contributor

geoffmcl commented Sep 1, 2017

@mthorpe7 thanks for reporting, and as @balthisar says, PR's are very welcome...

Certainly, just trying your simple sample with the --word-2000 yes option does indeed segfault...

<html xmlns:o="urn:schemas-minrosoft-com:office:office"X<?ody><table><img class="" /></table></body></html>

That particular problem can be avoided with the small patch -

diff --git a/src/clean.c b/src/clean.c
index 707e4d9..fc5ca31 100644
--- a/src/clean.c
+++ b/src/clean.c
@@ -1902,7 +1902,8 @@ void TY_(CleanWord2000)( TidyDocImpl* doc, Node *node)
             attval = node->attributes;
             while ( attval ) {
                 next_attr = attval->next;
-                if ( strcmp(attval->attribute, "xmlns") != 0 )
+                /* Issue #591 - take care of a NULL attribute */
+                if ( !attval->attribute || ( strcmp(attval->attribute, "xmlns") != 0 ))
                     TY_(ReportAttrError)(doc, node, attval, PROPRIETARY_ATTRIBUTE);
                 attval = next_attr;
             }

And also please checkout the patch proposed in #588. This was certainly one case where invalid utf-8 could cause a problem, hopefully fixed by that patch...

As @balthisar mentioned, I am still on vacation until mid Sept, so do not have time now, but look forward to your efforts to make tidy crash/hang free... thanks...

@geoffmcl geoffmcl added the Bug label Sep 1, 2017
@geoffmcl geoffmcl added this to the 5.5 milestone Sep 1, 2017
@geoffmcl
Copy link
Contributor

Has anyone had a chance to test my suggested patch for at least the one word-2000 issue? thanks...

And @mthorpe7 you suggest a a plethora of crashes/hangs... can you enumerate, give specific examples... thanks

balthisar added a commit that referenced this issue Sep 29, 2017
  - Apply @geoffmcl's patches and tested.
@balthisar
Copy link
Member

Added #622 with @geoffmcl's patch, which works nicely.

@balthisar
Copy link
Member

Merged. @mthorpe7, since no support is required for your original request, I will close this. Certainly you are free to submit any type of PR's, and those will open new message threads as they happen. Thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

3 participants