Skip to content

TidyDoctypeMode option documentation missing #472

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
ralfjunker opened this issue Dec 15, 2016 · 7 comments
Closed

TidyDoctypeMode option documentation missing #472

ralfjunker opened this issue Dec 15, 2016 · 7 comments

Comments

@ralfjunker
Copy link

The TidyDoctypeMode option is not documented. Documentation is missing both online at http://api.html-tidy.org/tidy/quickref_5.2.0.html#doctype as well as built into libtidy.

There is just a short comment in tidyenum.h which asks to "See doctype property". Unfortunately, TidyDoctype does not mention TidyDoctypeMode.

In particular,

  • why are there two similarly named properties and
  • what is the difference between them?

Could you please add the missing TidyDoctypeMode documentation?

@ralfjunker ralfjunker changed the title TidyDoctypeMode option documentation missing TidyDoctypeMode option documentation missing Dec 15, 2016
@geoffmcl
Copy link
Contributor

@ralfjunker wow, you raise an interesting question or 2, which leads to a potential bug... thanks...

As far as I can see, there is no actual option, TidyDoctypeMode, aka --doctype-mode ????, and will lead to a crash if used, so there should be no documentation of it! It is only used internally, in the library...

And there should be a fix to avoid that crash... In the option_defs[] config table, it will be noted that it is the option with a NULL parser, except of course for the first and last entries...

Perhaps the tidyenum.h comment should be something like - "Not a valid otpion! Only used by 'doctype' property" - or something...

To try to explain more. This particular --doctype enum option has two forms -

  1. as an enumerated int value, and
  2. in one case a user string

And as @jidanni points out in #435, all this is not well explained in - http://api.html-tidy.org/tidy/quickref_5.2.0.html#doctype! Nor, as also pointed out, completely implemented, in that a user string type seems ignored...

I think because of this dual option type the actual enumerated int is kept in the TidyDoctypeMode line in the config table, but if that is set to TidyDoctypeUser, then the user's string will be kept in the TidyDoctype line... certainly a bit confusing, for all...

So first, the simple bug I see. If a person enters an option --doctype-mode <value>, the first part of the config parser will find it, and see it as a valid option, but will blow up when the NULL option->parser is called. Note the following quick patch would solve that -

diff --git a/src/config.c b/src/config.c
index ddb677c..343b337 100644
--- a/src/config.c
+++ b/src/config.c
@@ -924,8 +924,12 @@ Bool TY_(ParseConfigValue)( TidyDocImpl* doc, TidyOptionId optId, ctmbstr optval
     const TidyOptionImpl* option = option_defs + optId;
     Bool status = ( optId < N_TIDY_OPTIONS && optval != NULL );
 
-    if ( !status )
-        TY_(ReportBadArgument)( doc, option->name );
+    if (!status || !option->parser) {
+        /* Issue #472 - report a 'bad' option name,
+           *or* if the option has a NULL parser.
+           TODO: Add error message for a NULL parser. */
+        TY_(ReportBadArgument)(doc, option->name);
+    }
     else
     {
         TidyBuffer inbuf;            /* Set up input source */

And note, the actual error message needs to be added for this NULL parser case. One could argue that they should not be using an option not documented - just what they found in the source. But on the other hand tidy should not crash in any case...

And further note, this area of the config code is also related to #468, where I am waiting for some user's with non-ascii file name test the fix in branch issue-468, to fold this in...

Need to explore more on this, and would appreciate any assistance, including the documentation issue, and implementation, mentioned in #435... thanks...

@ralfjunker
Copy link
Author

Interesting findings. However, ParseConfigValue should return success / failure. How about the following fix:

Bool TY_(ParseConfigValue)( TidyDocImpl* doc, TidyOptionId optId, ctmbstr optval )
{
    const TidyOptionImpl* option;
    Bool status = (
      optId < N_TIDY_OPTIONS &&
      (option = option_defs + optId)->parser &&
      optval != NULL );

    if ( !status )
        TY_(ReportBadArgument)( doc, option->name );
    else
    {
        TidyBuffer inbuf;            /* Set up input source */

Note that the TidyUnknownOption ("unknown!') also triggers the problem - and the above fixes also solve it.

@geoffmcl
Copy link
Contributor

@ralfjunker, yes I think your patch reads better in that is also preserves the return failure...

Also see #473 for mention of another option, gnu-emacs-file as another 'internal' only option, that, if possible, should not appear in user docs... or at least only in developer docs...

Still to get to testing this... thanks...

@balthisar
Copy link
Member

Please test #509 so we can close.

@ralfjunker
Copy link
Author

#509 works fine, but buffer declaration seems more complex than needed and unlike other tmbsnprintf() buffer uses in this library. I propose this simplification to config.c:

 Bool TY_(ParseConfigValue)( TidyDocImpl* doc, TidyOptionId optId, ctmbstr optval )
 {
     const TidyOptionImpl* option = NULL;
-    enum { sizeBuf = 11 };
-    char buffer[sizeBuf];
     /* #472: fail status if there is a NULL parser. @ralfjunker */
     Bool status = ( optId < N_TIDY_OPTIONS
                    && (option = option_defs + optId)->parser
                    && optval != NULL );
 
     if ( !status )
         if ( option )
             TY_(ReportBadArgument)(doc, option->name);
         else
         {
             /* If optId < N_TIDY_OPTIONS then option remains unassigned,
                and we have to fall back to an ugly error message. */
-            TY_(tmbsnprintf(buffer, sizeBuf, "%u", optId));
-            TY_(ReportUnknownOption(doc, buffer));
+            char buf[11];
+            TY_(tmbsnprintf(buf, sizeof(buf), "%u", optId));
+            TY_(ReportUnknownOption(doc, buf));
         }

@balthisar
Copy link
Member

balthisar commented Mar 9, 2017

@ralfjunker, that's actually how I'd prefer to do it, but I see that Tidy's existing code uses the enum pretty pervasively elsewhere. And we still support K&R as far as I can tell (or at the latest, C90), and I always have a hard time remembering where I can declare new variables without breaking other people's compilers.

Refresh my memory, and I'll clean it up ;-)

@balthisar
Copy link
Member

Reminder to self: C89, so no issues with context. Updated per @ralfjunker's suggestion, merged, and closed.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants