Still seeing the warning "<table> lacks "summary" attribute" with HTML5 #377


Closed
a3nm opened this issue Feb 20, 2016 · 31 comments

@a3nm

a3nm commented Feb 20, 2016

This is probably related to #210.

With the latest version from trunk and the following document:

<!DOCTYPE html>
<html dir="ltr" xml:lang="en" lang="en"><head>
  <title>blah</title>
</head>
<body>
  <table>
    <tr><td>foo</td></tr>
  </table>
</body></html>

I get the following warning when running /usr/local/bin/tidy -utf8 -errors -quiet:

line 6 column 3 - Warning: <table> lacks "summary" attribute

This HTML5 document is accepted by validator.w3.org without issues so I don't think this warning should be emitted.

@geoffmcl
Contributor

@a3nm thanks for the report, with sample html - always appreciated...

Yes, in #210 we suppressed that warning for HTML5, but tidy is viewing this document as XHTML5... replace the <html dir="ltr" xml:lang="en" lang="en"> in your sample with just <html> and no warning will be emitted...

Looks like a simple fix... will get to it soonest, unless you, or others beat me to it with a patch, or PR... thanks..

@geoffmcl geoffmcl added the Bug label Feb 21, 2016
@geoffmcl geoffmcl added this to the 5.2 milestone Feb 21, 2016
@a3nm
Author

a3nm commented Feb 21, 2016

Thanks for your answer!

I confirm that with just <html> no warning is emitted. I also checked that lang and xml:lang are valid in HTML 5 https://www.w3.org/TR/html5/dom.html#the-lang-and-xml:lang-attributes and dir also is https://www.w3.org/TR/html5/dom.html#the-dir-attribute. So I think no warning should be emitted.

Thanks if you can fix this! :)

@geoffmcl
Contributor

@a3nm as expected, tidy was only checking for HTML5 (HT50), not also XHTML5 (XH50)...

Have applied a small patch to master, version 5.1.44 onwards, to fix this case.

Could have also tested the version using VERS_HTML5 (HT50|XH50), but HTMLVersion(doc) is expected to only return a single version value, so I think this fix is clearer in its intention.

But in reviewing other code I note several other places where we are only testing for the HTML5 version, but have left these as new issues, as and when they come up. At that time we may need a new service like IsHTML5Version(doc), in addition to HTMLVersion(doc), to generalise this.

Appreciated if you could pull, build and test this, and close this issue if satisfactory... thanks...
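
For readers following along, here is a minimal sketch of the difference between testing for HT50 alone and testing for either HTML5 flavour. It is not the committed patch: the bit values, the `SK_` names and all three functions are made up for illustration, and `is_html5_version` only imitates the spirit of the proposed IsHTML5Version(doc) service.

```c
#include <stdbool.h>

/* Illustrative bit values only; tidy defines its own constants. */
#define SK_HT50 (1u << 0)                 /* HTML5 */
#define SK_XH50 (1u << 1)                 /* XHTML5 */
#define SK_X10S (1u << 2)                 /* XHTML 1.0 Strict */
#define SK_VERS_HTML5 (SK_HT50 | SK_XH50) /* either HTML5 flavour */

/* Pre-fix behaviour as described: only an exact HT50 result
   suppressed the table summary warning. */
static bool warns_before_fix(unsigned version)
{
    return version != SK_HT50;
}

/* Hypothetical helper in the spirit of IsHTML5Version(doc). */
static bool is_html5_version(unsigned version)
{
    return (version & SK_VERS_HTML5) != 0;
}

/* Post-fix behaviour: HT50 and XH50 both suppress the warning. */
static bool warns_after_fix(unsigned version)
{
    return !is_html5_version(version);
}
```

With these definitions, `warns_before_fix(SK_XH50)` is true while `warns_after_fix(SK_XH50)` is false, which matches the XHTML5 case reported in this issue.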

@a3nm
Author

a3nm commented Feb 29, 2016

I tried to pull and re-build, and I'm getting exactly the same error as before with the example document in my initial bug report. Is it working for you?

@geoffmcl
Contributor

geoffmcl commented Mar 1, 2016

@a3nm yes, it is working for me!

Are you sure you pulled and built the master branch? Make sure it contains commit 7bdc31a...

Using an input: input5\in_377.html

<!DOCTYPE html>
<html dir="ltr" xml:lang="en" lang="en">
<head>
  <meta charset="utf-8">
  <title>Issue #377</title>
</head>
<body>
  <table>
    <tr><td>foo</td></tr>
  </table>
</body>
</html>

Running tidy with a configuration -utf8 -i, got the great output:

Info: Document content looks like XHTML5
No warnings or errors were found.

<!DOCTYPE html>
<html dir="ltr" xml:lang="en" lang="en" xmlns=
"http://www.w3.org/1999/xhtml">
<head>
  <meta name="generator" content=
  "HTML Tidy for HTML5 for Windows version 5.1.44" />
  <meta charset="utf-8" />
  <title>Issue #377</title>
</head>
<body>
  <table>
    <tr>
      <td>foo</td>
    </tr>
  </table>
</body>
</html>

... blah blah blah - this blah can be suppressed with '--show-info no' 
    or '-q' to also drop the first lines...

You can see the version I am using, 5.1.44. Note, adding a config of -utf8 does not change anything since that is the default. If I add -errors -quiet I get no output, as expected.

Please carefully check the version of tidy, tidy -v. If I back up even one version, of course the warning is still there -

F:\Projects\tidy-test\test>tidy5-5.1.43 -errors -quiet input5\in_377.html
line 8 column 3 - Warning: <table> lacks "summary" attribute

Can only suggest you are using an earlier version still!

@a3nm
Author

a3nm commented Mar 1, 2016

I confirm that I am running version 5.1.44:

root@7d1010c25c24:~# /usr/local/bin/tidy --version                           
HTML Tidy for Linux version 5.1.44

With the document in your message, I do not get any error, but note that it is not the same as my original document as you added a meta tag. When I remove the meta tag from the document that you posted in your last message, I get the spurious warning.

My understanding is that the warning should not be emitted, even without the meta tag.

@geoffmcl
Contributor

geoffmcl commented Mar 1, 2016

@a3nm thanks for the quick reply. Ok, we are now fine on the version ;=))

Hmmm, I added that meta tag because the W3C validator gave a warning about no charset... and it also pushed tidy to say -

Info: Document content looks like XHTML5

Now without that meta tag tidy will take the document as XHTML 1.0 strict, and will thus emit the table summary warning...

Info: Document content looks like XHTML 1.0 Strict

The only change I made was to not emit it for HTML5 or XHTML5...

So now the question is: what does the W3C say on the summary attribute on a table for XHTML 1.0 strict? Is it a spurious warning?

Need to research that... appreciate any links you find... thanks...

@a3nm
Author

a3nm commented Mar 1, 2016

From the XHTML 1.0 Strict DTD https://www.w3.org/TR/xhtml1/dtds.html#a_dtd_XHTML-1.0-Strict which appears to be what defines conformity to XHTML https://www.w3.org/TR/xhtml1/#docconf I observe that the "summary" attribute of the "table" element is not mandatory (it is IMPLIED, not REQUIRED), so I don't think that a warning should be emitted for XHTML 1.0 Strict either.

As for the question of the meta charset tag, I do not see anything suggesting that the tag is required with HTML5 (see e.g. http://stackoverflow.com/a/14669441/) so I do not see why omitting the tag should cause anything specific to happen (in particular, I don't see why it should mean that the document should be understood as XHTML 1.0 Strict).

@geoffmcl
Contributor

geoffmcl commented Mar 1, 2016

@a3nm thanks for looking that up... and yes, I understand the "summary" attribute on the "table" element is not mandatory in XHTML 1.0 strict... but it is allowed, and that does mean tidy will emit the warning...

This warning was added by the original founders of tidy. They indicated it was there as an aid to the visually impaired, to describe the table, like an accessibility warning. Over the early years I too argued against it, but did not get anywhere. And now that I am in that position of choice, I am not yet convinced to suppress it.

Concerning the meta charset tag, it was introduced in this form only in html5, previously as a meta content attribute. During the parsing tidy thus sees that new form constraining what the document type may be to HTML5 type, hence chooses HTML5 or XHTML5, and acts accordingly.

You will note tidy does not warn about its omission, but perhaps should, like the W3C validator does. Following your links to the-meta-element I can read -

Exactly one of the name, http-equiv, and charset, attributes must be specified.
and
There must not be more than one meta element with a charset attribute per document.

But that is a different question, as a separate issue.

So concerning the summary warning, you obviously vote for suppression. What do others think? Remove it? In what document types? Maybe add yet another option like --show-summary-warning no? Or leave as is?

I will try to listen to reason, but as stated am not yet convinced to suppress it for any document less than HTML5 and now XHTML5.

@a3nm
Author

a3nm commented Mar 1, 2016

If the presence of an HTML5-specific element is good reason for tidy to consider the document as HTML5 (and emit HTML5-specific warnings), I wouldn't say that the absence of such elements should be legitimate cause for not considering as HTML5. I don't think tidy should emit warnings that would not be reasonable in HTML5 if the document can be seen as a valid HTML5 document. So I don't think the warning should be emitted on my original document, which reads fine as valid HTML5.

As for the omission of the meta element, as the Content-Type HTTP header (with its charset parameter) is a legitimate mechanism for declaring the encoding, I think it is totally legitimate to omit it and tidy is right not to warn about it.

That said, as a general comment, it would be nice to have a mechanism to toggle individual warnings from the command-line. If there were a clean way to ignore warnings that I dislike or don't care about, I would care less about tidy's default behavior.

@geoffmcl
Contributor

geoffmcl commented Mar 2, 2016

@a3nm yes, while I agree the absence of an HTML5-specific element is not a good reason for not considering the document as HTML5, the present code decision logic allows tidy to choose XHTML 1.0 Strict in this case. So some review of the current code logic is perhaps warranted, but I personally am not yet convinced of this!

But please feel free to carefully examine this code logic, and present a patch, or a PR, showing a better way, without causing regressions when tidy is parsing obviously legacy documents... we do always want improvement in tidy code! ;=))

And at this time also agree, the omission of a content encoding header should not be a tidy warning, but just wish the W3C validator would agree also ;=)) but this is a separate subject!

And it is also agreed, as a general comment, that having a myriad of command line options to toggle warnings would perhaps be a good thing!?

I am immediately reminded of the gcc options, where for just about every warning, there is a no- option, like -Wno-unused-variable, etc, etc, etc. Again, feel free to implement this in a fork, and present a PR. It will certainly be considered.

BTW I will shortly be pushing a new option to tidy, --escape-scripts no, see #348, and #65 before it, where I will be including a new README/OPTIONS.md detailing the simple 4 step process of adding a new option to tidy. Maybe this would help...

Unfortunately, someone has to care about tidy's behaviour ;=))

@balthisar
Member

I wonder if suppressing faults via a myriad of configuration options would be the best approach. This adds additional complexity to Tidy, of course, but it also seems that Tidy already provides a facility to filter unwanted messages: TidyReportFilter.

There's probably a way to take advantage of this in the console application without having to build this into the library per se. (Actually it's not in the latest API documentation, but currently in master there's a better TidyReportFilter3.)

Even in console this opens up a bigger question. Do we allow any error to be suppressed? Is this dangerous? Do we have a new command line switch for every possible error? Do we force it into another config file? Add it to the current config file?

Right now there are 231 possible warnings/errors/info that can be output.
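
To make the filtering idea concrete, here is a minimal sketch of the kind of predicate a report filter could apply to each message. It is deliberately standalone and does not link against libtidy; `keep_message` and the `muted` list are hypothetical names, and the exact callback signature for TidyReportFilter should be taken from tidy.h rather than from this sketch.

```c
#include <stdbool.h>
#include <string.h>

/* Returns true if a diagnostic should be shown, false to suppress it.
   A real report filter would receive more arguments (document, level,
   line, column); only the message text matters for this sketch. */
static bool keep_message(const char *msg)
{
    /* Hypothetical list of message fragments the user wants muted. */
    static const char *const muted[] = {
        "lacks \"summary\" attribute"
    };

    for (size_t i = 0; i < sizeof muted / sizeof muted[0]; ++i) {
        if (strstr(msg, muted[i]) != NULL) {
            return false;
        }
    }
    return true;
}
```

Matching on message text is fragile, since it breaks if the wording changes or is localised. A stable per-message identifier would be a better key if the library exposes one.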

@geoffmcl
Contributor

geoffmcl commented Mar 2, 2016

@balthisar thank you for your comment... and separating it into two clear issues...

  1. What libtidy reports, and
  2. What our console app, tidy, reports

On the first, libtidy should continue to report ALL warnings it sees during the parsing of the document. This should not change! It is correct that libtidy, if it sees the document as a VERS_HTML5 (HT50|XH50) doc, should not emit this table summary warning, and that has been fixed. And vice versa...

Yes, other apps using libtidy have the choice to add various filters for this output... their choice...

On the other hand, our sort of sample app, console tidy, should not have such filters. It should report what libtidy offers. Adding some 231 additional options to it would seem very wrong... even as a separate filter config...

Now libtidy, from 5.1.44 onward, has been fixed to not report a warning if it finds the document as a VERS_HTML5 type, and to continue to report otherwise...

So to me the original request has been met, and I am re-setting this as a Feature Request (FR), with an indefinite future (IF), until further comments convince me it is otherwise...

Be aware, such FR-IF will be closed in the absence of further comments after a reasonable time... So please speak up...

As always, thanks for your comments...

@geoffmcl geoffmcl modified the milestones: Indefinite future, 5.2 Mar 2, 2016
@balthisar
Member

I'm in alignment with @geoffmcl's comments. If no further discussion on the matter then we'll close this issue in the near future.

@a3nm
Author

a3nm commented Mar 3, 2016

The discussion has drifted, but I think the original bug that I was reporting has not yet been satisfactorily fixed: the document in my original report is, as far as I can tell, perfectly valid HTML5, and it still triggers a spurious warning when fed to tidy, even with the latest fix.

I understand that the reason is that the document is mistakenly interpreted as XHTML 1.0 where it might make sense to advise in favor of that attribute. Nevertheless I think it is a bug that tidy emits a warning on a document if the document is valid when interpreted as HTML5.

(As for the more general question of disabling warnings via command-line flags, as a user I think it would be useful, and not unreasonable, following gcc's example. Using flags from the shell is more convenient and modular than having to write code. But I agree that this is a more general question, certainly a feature request rather than a bug, and not related to my original issue.)

@balthisar
Member

> I understand that the reason is that the document is mistakenly interpreted as XHTML 1.0 where it might make sense to advise in favor of that attribute. Nevertheless I think it is a bug that tidy emits a warning on a document if the document is valid when interpreted as HTML5.

Hmm... maybe this is a valid point if we use --doctype=html5 instead of the default auto, i.e., Tidy probably should respect a specifically declared doctype. In your original bug you're not specifying a doctype. Would your application allow you to specify the doctype? In any case, right now, specifying the doctype won't achieve the behavior you want, but it's still something to consider.

With auto I think that the explanation @geoffmcl provided is still the correct behavior for determining the doctype.

Pinging @geoffmcl: what's your opinion on this? Respect the chosen doctype or not? There's likely to be other output affected...

In any case I probably won't be able to have a close look at it until this weekend. And @a3nm, you're welcome to submit a patch!

@a3nm
Author

a3nm commented Mar 3, 2016

I agree that at least the behavior should be fixed for --doctype=html5 (which, for my use case, I could specify).

For automatic detection, I'm not sure I follow the explanation of @geoffmcl. Again, I find it weird that tidy would choose a doctype, think that it found a warning, and blame the user, whereas it should have given the user the benefit of doubt: tidy should not complain about an input document if there is some doctype for which the document is valid, in my opinion.

I won't have time to submit a patch myself, I just want to point out the existence of the bug.

@geoffmcl
Contributor

geoffmcl commented Mar 3, 2016

@balthisar yes, I had thought that --doctype html5 could influence this case, but found at this time it does not change anything. But agree maybe it should/could be taken into account...

@a3nm it is not so much automatic detection, but a process of elimination as the document is parsed. We have about 20+ doctype bits defined, from HT20 to XH50, and many combinations, and each document starts with all bits enabled, in doc->lexer->versions. And then for each element, attribute we have a big tag-defs table of what doctypes that item is allowed in, supports.

So as certain elements are found in the parsing, some bits are eliminated. The service is called void TY_(ConstrainVersion)( TidyDocImpl* doc, uint vers );. It is called many times during the parsing phase...

Not yet checked exactly in code, but assume for example your <html dir="ltr" xml:lang="en" lang="en"> kicks off all the non-xhtml bits. And likewise if a purely html5 element is found, like <meta charset="utf-8">, all the non-html5 bits are eliminated, hence at the end only XH50 bit remains, so tidy guesses XHTML5.

But without that meta tag, or some other HTML5 only element, I guess XHTML 1.0 strict is also still in the bits remaining, and tidy chooses this, over XHTML5, which I think would also be there. While this choice could be altered I can hear other users suggesting I prefer strict, if it is available... It is not a question of blaming or benefit of doubt but choosing exactly what to report.

But there is light at the end of the tunnel ;=))

  1. I too think the --doctype [html5|omit|auto|strict|transitional|user] should play a role.
  2. For a while now I also think the doctype on the document should also be taken into account.

By that I mean, what I shall call short form <!DOCTYPE html>, and that would include the absence of it, when tidy assumes html5 mode, maybe should also constrain the version bits to VERS_HTML5? But this is not so sure, since it appears it can also be on XHTML 1.1 strict, which is why I have left it as is...

But one or both these could push tidy to choose only XHTML5.

I will find time to look more closely at this, but as always patches, or a PR would be appreciated... thanks...
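
The process of elimination described above can be sketched as repeated bitwise AND operations on a version mask. This is only an illustration under stated assumptions: the six `SK_` bits, their values and both functions are invented here, and the real TY_(ConstrainVersion) works on doc->lexer->versions with tidy's own constants and tag tables.

```c
#include <stdbool.h>

/* Illustrative subset of version bits; values are arbitrary. */
#define SK_H41S (1u << 0)  /* HTML 4.01 Strict */
#define SK_H41F (1u << 1)  /* HTML 4.01 Frameset */
#define SK_X10S (1u << 2)  /* XHTML 1.0 Strict */
#define SK_X10F (1u << 3)  /* XHTML 1.0 Frameset */
#define SK_HT50 (1u << 4)  /* HTML5 */
#define SK_XH50 (1u << 5)  /* XHTML5 */
#define SK_ALL  (SK_H41S | SK_H41F | SK_X10S | SK_X10F | SK_HT50 | SK_XH50)

/* Keep only the versions in which the item just parsed is allowed. */
static void constrain(unsigned *versions, unsigned allowed)
{
    *versions &= allowed;
}

/* Simulates parsing the sample document, with or without the
   HTML5-only <meta charset> element. Returns the surviving bits. */
static unsigned sketch_parse(bool has_meta_charset)
{
    unsigned versions = SK_ALL;  /* every document starts with all bits */

    /* A <body> rules out the frameset document types. */
    constrain(&versions, SK_ALL & ~(SK_H41F | SK_X10F));

    /* <meta charset="..."> is only allowed in HTML5 and XHTML5. */
    if (has_meta_charset) {
        constrain(&versions, SK_HT50 | SK_XH50);
    }

    /* The document is treated as XHTML, so only XHTML types remain. */
    constrain(&versions, SK_X10S | SK_XH50);

    return versions;
}
```

In this sketch `sketch_parse(true)` leaves only `SK_XH50`, while `sketch_parse(false)` leaves `SK_X10S | SK_XH50`, so a later step still has to choose between the two. Because each step is a plain AND, the order in which the constraints are applied does not change the result.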

@geoffmcl geoffmcl modified the milestones: 5.2, Indefinite future Mar 3, 2016
@a3nm
Author

a3nm commented Mar 3, 2016

Thanks a lot for your clarifications.

> assume for example your <html dir="ltr" xml:lang="en" lang="en"> kicks off all the non-xhtml bits

I think this would be a problem. As discussed earlier, this form of the HTML tag is perfectly valid in an HTML 5 document.

> But without that meta tag, or some other HTML5 only element, I guess XHTML 1.0 strict is also still in the bits remaining, and tidy chooses this, over XHTML5, which I think would also be there. While this choice could be altered I can hear others users suggesting I prefer strict, if it is available...

OK, I understand. Intuitively I would have expected that tidy by default would choose whatever doctype (if any) causes it to report no warning or errors. I would say that people who want to validate against a specific doctype to get more warnings could specify it with --doctype.

That said, this problem of which doctype to prefer is fuzzier. In any case, the warning in question should not be triggered with --doctype html5. I just checked and it is indeed triggered even with that option.

Thanks again for clarifying.

@geoffmcl
Contributor

geoffmcl commented Mar 3, 2016

@a3nm as mentioned, that comment about kicking off non-xhtml bits was made without actually doing any debug, and is wrong, but the general idea of constraining remains valid...

I added some more debug output, and first observing the diminishing bits, with the meta charset tag - input5\in_377.html

Exit ParseHead 1...
R=7 C=1: Returning starttag node <body> stream
Before: HT20|HT32|H40S|H40T|H40F|H41S|H41T|H41F|X10S|X10T|X10F|XH11|XB10|----|HT50|XH50
After : HT20|HT32|H40S|H40T|----|H41S|H41T|----|X10S|X10T|----|XH11|XB10|----|HT50|XH50
Enter ParseBody...
Before: HT20|HT32|H40S|H40T|----|H41S|H41T|----|X10S|X10T|----|XH11|XB10|----|HT50|XH50
After : ----|HT32|H40S|H40T|----|H41S|H41T|----|X10S|X10T|----|XH11|XB10|----|HT50|XH50
Exit ParseHTML 2...
Before: ----|HT32|H40S|H40T|----|H41S|H41T|----|X10S|X10T|----|XH11|XB10|----|HT50|XH50
After : ----|----|H40S|H40T|----|H41S|H41T|----|X10S|X10T|----|XH11|----|----|HT50|XH50
Before: ----|----|H40S|H40T|----|H41S|H41T|----|X10S|X10T|----|XH11|----|----|HT50|XH50
After : ----|----|----|----|----|----|----|----|X10S|X10T|----|XH11|----|----|----|XH50
Before: ----|----|----|----|----|----|----|----|X10S|X10T|----|XH11|----|----|----|XH50
After : ----|----|----|----|----|----|----|----|X10S|X10T|----|----|----|----|----|XH50
Before: ----|----|----|----|----|----|----|----|X10S|X10T|----|----|----|----|----|XH50
After : ----|----|----|----|----|----|----|----|----|----|----|----|----|----|----|XH50

As you can see there was no change effected by the html element, contrary to what I suggested. The fact that the document has a body immediately removed all frameset bits after exit ParseHead. Then something else removed HT20, and no more until after exit ParseHTML, the end of the document parsing...

Now we had four more constraints; the last, presumably the meta tag, kicked off everything except XH50, so we have XHTML5.

Now the sequences when there is not a meta charset tag - input5\in_377-3.html

Exit ParseHead 1...
R=6 C=1: Returning starttag node <body> stream
Before: HT20|HT32|H40S|H40T|H40F|H41S|H41T|H41F|X10S|X10T|X10F|XH11|XB10|----|HT50|XH50
After : HT20|HT32|H40S|H40T|----|H41S|H41T|----|X10S|X10T|----|XH11|XB10|----|HT50|XH50
Enter ParseBody...
Before: HT20|HT32|H40S|H40T|----|H41S|H41T|----|X10S|X10T|----|XH11|XB10|----|HT50|XH50
After : ----|HT32|H40S|H40T|----|H41S|H41T|----|X10S|X10T|----|XH11|XB10|----|HT50|XH50
R=7 C=3: Returning starttag node <table> stream
Exit ParseHTML 2...
Before: ----|HT32|H40S|H40T|----|H41S|H41T|----|X10S|X10T|----|XH11|XB10|----|HT50|XH50
After : ----|----|H40S|H40T|----|H41S|H41T|----|X10S|X10T|----|XH11|----|----|HT50|XH50
Before: ----|----|H40S|H40T|----|H41S|H41T|----|X10S|X10T|----|XH11|----|----|HT50|XH50
After : ----|----|----|----|----|----|----|----|X10S|X10T|----|XH11|----|----|----|XH50
Before: ----|----|----|----|----|----|----|----|X10S|X10T|----|XH11|----|----|----|XH50
After : ----|----|----|----|----|----|----|----|X10S|X10T|----|----|----|----|----|XH50

Naturally, largely the same, except for what I presume was the last meta kicker, so this time tidy has 3 bits to choose from, and here, as we know, it chose X10S (XHTML 1.0 Strict).

Now as pointed out in commit 7bdc31a, to at least not emit the warning for VERS_HTML5, one of the services involved is HTMLVersion, which is a bit complicated to read, but it seems that by the time this is called when checking the table element, the choice, if we can call it that, has been made.

Perhaps I even need more debug to show exactly the reasons for each constraint... what element, attribute is the cause... Naturally I am only showing events where the call to the service actually caused some change...

Anyway, will continue digging, especially how to include at least --doctype html5 in this choice...

As always ideas, discussion, patches, or a PR would be appreciated... thanks...
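
For anyone wanting to reproduce Before/After lines like the ones above, here is a minimal sketch of a formatter for such a bit mask. The column order is copied from the debug output in this comment, but the bit positions (bit 0 for HT20 up to bit 15 for XH50) and the treatment of the fourteenth column as an unnamed slot are assumptions made for this sketch, not tidy's actual layout.

```c
#include <string.h>

/* Column labels in the order shown in the debug output above.
   The NULL entry stands for the column that always prints "----". */
static const char *const sk_names[16] = {
    "HT20", "HT32", "H40S", "H40T", "H40F", "H41S", "H41T", "H41F",
    "X10S", "X10T", "X10F", "XH11", "XB10", NULL,   "HT50", "XH50"
};

/* Writes a string such as "HT20|----|...|XH50" into out.
   out must hold at least 80 bytes (16 labels, 15 bars, terminator). */
static void format_versions(unsigned bits, char *out)
{
    out[0] = '\0';
    for (int i = 0; i < 16; ++i) {
        /* Show the label only if the bit is set and the slot is named. */
        const char *name =
            (((bits >> i) & 1u) && sk_names[i]) ? sk_names[i] : "----";
        if (i > 0) {
            strcat(out, "|");
        }
        strcat(out, name);
    }
}
```

Calling `format_versions` before and after each constraint, and printing only when the two strings differ, gives the same kind of trace as shown in this comment.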

@geoffmcl
Contributor

@a3nm, have found a patch which will return XHTML5 before XHTML1, in specific circumstances, but need to do some more testing -

diff --git a/src/lexer.c b/src/lexer.c
index 097a1bc..9f68cb9 100644
--- a/src/lexer.c
+++ b/src/lexer.c
@@ -261,6 +261,8 @@ int TY_(HTMLVersion)(TidyDocImpl* doc)
     if (dtver == VERS_UNKNOWN) return HT50;
     /* Issue #167 - if NOT XHTML, and doctype is default VERS_HTML5, then return HT50 */
     if (!xhtml && (dtver == VERS_HTML5)) return HT50;
+    /* Issue #377 - If xhtml and (doctype == html5) and constrained vers contains XH50 return that */
+    if (xhtml && ((vers & VERS_HTML5) == XH50) && (dtmode == TidyDoctypeHtml5)) return XH50;

     for (i = 0; W3C_Doctypes[i].name; ++i)
     {

Meantime, since we are considering a release 5.2 shortly, changing the milestone here to 5.3...

But if you can get a chance to add this patch and confirm it works, and if I get time for more testing then maybe it could be included in 5.2!

@geoffmcl geoffmcl modified the milestones: 5.3, 5.2 Mar 27, 2016
@a3nm
Author

a3nm commented Mar 27, 2016

Thanks! I just tested it out, and it doesn't work, sorry (the warning
still occurs on the document that I provided in my original report).

However I tested that adding "return HT50;" just before the patch
silences the warning, so indeed it seems that the warning could be
silenced by editing this function.

Antoine Amarilli

@geoffmcl
Contributor

@a3nm you did add --doctype html5 to your config?

That is one of the required circumstances... thanks for testing...

PS: Please try to remember to delete this text if you directly email reply... it just clutters up the issues if left...

@a3nm
Author

a3nm commented Mar 27, 2016

Indeed, with "--doctype html5" the warning is suppressed with the patch,
whereas it occurs without the patch. Thanks!

I still stand by my opinion in the previous comments that the warning
shouldn't be emitted even without "--doctype html5", but it's already
much better to be able to silence the warning with this option.

Thanks again!

Antoine Amarilli

@geoffmcl
Contributor

@a3nm your persistence is winning through ;=))

If we assume libtidy defaults to html5 mode, then should not the default TidyDoctypeAuto also mean html5 is preferred?

Accordingly, have expanded the patch, and fixed another little niggle - MSVC10 prefers explicit Bool references - to :-

diff --git a/src/lexer.c b/src/lexer.c
index 097a1bc..3adf0bf 100644
--- a/src/lexer.c
+++ b/src/lexer.c
@@ -255,12 +255,18 @@ int TY_(HTMLVersion)(TidyDocImpl* doc)
     TidyDoctypeModes dtmode = (TidyDoctypeModes)cfg(doc, TidyDoctypeMode);
     Bool xhtml = (cfgBool(doc, TidyXmlOut) || doc->lexer->isvoyager) &&
                  !cfgBool(doc, TidyHtmlOut);
-    Bool html4 = dtmode == TidyDoctypeStrict || dtmode == TidyDoctypeLoose || VERS_FROM40 & dtver;
+    Bool html4 = ((dtmode == TidyDoctypeStrict) || (dtmode == TidyDoctypeLoose) ||
+                  (VERS_FROM40 & dtver) ? yes : no);
+    Bool html5 = (!html4 && ((dtmode == TidyDoctypeAuto) ||
+                  (dtmode == TidyDoctypeHtml5)) ? yes : no);

     if (xhtml && dtver == VERS_UNKNOWN) return XH50;
     if (dtver == VERS_UNKNOWN) return HT50;
     /* Issue #167 - if NOT XHTML, and doctype is default VERS_HTML5, then return HT50 */
     if (!xhtml && (dtver == VERS_HTML5)) return HT50;
+    /* Issue #377 - If xhtml and (doctype == html5) and constrained vers contains XH50 return that,
+       and really if tidy defaults to 'html5', then maybe 'auto' should also apply! */
+    if (xhtml && html5 && ((vers & VERS_HTML5) == XH50)) return XH50;

     for (i = 0; W3C_Doctypes[i].name; ++i)
     {
diff --git a/version.txt b/version.txt
index 7b3a3df..37cd9d6 100644
--- a/version.txt
+++ b/version.txt
@@ -1,2 +1,2 @@
-5.1.48
+5.1.48test
 2016.03.27

Then, tidy in default mode, ie no config, will not issue this summary warning!

Appreciate you, and others, testing more and commenting... this could make it into 5.2 if all positive feedback... thanks...

OT: Also love that if you add -lang fr you can read, perhaps bad?, French information ;=))
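
The new early return in the expanded patch can be read in isolation with the following standalone sketch. It is a simplification, not the code from lexer.c: the `sk_` names and bit values are invented, the `doctype_is_html4` parameter stands in for the `VERS_FROM40 & dtver` test, and a return value of 0 stands in for falling through to the W3C_Doctypes table lookup.

```c
#include <stdbool.h>

/* Illustrative bit values only. */
#define SK_HT50 (1u << 0)
#define SK_XH50 (1u << 1)
#define SK_X10S (1u << 2)
#define SK_X10T (1u << 3)
#define SK_VERS_HTML5 (SK_HT50 | SK_XH50)

/* Stand-in for the --doctype setting. */
typedef enum {
    SK_MODE_AUTO, SK_MODE_HTML5, SK_MODE_STRICT, SK_MODE_LOOSE
} sk_mode;

/* Returns SK_XH50 when the new rule applies, otherwise 0 to mean
   "fall through to the existing doctype table lookup". */
static unsigned prefer_xhtml5(bool xhtml, sk_mode mode,
                              bool doctype_is_html4, unsigned remaining)
{
    bool html4 = mode == SK_MODE_STRICT || mode == SK_MODE_LOOSE
                 || doctype_is_html4;
    /* 'auto' counts as HTML5-preferring, like an explicit html5. */
    bool html5 = !html4
                 && (mode == SK_MODE_AUTO || mode == SK_MODE_HTML5);

    /* XH50 must be the only HTML5-family bit still standing. */
    if (xhtml && html5 && (remaining & SK_VERS_HTML5) == SK_XH50) {
        return SK_XH50;
    }
    return 0;
}
```

For the bits left over in the original report (`SK_X10S | SK_X10T | SK_XH50`, XHTML output, mode auto) the sketch returns `SK_XH50`, so the summary warning is not emitted. With mode strict it returns 0 and the older selection logic still applies.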

@a3nm
Author

a3nm commented Mar 28, 2016

Hi,

I can confirm that this patch makes the warning disappear both with and
without "--doctype html5". Thanks!

Antoine Amarilli

@geoffmcl geoffmcl modified the milestones: 5.2, 5.3 Mar 30, 2016
geoffmcl added a commit that referenced this issue Mar 30, 2016
@geoffmcl
Contributor

@a3nm I like this fix because it again demonstrates the default html5 nature of tidy, while still handling legacy documents...

Accordingly, have moved this back to a 5.2 milestone, and included it, bumping the version to 5.1.50...

Perhaps this can now be closed? thanks...

@a3nm
Author

a3nm commented Mar 30, 2016

The latest master works for me, so as far as I'm concerned the issue can
be closed. Thanks for fixing!

Antoine Amarilli

@geoffmcl
Contributor

geoffmcl commented Apr 7, 2016

@a3nm hope you get a chance to checkout the new release 5.2.0, or if you prefer, to stay with master then just pull the better 5.3.0 ;=))

@geoffmcl geoffmcl closed this as completed Apr 7, 2016
@a3nm
Author

a3nm commented Apr 7, 2016

The current master is still OK for me w.r.t. this issue. Thanks again!

Antoine Amarilli

@NicHub

NicHub commented Nov 6, 2020

Use the -xml flag or the -ashtml flag (depending on your need) to suppress the warning.
