
Default script tag to JavaScript for pretty printing if no type parameter is present (which is valid HTML5), so that CDATA is going to be commented out. #15


Closed
wants to merge 1 commit into from


@daaain daaain commented Mar 14, 2012

Hey,

I have run into a rather worrying bug: it seems that if there's a <script> tag without a type attribute (which is perfectly valid in HTML5), Tidy wraps its contents in an uncommented CDATA section :(

Here's a very simple example which reproduces the error for me with Tidy compiled on Mac OS X Lion 10.7.3 using the latest (1st March 2012 – 9412ef6) commit.

Using...

tidy --write-back yes test.html

...on this:

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">


<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en-GB" lang="en-GB">


<head>

</head>

<body class=" x ">
<script>
    try { top.document.domain } catch (e) {
        var f = function() { document.body.innerHTML = ''; }
        setInterval(f,1);
        if (document.body) document.body.unload = f;
    }
</script>



<div id="skip-links">
    <p class="skip-link-p">
        Skip to: 
        <a accesskey="1" class="skip-link" href="#content">content</a>, 
        <a accesskey="2" class="skip-link" href="#nav-links">navigation</a>
    </p>
</div>


</body>
</html>

...results in this:


<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
    "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en-GB" lang=
"en-GB">
<head>
<meta name="generator" content=
"HTML Tidy for Mac OS X https://github.com/w3c/tidy-html5/tree/f2825b9" />
<title></title>
</head>
<body class=" x">
<script>
<![CDATA[
        try { top.document.domain } catch (e) {
                var f = function() { document.body.innerHTML = ''; }
                setInterval(f,1);
                if (document.body) document.body.unload = f;
        }
]]>
</script>
<div id="skip-links">
<p class="skip-link-p">Skip to: <a accesskey="1" class="skip-link"
href="#content">content</a>, <a accesskey="2" class="skip-link"
href="#nav-links">navigation</a></p>
</div>
</body>
</html>

@daaain
Author

daaain commented Mar 14, 2012

I have no experience in C (so please be gentle), but the changes in the pull request seem to have fixed the issue for me!

I tried to put || !AttrHasValue(type) into the first if as that seemed like a cleaner solution, but for some reason that threw an error.

@sideshowbarker
Contributor

If you serve your example as text/html, you should use the -ashtml option to have it output as real HTML instead of XHTML.

As far as I understand, what you reported is expected behavior: you have an XHTML doctype, so tidy assumes you want the document handled as XHTML, not HTML. Tidy therefore outputs the right thing for XHTML, which is to wrap the script contents in CDATA. That covers the case where the JS contains < or & characters and you serve the document as application/xhtml+xml: when browsers parse it with their XML parsers, they don't choke on those characters.

I think changing the current XHTML output behavior so that those CDATA delimiters are omitted (or commented out) is not what most users would want.
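
To illustrate the point about XML parsers, here is a minimal Python sketch. The sample script body and the `parses_as_xml` helper are made up for this example, and Python's `xml.etree.ElementTree` merely stands in for a browser's XML parser. It suggests that a script body with a bare < or & is not well-formed XML unless it is wrapped in CDATA:

```python
import xml.etree.ElementTree as ET

# Hypothetical script body containing both characters XML treats as markup.
JS = "if (a < b && c) { run(); }"

BARE = "<script>" + JS + "</script>"
WRAPPED = "<script><![CDATA[" + JS + "]]></script>"


def parses_as_xml(markup):
    """Return True if an XML parser accepts the markup as well-formed."""
    try:
        ET.fromstring(markup)
        return True
    except ET.ParseError:
        return False


print(parses_as_xml(BARE))     # the bare < and & are rejected
print(parses_as_xml(WRAPPED))  # the CDATA section makes the same text legal
```

The wrapped variant hands the script text back unchanged, which is what an XHTML document served as application/xhtml+xml needs.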

Maybe we can add a -aspolyglot option for the case where you want to author your content as XHTML but serve it as text/html. For that case, the current XHTML output codepath would be followed but parts of it like the CDATA thing would need to have an additional condition to check for the polyglot option.
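
For the polyglot case, the "commented out" CDATA mentioned in the pull request title can be sketched as follows. This is only an illustration under assumptions: the markup is a made-up sample, the `ScriptText` and `script_text_as_html` names are hypothetical, and Python's `html.parser` and `xml.etree.ElementTree` stand in for a browser's HTML and XML parsers. The idea is that putting each CDATA delimiter behind a JS line comment keeps the document well-formed as XML while leaving only comments for a JS engine that receives the text via text/html:

```python
import xml.etree.ElementTree as ET
from html.parser import HTMLParser

# Hypothetical script body; the CDATA delimiters sit behind JS line comments.
MARKUP = "<script>//<![CDATA[\nif (a < b && c) { run(); }\n//]]></script>"


class ScriptText(HTMLParser):
    """Collect the raw text an HTML parser hands over for <script>."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        self.chunks.append(data)


def script_text_as_html(markup):
    """Return the script content as seen by an HTML (not XML) parser."""
    parser = ScriptText()
    parser.feed(markup)
    parser.close()
    return "".join(parser.chunks)


# XML view: the CDATA section is consumed, leaving bare "//" comment stubs.
xml_text = ET.fromstring(MARKUP).text
# HTML view: the delimiters stay in the text, each behind a "//" comment.
html_text = script_text_as_html(MARKUP)

print(repr(xml_text))
print(repr(html_text))
```

In both views, every line that is not the actual script statement starts with `//`, so the script engine would see it as a comment.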

@daaain
Author

daaain commented Mar 14, 2012

I can't find an -ashtml option, but I presume you meant --output-html, as that seems to do the job, omitting the CDATA wrapper entirely – not sure how I missed it!

This solves my problem, so I'm closing the pull request, which is the wrong solution to the wrong problem :)
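
For anyone scripting this, a minimal Python sketch of the invocation might look like the following. It uses only the two options already mentioned in this thread (`--write-back` and `--output-html`); the `tidy_command` helper is a hypothetical name, and the file name `test.html` is taken from the example above.

```python
import os
import shutil
import subprocess


def tidy_command(path, output_html=True):
    """Build the argument list for an in-place Tidy run on `path`."""
    cmd = ["tidy", "--write-back", "yes"]
    if output_html:
        # Option named in this thread: ask for HTML instead of XHTML output.
        cmd += ["--output-html", "yes"]
    return cmd + [path]


if __name__ == "__main__":
    cmd = tidy_command("test.html")
    print(" ".join(cmd))
    # Only run when a tidy binary and the input file are actually present.
    if shutil.which("tidy") and os.path.exists("test.html"):
        # check=False because Tidy exits non-zero when it merely reports warnings.
        subprocess.run(cmd, check=False)
```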
