Minification of escape sequences produces invalid documents #109

Closed
sevmeyer opened this issue Sep 2, 2022 · 4 comments
Labels
enhancement New feature or request

Comments

sevmeyer commented Sep 2, 2022

Python 3.10.5, minify-html 0.10.0 (pip install)

Hello. I might be doing it wrong, but I have encountered some issues with escape sequences, such as &lt;. Some of the minifications produce documents that are invalid according to https://validator.w3.org

Minimal valid input:

import minify_html
minified = minify_html.minify(
  '<!doctype html><html lang=en><meta charset=utf-8><title>T</title>&lt;',
  do_not_minify_doctype=True)

Output:

<!doctype html><html lang=en><meta charset=utf-8><title>T</title><

Error: End of file after <.

Another example:

<!doctype html><html lang=en><meta charset=utf-8><title>T</title>&lt;br&gt;

Output:

<!doctype html><html lang=en><meta charset=utf-8><title>T</title>&LTbr>

Error: Named character reference was not terminated by a semicolon. (Or & should have been escaped as &amp;.)

In general, it would be nice if there was a configuration argument to preserve escape sequences.
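Until such an option exists, one possible stopgap is to scan the minified output for named character references that are not closed by a semicolon, which is the condition the validator complains about in the second example. The sketch below is only an illustration: `find_unterminated_refs` is a hypothetical helper, not part of minify-html, and the regular expression is a rough heuristic that will also flag things like `&b` in an unescaped query string.

```python
import re

# Hypothetical helper, not part of minify-html.
# Matches "&name" where the name is not immediately followed by ";".
_UNTERMINATED_REF = re.compile(r"&[A-Za-z][A-Za-z0-9]*(?![A-Za-z0-9;])")


def find_unterminated_refs(markup: str) -> list[str]:
    """Return every named character reference that lacks a closing semicolon."""
    return _UNTERMINATED_REF.findall(markup)


original = "<title>T</title>&lt;br&gt;"
minified = "<title>T</title>&LTbr>"

print(find_unterminated_refs(original))  # properly terminated references are not reported
print(find_unterminated_refs(minified))  # the shortened reference is reported
```

A check like this could run after minification in a build step to catch output that would fail validation.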

@wilsonzlin
Owner

This is intentional, and the README mentions that the minified output might not pass validation but will still be parsed correctly by browsers. I might look into adding an option to prevent invalid entity representations.
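To illustrate the claim about lenient parsing, here is a minimal sketch using Python's standard `html.unescape`, which is documented to follow the HTML 5 rules for both valid and invalid character references. It only decodes references in a string; it is not a full browser parser, so it says nothing about how a trailing `<` at end of file is tokenized. The `decode_text` wrapper is a hypothetical name used for this example.

```python
import html


def decode_text(fragment: str) -> str:
    """Decode character references the way the HTML 5 rules describe (stdlib only)."""
    return html.unescape(fragment)


# Text taken from the minified outputs reported in this thread.
for fragment in ("&LTbr>", "#include&LTcstdio>"):
    print(repr(fragment), "->", repr(decode_text(fragment)))
```

Under these rules the unterminated `&LT` still decodes to `<`, so the text content matches the original input even though the validator rejects the markup.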

@wilsonzlin wilsonzlin added the enhancement New feature or request label Jan 5, 2023
Repository owner deleted a comment from zow34tq Jan 5, 2023

cubercsl commented Jan 31, 2023

I have the same issue. A minimal example:

<pre><code>#include&lt;cstdio&gt;</code></pre>

Output:

<pre><code>#include&LTcstdio></code></pre>

This caused the code to display unexpected content.

@wilsonzlin
Owner

In version 0.16.0 (soon to be released), these entity minifications will no longer be done by default but can be enabled with the allow_optimal_entities option. If you test it out, let me know if it works for you.

@Rongronggg9

Hi @wilsonzlin,
Sorry to bother you. May I ask when 0.16.0 will be released?
