We historically used -utf8 with older versions of html2text, but the new version defaults to UTF-8 and no longer accepts -utf8 as a command-line argument.
Version 2.1.1 help output:
% html2text -help
This is html2text, version 2.1.1
Usage:
  html2text -help
  html2text -version
  html2text [ -check ] [ -debug-scanner ] [ -debug-parser ] \
     [ -rcfile <file> ] [ -width <w> ] [ -nobs ] [ -links ]\
     [ -from_encoding ] [ -to_encoding ] [ -ascii ]\
     [ -o <file> ] [ <input-file> ] ...
Formats HTML document(s) read from <input-file> or STDIN and generates ASCII
text.
  -help           Print this text and exit
  -version        Print program version and copyright notice
  -check          Do syntax checking only
  -debug-scanner  Report parsed tokens on STDERR (debugging)
  -debug-parser   Report parser activity on STDERR (debugging)
  -rcfile <file>  Read <file> instead of "$HOME/.html2textrc"
  -width <w>      Optimize for screen widths other than 79
  -nobs           Do not render boldface and underlining (using backspaces)
  -links          Generate reference list with link targets
  -from_encoding  Treat input encoded as given encoding
  -to_encoding    Output using given encoding
  -ascii          Use plain ASCII for output instead of UTF-8
                  alias for: -to_encoding ASCII//TRANSLIT
  -o <file>       Redirect output into <file>
Old version help:
$ html2text -help
This is html2text, version 1.3.2a
Usage:
  html2text -help
  html2text -version
  html2text [ -unparse | -check ] [ -debug-scanner ] [ -debug-parser ] \
     [ -rcfile <file> ] [ -style ( compact | pretty ) ] [ -width <w> ] \
     [ -o <file> ] [ -nobs ] [ -ascii | -utf8 ] [ <input-url> ] ...
Formats HTML document(s) read from <input-url> or STDIN and generates ASCII
text.
  -help           Print this text and exit
  -version        Print program version and copyright notice
  -unparse        Generate HTML instead of ASCII output
  -check          Do syntax checking only
  -debug-scanner  Report parsed tokens on STDERR (debugging)
  -debug-parser   Report parser activity on STDERR (debugging)
  -rcfile <file>  Read <file> instead of "$HOME/.html2textrc"
  -style compact  Create a "compact" output format (default)
  -style pretty   Insert some vertical space for nicer output
  -width <w>      Optimize for screen widths other than 79
  -o <file>       Redirect output into <file>
  -nobs           Do not use backspaces for boldface and underlining
  -ascii          Use plain ASCII for output instead of ISO-8859-1
  -utf8           Assume both terminal and input stream are in UTF-8 mode
  -nometa         Don't try to recode input using 'meta' tag
It might have been nice to keep supporting -utf8 (maybe even unlisted in the -help output) as a no-op (as the default is UTF-8) so that existing scripts using html2text can work with both versions.
For now, I worked around this by first feature-checking -utf8 via -help's output and then either adding it or leaving it out.
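A rough sketch of that feature check in Python, for illustration only. The function names (`supports_utf8_flag`, `build_html2text_command`, `read_help_text`) are hypothetical and not taken from any existing project, and since I have not verified whether html2text prints its help to stdout or stderr, the sketch reads both.

```python
import subprocess
from typing import List


def supports_utf8_flag(help_text: str) -> bool:
    """Return True if the -help output lists -utf8 as a standalone token."""
    # Splitting on whitespace avoids matching "-utf8" inside a longer word.
    return "-utf8" in help_text.split()


def build_html2text_command(help_text: str) -> List[str]:
    """Build the argument list, adding -utf8 only when it is advertised."""
    cmd = ["html2text"]
    if supports_utf8_flag(help_text):
        # Older versions (e.g. 1.3.2a) need this flag for UTF-8 handling.
        cmd.append("-utf8")
    # Newer versions (e.g. 2.1.1) default to UTF-8 and reject the flag.
    return cmd


def read_help_text(executable: str = "html2text") -> str:
    """Run `<executable> -help` and return everything it printed."""
    # Assumption: the help text may arrive on either stream, so both are
    # captured and concatenated; the exit status is deliberately ignored.
    result = subprocess.run(
        [executable, "-help"],
        capture_output=True,
        text=True,
        check=False,
    )
    return result.stdout + result.stderr
```

With html2text installed, `build_html2text_command(read_help_text())` gives an argument list that should be accepted by either version shown above. The check costs one extra process start, so a caller may want to cache the result.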
dexterbg added a commit to dexterbg/html2text that referenced this issue on Aug 29, 2022
See also: thp/urlwatch#718