html_error_template doesn't handle non-ascii characters in source code #37
Comments
Anonymous wrote: Er, this is pjenvey btw
Michael Bayer (@zzzeek) wrote: With regard to the patch, why would we go through all that trouble to parse the magic encoding comment when we are receiving a CompileException or SyntaxException? In both cases we have already lexed the template and already converted the source to unicode, so this encoding could be passed along to CompileException and SyntaxException. Also, "parser" is not defined, and the whole BOM thing, if we want to support it (which I'm not thrilled about, but I can be swayed), should just be part of Lexer. As for the second part of the question, putting encoding_errors as a parameter to Template/TemplateLookup: I would say it's not a big deal. The only reason it's not there is that there is already a way to convert to unicode explicitly (more than one way to do it... but not a big deal).
Michael Bayer (@zzzeek) wrote: OK, so in [changeset:259] I'm propagating the source of the original template as a unicode object to the
Changes by Michael Bayer (@zzzeek):
Migrated issue, originally created by Anonymous
Max Ischenko reported this here:
http://groups.google.com/group/pylons-discuss/browse_frm/thread/22e2af6ba7bbf778
I've attached a patch to convert the source code to unicode via the source code's magic encoding comment (like Myghty does -- in fact I pretty much ripped all of dairiki's code for this). In the rare case that the magic encoding comment doesn't describe the non-ascii characters correctly, the 'replace' error handler is used to put garbage characters in their place.
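For illustration only, the decoding step described above could look roughly like the following. This is a minimal sketch written for Python 3, not the code in the attached diff: the `decode_source` name, the regular expression, and the two-line search window are assumptions modeled loosely on PEP 263.

```python
import re

# Assumed pattern for a magic encoding comment such as
# "## -*- coding: utf-8 -*-" (in the spirit of PEP 263).
_CODING_RE = re.compile(rb"coding[:=]\s*([-\w.]+)")


def decode_source(raw, default="ascii"):
    """Decode template source bytes using its magic encoding comment.

    Hypothetical helper: looks for the comment on the first two lines,
    falls back to `default`, and uses the 'replace' error handler so
    that mis-declared bytes become U+FFFD instead of raising.
    An unknown encoding name still raises LookupError.
    """
    encoding = default
    for line in raw.splitlines()[:2]:
        match = _CODING_RE.search(line)
        if match:
            encoding = match.group(1).decode("ascii")
            break
    return raw.decode(encoding, "replace")


# Correctly declared source decodes cleanly.
good = "## -*- coding: utf-8 -*-\nпривет".encode("utf-8")
decoded_good = decode_source(good)

# Mis-declared source still decodes, with replacement characters.
bad = b"## -*- coding: ascii -*-\n" + "привет".encode("utf-8")
decoded_bad = decode_source(bad)
```

With 'replace', the mis-declared case yields U+FFFD characters rather than an exception, which matches the "garbage in its place" behaviour described above.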
But before we commit it, I think I would like to see an addition to the Template class.
Pylons currently does this in its exception handler to render the exception:
With this patch, and Max's example, Pylons now causes a UnicodeEncodeError here:
UnicodeEncodeError: 'ascii' codec can't encode characters in position 4-9: ordinal not in range(128)
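The failure itself is easy to reproduce outside Pylons. The snippet below is a small Python 3 sketch (the original report is from Python 2, where the encode to ASCII happens implicitly); the sample string is an arbitrary assumption, not Max's template.

```python
# Arbitrary sample text containing non-ASCII (Cyrillic) characters.
text = "Ошибка: template failed"

try:
    # 'strict' is the default error handler, so this raises.
    text.encode("ascii")
    strict_failed = False
except UnicodeEncodeError:
    strict_failed = True

# Standard-library error handlers avoid the exception:
# 'replace' substitutes '?', 'xmlcharrefreplace' emits numeric entities.
replaced = text.encode("ascii", "replace")
as_entities = text.encode("ascii", "xmlcharrefreplace")
```

Of the two built-in handlers, only 'xmlcharrefreplace' keeps the characters recoverable by a browser.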
For Pylons to render the html_error_template correctly (including the non-ascii characters showing up in the browser), it now has to do this:
Mike, I may have asked you this before, but is there a reason why Mako Templates don't have an 'encoding_errors' parameter to go along with 'output_encoding'?
If it did, we could have html_error_template set encoding_errors='htmlentityreplace'. Then anyone needing an html_error_template wouldn't have to be concerned with unicode at all -- they could simply render() (like Pylons is doing now) and always get back a sys.getdefaultencoding() encoded string with HTML entities for characters outside the default encoding. That is what Myghty did:
http://www.myghty.org/trac/changeset/2120
http://www.myghty.org/trac/changeset/2121
I remember arguing about this implementation a little with dairiki when he did this, but it seemed to work out pretty well.
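To make the idea concrete, an entity-replacing error handler can be registered with the standard `codecs` module. The following is a hedged Python 3 sketch, not Myghty's or Mako's actual implementation; the handler name `demo_htmlentityreplace` and the function name are made up for this example.

```python
import codecs
from html.entities import codepoint2name


def html_entity_replace(exc):
    """Hypothetical codec error handler: emit HTML entities.

    Uses a named entity when HTML defines one, otherwise a numeric
    character reference, for every character that cannot be encoded.
    """
    if not isinstance(exc, UnicodeEncodeError):
        raise exc
    pieces = []
    for ch in exc.object[exc.start:exc.end]:
        name = codepoint2name.get(ord(ch))
        pieces.append("&%s;" % name if name else "&#%d;" % ord(ch))
    # Return the replacement text and the position to resume encoding at.
    return "".join(pieces), exc.end


# Register under a made-up name to avoid clashing with real handlers.
codecs.register_error("demo_htmlentityreplace", html_entity_replace)

named = "café".encode("ascii", "demo_htmlentityreplace")
numeric = "Ош".encode("ascii", "demo_htmlentityreplace")
```

A template with an `encoding_errors` option could then pass such a handler name straight through to the final encode call, so callers of render() would never see a UnicodeEncodeError.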
Attachments: html_error_source_encoding.diff