Directory listing chokes on non-ASCII chars #223
Comments
Thanks! The environment for the notebook server spawned by JupyterHub could easily be the problem. It appears that Within the world of the notebook server, the Python 3 interpreter (Ubuntu 14.04.1 LTS Strangely I cannot reproduce this with Python 3 from Anaconda on my Mac, even if I deliberately give it
I think that makes sense - it's using the locale encoding to convert filenames to bytes. I think it only comes up on Linux, because OS X and Windows define how filenames are encoded independently of the locale. JupyterHub should probably ensure that the single-user server is started with appropriate locale settings so that it treats the filesystem as UTF-8. The notebook server should fail more gracefully when there's a Unicode problem with a filename. I'll have a look at that.
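To illustrate the point about the locale encoding, here is a minimal sketch in plain Python 3 (not the notebook server's code). It assumes the filename is stored on disk as UTF-8 bytes and mimics what happens when those bytes are decoded under a UTF-8 locale versus an ASCII (C/POSIX) locale. On POSIX systems Python decodes filenames with the `surrogateescape` error handler, which is what the explicit `errors=` argument stands in for here.

```python
# Bytes of the filename "µ.ipynb" as stored on a UTF-8 filesystem
# ("\u00b5" is the micro sign from the report).
raw_name = "\u00b5.ipynb".encode("utf-8")

# What Python would produce under a UTF-8 locale: the real name.
decoded_utf8 = raw_name.decode("utf-8", errors="surrogateescape")

# What Python would produce under an ASCII locale: each byte >= 128
# is smuggled through as a lone surrogate (U+DC80 + byte - 0x80).
decoded_ascii = raw_name.decode("ascii", errors="surrogateescape")

print(ascii(decoded_utf8))   # '\xb5.ipynb'
print(ascii(decoded_ascii))  # '\udcc2\udcb5.ipynb'

# The escaped name still round-trips to the original bytes, so the file
# can be opened from Python, but the string itself is not valid Unicode text.
assert decoded_ascii.encode("ascii", errors="surrogateescape") == raw_name
```

The second form is what ends up in the directory listing when the server runs without a UTF-8 locale.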
@nkeim what version of JupyterHub? The Hub passes its LANG, LC env variables to the notebook children by default. Do you have any custom config?
Deferring to @ellisonbg for the JupyterHub version; he maintains it here.
@nkeim in that case, I know exactly what the problem is. The launch config (in supervisord) is not setting the LANG environment variable. This is in
to ensure that server processes get the right env.
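For anyone launching the server from their own wrapper rather than supervisord, the same idea can be sketched in Python. This is only an illustration: `child_env` and the default value `en_US.UTF-8` are hypothetical, and the locale has to actually exist on the machine. Note also that `LC_ALL`, if set, takes precedence over `LANG`.

```python
import os


def child_env(base_env, default_lang="en_US.UTF-8"):
    """Return a copy of base_env with LANG filled in if it is missing.

    An existing LANG value is left untouched; only the absent case is
    handled, which is the situation described in this thread.
    """
    env = dict(base_env)
    env.setdefault("LANG", default_lang)
    return env


# Example: an environment with no LANG gets a UTF-8 default,
# while an explicit setting is preserved.
print(child_env({"PATH": "/usr/bin"})["LANG"])   # en_US.UTF-8
print(child_env({"LANG": "de_DE.UTF-8"})["LANG"])  # de_DE.UTF-8

# Typical use (not executed here):
#   subprocess.Popen(cmd, env=child_env(os.environ))
```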
@minrk Great, thanks!
Part of the fix for jupytergh-223. If a filename can't be decoded in the current encoding, Python escapes the undecodable bytes as unpaired surrogates, which JS doesn't like building a URL from. This doesn't make the undecodable filename openable, but it stops it from breaking the listing of other files. The real fix is to set up the locale encoding correctly so that the filenames can be decoded.
#229 should stop it from breaking the rest of the list in such cases.
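A rough sketch of the "fail more gracefully" idea, for illustration only. This is not the code from #229, and the helper names `is_valid_text_name` and `split_listing` are made up. It relies on the fact that a string containing unpaired surrogates cannot be encoded as strict UTF-8, so such names can be detected and set aside instead of breaking the whole listing.

```python
def is_valid_text_name(name):
    """Return True if name can be encoded as strict UTF-8.

    Names that were decoded with surrogateescape under the wrong locale
    contain lone surrogates and fail this check.
    """
    try:
        name.encode("utf-8")
    except UnicodeEncodeError:
        return False
    return True


def split_listing(names):
    """Split names into (listable, undecodable), preserving order."""
    listable = [n for n in names if is_valid_text_name(n)]
    undecodable = [n for n in names if not is_valid_text_name(n)]
    return listable, undecodable


names = ["a.ipynb", "\udcc2\udcb5.ipynb", "\u00b5.ipynb", "z.ipynb"]
listable, undecodable = split_listing(names)
print([ascii(n) for n in listable])
print([ascii(n) for n in undecodable])
```

The files after the problematic entry (`z.ipynb` here) stay visible. The undecodable one still needs the locale fix before it can be opened.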
In a directory listing, a non-ASCII character in a filename (including notebook names) abruptly terminates the listing, making all files below it invisible.
Steps to reproduce are in the linked gist, though if you are a Mac user you can just type Option-m in the Terminal to put a µ in a filename, then check the Jupyter directory listing. The character in the linked example is "µ" (micro sign), but it appears that any byte outside range(128) will do it.
https://gist.github.com/nkeim/5798b1211d52ed47993b
This has to be run with Python 2; Python 3 insists on ASCII for the filename.
Note that the IPython version is just "3.2.0-dev". @ellisonbg may be able to tell you the precise commit, if that matters.
Background: I ended up with Unicode characters in my notebook names courtesy of IPython 2.x, which seemed to handle them perfectly.