
Large notebook fails to save #650

Closed
glider-gun opened this issue Oct 24, 2015 · 27 comments

@glider-gun

When I was using IPython notebook to analyze our experiment data, I noticed I could not save the notebook.
The console (from which I started ipython notebook) showed:

[I 17:37:11.736 NotebookApp] Malformed HTTP message from ::1: Content-Length too long

So I guess this problem comes from the notebook size.
I was using the "bokeh" library to plot my data, and the notebook file was about 100 MB on disk.

To reproduce, I prepared a new notebook and made many plots to produce a notebook with a large file size.

[Screenshot: a notebook cell that repeatedly draws 30001-point bokeh plots]

This draws a 30001-point plot repeatedly (e.g. 100 plots in the screenshot above). I could not save that notebook either: when I repeated saving while increasing the number of plots, saving again failed once the file grew beyond about 100 MB (with the same console message).
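
For reference, a minimal sketch of this kind of reproduction (the actual code was in the screenshot, so the names and figure options below are assumptions, not the exact code I ran):

# Rough reconstruction of the reproduction: many 30001-point bokeh plots in one notebook.
import numpy as np
from bokeh.plotting import figure, output_notebook, show

output_notebook()

x = np.linspace(0, 1, 30001)
for _ in range(100):           # ~100 inline plots push the .ipynb toward 100 MB
    p = figure()
    p.line(x, np.random.random(x.size))
    show(p)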

In a little more detail, I could save the notebook with up to 88 plots, at which point the notebook file size was 104756892 bytes (99.904 MB), but I could not save it with 89 plots. Each additional plot increased the file size by about 1.1 MB.

I searched the issue list but could not find anything about this.
Is this limit intentional? Is there a workaround for this problem (other than removing cells from the notebook)?


My environment is:

$ python -c "import IPython; print(IPython.sys_info())"
{'commit_hash': u'2d95975',
 'commit_source': 'installation',
 'default_encoding': 'UTF-8',
 'ipython_path': '/opt/local/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/IPython',
 'ipython_version': '3.2.1',
 'os_name': 'posix',
 'platform': 'Darwin-13.4.0-x86_64-i386-64bit',
 'sys_executable': '/opt/local/Library/Frameworks/Python.framework/Versions/2.7/Resources/Python.app/Contents/MacOS/Python',
 'sys_platform': 'darwin',
 'sys_version': '2.7.10 (default, Aug 26 2015, 18:15:57) \n[GCC 4.2.1 Compatible Apple LLVM 5.1 (clang-503.0.40)]'}

OS: Mac (OSX, 10.9.5 Mavericks)
Browser: Safari 9.0 (9537.86.1.56.2)
matplotlib python library ver. 1.4.3
numpy python library ver. 1.9.2
bokeh python library ver. 0.10.0

@willingc
Member

Hi @glider-gun. Thank you for the detailed issue report. The details are really helpful to our developers.

I believe the scenario and error message that you are seeing are hitting the default max_body_size limitation of Tornado (which is a dependency of Jupyter notebook): https://github.com/tornadoweb/tornado/blob/eaf34865a63460cdd64abd1ae2c8835b174c6e93/tornado/http1connection.py#L537

@glider-gun I don't know if using Python 3 and a more recent version of IPython would have the same limitation. If you are able to test easily, please do. If not, no worries.

@minrk @Carreau Is there a way to work around the default max_body_size limit by chunking the body (https://github.com/tornadoweb/tornado/blob/eaf34865a63460cdd64abd1ae2c8835b174c6e93/tornado/http1connection.py#L346) or setting a different body_size limit (https://github.com/tornadoweb/tornado/blob/eaf34865a63460cdd64abd1ae2c8835b174c6e93/tornado/http1connection.py#L324)?
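
For illustration, Tornado's HTTPServer does accept max_body_size and max_buffer_size keyword arguments, so a larger limit could in principle be passed wherever the notebook server constructs its HTTPServer. A minimal standalone sketch (the application and values here are placeholders, not the notebook's actual code):

# Sketch only: raising Tornado's body/buffer limits when constructing the HTTP server.
import tornado.httpserver
import tornado.ioloop
import tornado.web

app = tornado.web.Application([])  # placeholder; the notebook builds its own Application
server = tornado.httpserver.HTTPServer(
    app,
    max_body_size=512 * 1024 * 1024,    # allow request bodies up to 512 MB
    max_buffer_size=512 * 1024 * 1024,  # allow stream buffers up to 512 MB
)
server.listen(8888)
tornado.ioloop.IOLoop.current().start()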

@Carreau
Member

Carreau commented Oct 24, 2015

We might want to look at this problem while working on #536

@glider-gun
Author

Thank you for the quick reply!
I tried the same thing with IPython 4.0.0 on both Python 2.7.10 and 3.4.3.
Similarly, I could save with up to 88 plots but not with 89. The file sizes of the notebooks were similar.


Environment for Python 2.7.10 with IPython 4.0.0:

$ python -c "import IPython; print(IPython.sys_info())"
{'commit_hash': u'f534027',
 'commit_source': 'installation',
 'default_encoding': 'UTF-8',
 'ipython_path': '/opt/local/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/IPython',
 'ipython_version': '4.0.0',
 'os_name': 'posix',
 'platform': 'Darwin-13.4.0-x86_64-i386-64bit',
 'sys_executable': '/opt/local/Library/Frameworks/Python.framework/Versions/2.7/Resources/Python.app/Contents/MacOS/Python',
 'sys_platform': 'darwin',
 'sys_version': '2.7.10 (default, Aug 26 2015, 18:15:57) \n[GCC 4.2.1 Compatible Apple LLVM 5.1 (clang-503.0.40)]'}

OS, browser, and all library versions are the same as in my first comment.


Environment for Python 3.4.3 with IPython 4.0.0:

$ python -c "import IPython; print(IPython.sys_info())"
{'commit_hash': 'f534027',
 'commit_source': 'installation',
 'default_encoding': 'UTF-8',
 'ipython_path': '/Users/glidergun/.pyenv/versions/miniconda3-3.16.0/lib/python3.4/site-packages/IPython',
 'ipython_version': '4.0.0',
 'os_name': 'posix',
 'platform': 'Darwin-13.4.0-x86_64-i386-64bit',
 'sys_executable': '/Users/glidergun/.pyenv/versions/miniconda3-3.16.0/bin/python',
 'sys_platform': 'darwin',
 'sys_version': '3.4.3 |Continuum Analytics, Inc.| (default, Oct 20 2015, '
                '14:27:51) \n'
                '[GCC 4.2.1 (Apple Inc. build 5577)]'}

(the user name in this output was replaced by hand)
OS, browser, and all library versions are the same, except numpy was 1.10.1.

@willingc
Member

@glider-gun Thanks for the additional info. For now, I recommend keeping an eye on #536 as suggested by @Carreau.

In the interim, I wonder if saving more frequently would be a reasonable workaround to the limitation.

@glider-gun
Author

I see, thank you.
I'm afraid that wouldn't work around the problem once I've already exceeded the limit, though it would help preserve more cells in case of a browser crash or the like in that situation.

@takluyver
Member

It looks like tornado imposes a maximum size of 100 MB for HTTP requests by default, and I don't think we currently override that anywhere:
https://github.com/tornadoweb/tornado/blob/a97ec9569b1995d8aa3da0a7f499510bffc006a3/tornado/iostream.py#L154

In the long run, the fix will be to maintain notebook models on the server, so we don't have to send the whole notebook over HTTP at once. But we should probably increase that limit as an interim measure.

@michaelmesser

Any solution? I have a large notebook that I want to save.

@Carreau Carreau modified the milestones: 4.4, 4.3 Jul 29, 2016
@rpep

rpep commented Aug 12, 2016

@davidcortesortuno and I are also having this problem with Holoviews HoloMaps, where it's quite easy to go over 100 MB.

@davidcortesortuno

We temporarily fixed this by modifying the tornado/iostream.py file, as suggested above by @takluyver, for example by setting self.max_buffer_size = 1048576000.
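
If editing the installed Tornado source is not an option, an untested alternative (my own sketch, not something verified in this thread) is to patch the default from a small launcher script before the notebook server starts:

# Untested sketch: raise Tornado's default max_buffer_size without editing iostream.py.
# Wraps BaseIOStream.__init__ so a missing or None max_buffer_size defaults to ~1000 MB.
import tornado.iostream

_orig_init = tornado.iostream.BaseIOStream.__init__

def _patched_init(self, *args, **kwargs):
    if kwargs.get("max_buffer_size") is None:
        kwargs["max_buffer_size"] = 1048576000  # ~1000 MB instead of the 100 MB default
    _orig_init(self, *args, **kwargs)

tornado.iostream.BaseIOStream.__init__ = _patched_init

The notebook server would then need to be launched from the same process so the patch is in effect (e.g. by calling notebook.notebookapp.main() afterwards, assuming that entry point exists in your version).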

@horta

horta commented Dec 8, 2016

Having the same problem here...

@takluyver takluyver modified the milestones: 4.4, 5.0 Feb 2, 2017
@takluyver
Member

Shall we bump the limit up for 5.0? Does anyone have a guide as to what a sensible limit might be?

@gnestor
Contributor

gnestor commented Feb 4, 2017

You should be able to set a larger max_body_size or max_buffer_size by providing the tornado_settings flag or by setting it in jupyter_notebook_config.py:

jupyter notebook --NotebookApp.tornado_settings="{'max_body_size': 104857600, 'max_buffer_size': 104857600}"
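
The jupyter_notebook_config.py equivalent would look roughly like the sketch below (the 100 MB values simply mirror the flag above; whether these settings actually reach Tornado's server is what later comments in this thread dispute):

# In jupyter_notebook_config.py (create one with `jupyter notebook --generate-config` if needed)
c.NotebookApp.tornado_settings = {
    'max_body_size': 104857600,    # maximum HTTP request body size, in bytes
    'max_buffer_size': 104857600,  # maximum stream read buffer size, in bytes
}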

I don't have a good notebook to test with, but the rationale:

@takluyver takluyver removed this from the 5.0 milestone Feb 6, 2017
@SamuelMarks

@gnestor That didn't help me; I'm still getting a Request Entity Too Large message.

Oh, also I'm doing this over HTTPS if that makes a difference.

@gnestor
Contributor

gnestor commented Mar 29, 2017

@SamuelMarks Can you try upgrading to notebook 5.0.0rc2 (pip install notebook --force-reinstall --no-deps --ignore-installed --pre) and see if the new rate limits help?

@SamuelMarks

SamuelMarks commented Mar 29, 2017

@gnestor Weird, can't get it to work at all now.

Even tried in a new virtualenv:

$ pip install --pre jupyter[all] notebook

But still getting:

$ jupyter notebook
Error executing Jupyter command 'notebook': [Errno 2] No such file or directory

Edit: wait, am I meant to use python3 -m notebook now instead?

@gnestor
Contributor

gnestor commented Mar 29, 2017

Did you try just pip install --pre notebook?

@SamuelMarks

@gnestor - Okay, got it to work with the latest --pre of notebook.

Same Request Entity Too Large error, though.

@takluyver
Member

We have increased the default limit in 5.0 (#2139), and it is possible to configure a still larger size.

@gnestor, you marked this for 5.1 - what do you want to do? Bump up the default limit still further? Make it easier to configure the limit?

@gnestor gnestor modified the milestones: Reference, 5.1 Jul 3, 2017
@gnestor
Contributor

gnestor commented Jul 3, 2017

@takluyver I think these default limits should suffice for now. Let's close this. For reference, if any users encounter this issue (not being able to save a notebook because of its size), you can increase the limit by editing these lines: https://github.com/jupyter/notebook/blob/master/notebook/notebookapp.py#L237-L238
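
In notebook versions where these limits are exposed as NotebookApp config traits rather than hard-coded values, something like the following should work without editing source; treat the trait names and availability as assumptions to check against your installed version:

# In jupyter_notebook_config.py, assuming your notebook version exposes these traits:
c.NotebookApp.max_body_size = 1073741824    # bytes (1 GB); raise above your notebook size
c.NotebookApp.max_buffer_size = 1073741824  # bytes (1 GB); raise above your notebook size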

@j-andrews7

For anyone else finding this before #3829 is actually merged, the only solution in this thread that currently works in notebook 5.6.0 is to modify the tornado/iostream.py file, as suggested earlier by @takluyver and @davidcortesortuno, for example by setting self.max_buffer_size = 1048576000 (line 238).

Passing the arguments to jupyter notebook when starting it doesn't work, nor does editing the notebookapp.py file.

@rawmarshmellows

rawmarshmellows commented Aug 22, 2018

@gnestor I've encountered a similar bug. Initially I got the Request Entity Too Large error, and after modifying iostream.py it worked fine for a while: I was able to keep working on the notebook and saving it (even though the notebook size was over 100 MB). But as the notebook grew bigger (through plotting of images), when I try to save it my browser simply crashes without logging anything in the console.

After much debugging, I believe this is triggered by the large number of images I am saving. Once the number of images in the notebook goes above a certain threshold, everything shuts down without warning. Any ideas?

@j-andrews7

j-andrews7 commented Aug 22, 2018

@kevinlu1211 Yeah, I basically have the same issue, but I think it's just due to the browser running out of memory, as Chrome only allocates up to 1.8 GB of memory per tab by default. Watch the memory usage when it runs; if the tab dies after growing to about that size, that's probably your problem. Fortunately, you can adjust this as described here, which has so far fixed my issue, though I suspect I will hit it again if the tab reaches ~3.5 GB.

@rawmarshmellows

@j-andrews7 I don't think it was my browser reaching the memory limit, but I applied the fix regardless and it still didn't work. Any other ideas?

@j-andrews7

@kevinlu1211 nope, sorry mate. Maybe try a different browser?

@rpep

rpep commented Aug 23, 2018

You could maybe try:

%matplotlib inline
import matplotlib.pyplot as plt

plt.figure(dpi=70)

to reduce the resolution (and therefore the embedded size) of the images?
