Remote Jupyter connection with long running command: Extension host terminated due to memory #3324

Closed
simplelife2010 opened this issue Jun 18, 2019 · 9 comments

@simplelife2010

Environment data

  • VS Code version: 1.35.1
  • Extension version (available under the Extensions sidebar): 2019.5.18875
  • OS and version: macOS Mojave 10.14.5
  • Python version (& distribution if applicable, e.g. Anaconda): 3.6.8 (Anaconda)
  • Type of virtual environment used (N/A | venv | virtualenv | conda | ...): conda
  • Relevant/affected Python packages and their versions: JupyterLab 0.35.6
  • Jedi or Language Server? (i.e. what is "python.jediEnabled" set to; more info: "How to update the language server to the latest stable version", vscode-python#3977): True

Expected behaviour

When connected to a remote Jupyter server, I expect the VS Code extension host not to terminate unexpectedly, even when I run long-running commands remotely (> 50 minutes), for example training machine learning models with TensorFlow.

Actual behaviour

When I connect to a remote Jupyter server and execute a long-running command (> 50 minutes) such as model.fit(), a popup reading 'Extension host terminated unexpectedly' appears after approximately 50 minutes, with an option to restart the extension host. The connection to the Jupyter session is lost and I receive no further output from the long-running command, even after restarting the extension host. I cannot reconnect to the kernel, which appears to be still running (as I can see on the server). I also cannot interrupt or restart this kernel; the corresponding buttons do not respond. To continue working, I have to kill the running kernel on the Jupyter server. After doing so, when I try to start another Jupyter session from VS Code, I receive the message 'Cannot execute code, session has been disposed.'

Steps to reproduce:

  1. Set up a Python file with IPython cells.
  2. Configure an external Jupyter server (specify the Jupyter server URI).
  3. Execute a command via IPython that runs longer than 50 minutes.

Logs

No output in the output panel

@simplelife2010
Author

The output from developer tools console:

[Extension Host] Info Python Extension: 2019-06-18 18:50:18: Cached data exists getEnvironmentVariables, /
workbench.main.js:3311 Extension Host
workbench.main.js:3311 FATAL ERROR: Scavenger: semi-space copy
 Allocation failed - process out of memory
 1: node::Abort() [/Applications/Visual Studio Code.app/Contents/Frameworks/Electron Framework.framework/Versions/A/Libraries/libnode.dylib]
 2: node::FatalError(char const*, char const*) [/Applications/Visual Studio Code.app/Contents/Frameworks/Electron Framework.framework/Versions/A/Libraries/libnode.dylib]
 3: v8::internal::FatalProcessOutOfMemory(char const*) [/Applications/Visual Studio Code.app/Contents/Frameworks/Electron Framework.framework/Versions/A/Libraries/libnode.dylib]
 4: v8::internal::FatalProcessOutOfMemory(char const*) [/Applications/Visual Studio Code.app/Contents/Frameworks/Electron Framework.framework/Versions/A/Libraries/libnode.dylib]
 5: v8::internal::FatalProcessOutOfMemory(char const*) [/Applications/Visual Studio Code.app/Contents/Frameworks/Electron Framework.framework/Versions/A/Libraries/libnode.dylib]
 6: v8::internal::ScavengeJob::ScheduleIdleTaskIfNeeded(v8::internal::Heap*, int) [/Applications/Visual Studio Code.app/Contents/Frameworks/Electron Framework.framework/Versions/A/Libraries/libnode.dylib]
 7: v8::Testing::DeoptimizeAll(v8::Isolate*) [/Applications/Visual Studio Code.app/Contents/Frameworks/Electron Framework.framework/Versions/A/Libraries/libnode.dylib]
 8: v8::internal::Heap::RootIsImmortalImmovable(int) [/Applications/Visual Studio Code.app/Contents/Frameworks/Electron Framework.framework/Versions/A/Libraries/libnode.dylib]
 9: v8::internal::Heap::CreateFillerObjectAt(unsigned char*, int, v8::internal::ClearRecordedSlots) [/Applications/Visual Studio Code.app/Contents/Frameworks/Electron Framework.framework/Versions/A/Libraries/libnode.dylib]
10: v8::internal::Heap::CreateFillerObjectAt(unsigned char*, int, v8::internal::ClearRecordedSlots) [/Applications/Visual Studio Code.app/Contents/Frameworks/Electron Framework.framework/Versions/A/Libraries/libnode.dylib]
11: v8::internal::PagedSpaces::next() [/Applications/Visual Studio Code.app/Contents/Frameworks/Electron Framework.framework/Versions/A/Libraries/libnode.dylib]
12: v8::internal::Factory::NewRawTwoByteString(int, v8::internal::PretenureFlag) [/Applications/Visual Studio Code.app/Contents/Frameworks/Electron Framework.framework/Versions/A/Libraries/libnode.dylib]
13: v8::internal::SourcePositionTableIterator::SourcePositionTableIterator(v8::internal::Handle<v8::internal::ByteArray>) [/Applications/Visual Studio Code.app/Contents/Frameworks/Electron Framework.framework/Versions/A/Libraries/libnode.dylib]
14: v8::internal::Isolate::random_number_generator() [/Applications/Visual Studio Code.app/Contents/Frameworks/Electron Framework.framework/Versions/A/Libraries/libnode.dylib]
15: v8::internal::Isolate::random_number_generator() [/Applications/Visual Studio Code.app/Contents/Frameworks/Electron Framework.framework/Versions/A/Libraries/libnode.dylib]
16: v8::internal::Isolate::random_number_generator() [/Applications/Visual Studio Code.app/Contents/Frameworks/Electron Framework.framework/Versions/A/Libraries/libnode.dylib]
17: v8::internal::Isolate::random_number_generator() [/Applications/Visual Studio Code.app/Contents/Frameworks/Electron Framework.framework/Versions/A/Libraries/libnode.dylib]
18: v8::internal::Isolate::random_number_generator() [/Applications/Visual Studio Code.app/Contents/Frameworks/Electron Framework.framework/Versions/A/Libraries/libnode.dylib]
19: v8::internal::Isolate::random_number_generator() [/Applications/Visual Studio Code.app/Contents/Frameworks/Electron Framework.framework/Versions/A/Libraries/libnode.dylib]
20: v8::internal::Isolate::random_number_generator() [/Applications/Visual Studio Code.app/Contents/Frameworks/Electron Framework.framework/Versions/A/Libraries/libnode.dylib]
21: v8::internal::Isolate::random_number_generator() [/Applications/Visual Studio Code.app/Contents/Frameworks/Electron Framework.framework/Versions/A/Libraries/libnode.dylib]
22: v8::internal::Isolate::random_number_generator() [/Applications/Visual Studio Code.app/Contents/Frameworks/Electron Framework.framework/Versions/A/Libraries/libnode.dylib]
23: v8::internal::operator<<(std::__1::basic_ostream<char, std::__1::char_traits<char> >&, v8::internal::Runtime::FunctionId) [/Applications/Visual Studio Code.app/Contents/Frameworks/Electron Framework.framework/Versions/A/Libraries/libnode.dylib]
24: 0x2e556508525d
25: 0x2e556508cb43
26: 0x2e55653357b8

<--- Last few GCs --->

[10840:0x7fe905014a00]  2997386 ms: Mark-sweep 1835.1 (2083.7) -> 1835.1 (2083.7) MB, 26.7 / 0.0 ms  allocation failure GC in old space requested
[10840:0x7fe905014a00]  2997427 ms: Mark-sweep 1835.1 (2083.7) -> 1835.1 (2076.7) MB, 35.0 / 0.0 ms  last resort GC in old space requested
[10840:0x7fe905014a00]  2997463 ms: Mark-sweep 1835.1 (2076.7) -> 1835.1 (2076.7) MB, 35.6 / 0.0 ms  last resort GC in old space requested


<--- JS stacktrace --->
Cannot get stack trace in GC.

workbench.main.js:3183 Extension host terminated unexpectedly. Code:  null  Signal:  SIGABRT
_onExtensionHostCrashed @ workbench.main.js:3183
_onExtensionHostCrashed @ workbench.main.js:3890
_onExtensionHostCrashOrExit @ workbench.main.js:3183
e.onDidExit @ workbench.main.js:3183
fire @ workbench.main.js:77
_onExtHostProcessExit @ workbench.main.js:3316
_extensionHostProcess.on @ workbench.main.js:3311
emit @ events.js:182
ChildProcess._handle.onexit @ internal/child_process.js:237
workbench.main.js:2379 Extension host terminated unexpectedly.
onDidNotificationChange @ workbench.main.js:2379
_register.model.onDidNotificationChange.e @ workbench.main.js:2379
fire @ workbench.main.js:77
notify @ workbench.main.js:2435
notify @ workbench.main.js:3660
prompt @ workbench.main.js:3660
_onExtensionHostCrashed @ workbench.main.js:3891
_onExtensionHostCrashOrExit @ workbench.main.js:3183
e.onDidExit @ workbench.main.js:3183
fire @ workbench.main.js:77
_onExtHostProcessExit @ workbench.main.js:3316
_extensionHostProcess.on @ workbench.main.js:3311
emit @ events.js:182
ChildProcess._handle.onexit @ internal/child_process.js:237
2workbench.main.js:1406   ERR No file system provider found for ssh://aim-gpu-s/cnn-experiments/crnn-example.py: ENOPRO: No file system provider found for ssh://aim-gpu-s/cnn-experiments/crnn-example.py
    at b.<anonymous> (file:///Applications/Visual Studio Code.app/Contents/Resources/app/out/vs/workbench/workbench.main.js:3336:910)
    at Generator.next (<anonymous>)
    at r (file:///Applications/Visual Studio Code.app/Contents/Resources/app/out/vs/workbench/workbench.main.js:34:454)
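
The "Last few GCs" lines in the log above can be read mechanically. The following is a minimal sketch, assuming the usual V8 trace layout in which the first number is milliseconds since the process started and each "A (B) -> C (D) MB" pair gives the used (and total committed) heap before and after the collection. The names GC_LINES, GC_PATTERN and parse_gc_line are made up for this illustration. Under that reading, the crash happened roughly 50 minutes into the extension host's lifetime and none of the three collections freed any memory.

```python
import re

# Lines copied from the "Last few GCs" section of the console output above.
GC_LINES = [
    "[10840:0x7fe905014a00]  2997386 ms: Mark-sweep 1835.1 (2083.7) -> 1835.1 (2083.7) MB, 26.7 / 0.0 ms  allocation failure GC in old space requested",
    "[10840:0x7fe905014a00]  2997427 ms: Mark-sweep 1835.1 (2083.7) -> 1835.1 (2076.7) MB, 35.0 / 0.0 ms  last resort GC in old space requested",
    "[10840:0x7fe905014a00]  2997463 ms: Mark-sweep 1835.1 (2076.7) -> 1835.1 (2076.7) MB, 35.6 / 0.0 ms  last resort GC in old space requested",
]

# Assumed layout: [pid:isolate] <uptime> ms: <kind> <used> (<total>) -> <used> (<total>) MB
GC_PATTERN = re.compile(
    r"\[(?P<pid>\d+):0x[0-9a-f]+\]\s+(?P<uptime_ms>\d+) ms: "
    r"(?P<kind>[\w-]+) (?P<used_before>[\d.]+) \([\d.]+\) -> "
    r"(?P<used_after>[\d.]+) \([\d.]+\) MB"
)


def parse_gc_line(line):
    """Return uptime and heap figures for one GC trace line, or None if it does not match."""
    match = GC_PATTERN.search(line)
    if match is None:
        return None
    used_before = float(match.group("used_before"))
    used_after = float(match.group("used_after"))
    return {
        "uptime_min": int(match.group("uptime_ms")) / 60000,
        "kind": match.group("kind"),
        "used_before_mb": used_before,
        "used_after_mb": used_after,
        "freed_mb": used_before - used_after,
    }


records = [parse_gc_line(line) for line in GC_LINES]
for record in records:
    print(
        f"{record['uptime_min']:.1f} min  {record['kind']}: "
        f"{record['used_before_mb']} -> {record['used_after_mb']} MB "
        f"(freed {record['freed_mb']:.1f} MB)"
    )
```

A heap that stays at the same size across an "allocation failure" collection and two "last resort" collections suggests the memory is still referenced, not merely uncollected garbage. That is consistent with something in the extension host accumulating data over the run, although the log alone does not say what.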

@rchiodo
Contributor

rchiodo commented Jun 18, 2019

@simplelife2010 Do you have a possible repro case? If not, was the long-running cell outputting anything? It's hard to tell what caused the extension host to run out of memory.

We could probably fix the reattach, but we'd need more info to figure out what's causing the memory problem.

@simplelife2010
Author

simplelife2010 commented Jun 19, 2019

@rchiodo Yes, I have. It's not precisely my initial setup (which would require specific training data to reproduce), but I found a simpler setup using code from an official TensorFlow tutorial. The console output is a little different, so I will attach it again. The extension host terminates after about 10 minutes. This is the code I run on the remote Jupyter host (an AWS EC2 instance):

#%%
# Pin the TensorFlow version used for this repro.
!pip install tensorflow==1.14.0

#%%
import tensorflow as tf

# Load MNIST and scale pixel values to [0, 1].
mnist = tf.keras.datasets.mnist
(x_train, y_train), (x_test, y_test) = mnist.load_data()
x_train, x_test = x_train / 255.0, x_test / 255.0

# Small dense classifier taken from the TensorFlow tutorial.
model = tf.keras.models.Sequential([
  tf.keras.layers.Flatten(input_shape=(28, 28)),
  tf.keras.layers.Dense(512, activation=tf.nn.relu),
  tf.keras.layers.Dropout(0.2),
  tf.keras.layers.Dense(10, activation=tf.nn.softmax)
])
model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])

# 1000 epochs keep the cell running (and printing progress) long enough to hit the crash.
model.fit(x_train, y_train, epochs=1000)
model.evaluate(x_test, y_test)
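
The thread does not confirm the root cause, but if the crash is tied to the volume of progress output that the cell sends back to VS Code, one possible mitigation is to emit far fewer lines per run. The following is a minimal sketch of that idea. ThrottledEpochLogger is a hypothetical helper, not part of TensorFlow or the extension. It mirrors the on_epoch_end(epoch, logs) hook of Keras callbacks but is plain Python so that it runs without TensorFlow installed.

```python
class ThrottledEpochLogger:
    """Emit one short summary line every `every` epochs.

    Hypothetical helper for illustration only; it has the same
    on_epoch_end(epoch, logs) shape as a Keras callback hook.
    """

    def __init__(self, every=10, emit=print):
        if every < 1:
            raise ValueError("every must be >= 1")
        self.every = every
        self.emit = emit

    def on_epoch_end(self, epoch, logs=None):
        # Keras passes zero-based epoch indices, so report epoch + 1.
        if (epoch + 1) % self.every != 0:
            return
        logs = logs or {}
        metrics = " - ".join(f"{name}: {value:.4f}" for name, value in logs.items())
        self.emit(f"Epoch {epoch + 1}: {metrics}")


# Simulate 1000 epochs with made-up metric values and collect the emitted lines.
lines = []
logger = ThrottledEpochLogger(every=100, emit=lines.append)
for epoch in range(1000):
    logger.on_epoch_end(epoch, {"loss": 0.0075, "acc": 0.9980})

print(len(lines), "lines emitted for 1000 epochs")
print(lines[0])
```

To use this with the repro above, the class would need to subclass tf.keras.callbacks.Callback and be passed as model.fit(..., callbacks=[...], verbose=0). A simpler variant is model.fit(..., verbose=2), which prints one line per epoch instead of a live progress bar. Whether either change avoids the crash is untested here.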

@simplelife2010
Author

This is the IPython output:

Collecting tensorflow==1.14.0
  Downloading https://files.pythonhosted.org/packages/de/f0/96fb2e0412ae9692dbf400e5b04432885f677ad6241c088ccc5fe7724d69/tensorflow-1.14.0-cp36-cp36m-manylinux1_x86_64.whl (109.2MB)
     |████████████████████████████████| 109.2MB 426kB/s
Requirement already satisfied: keras-applications>=1.0.6 in /usr/local/lib/python3.6/dist-packages (from tensorflow==1.14.0) (1.0.7)
Requirement already satisfied: keras-preprocessing>=1.0.5 in /usr/local/lib/python3.6/dist-packages (from tensorflow==1.14.0) (1.0.9)
Requirement already satisfied: termcolor>=1.1.0 in /usr/local/lib/python3.6/dist-packages (from tensorflow==1.14.0) (1.1.0)
Collecting google-pasta>=0.1.6 (from tensorflow==1.14.0)
...

  Found existing installation: tensorflow-estimator 1.13.0
    Uninstalling tensorflow-estimator-1.13.0:
      Successfully uninstalled tensorflow-estimator-1.13.0
Successfully installed google-pasta-0.1.7 tensorboard-1.14.0 tensorflow-1.14.0 tensorflow-estimator-1.14.0rc1
WARNING: You are using pip version 19.1, however version 19.1.1 is available.
You should consider upgrading via the 'pip install --upgrade pip' command.

Downloading data from https://storage.googleapis.com/tensorflow/tf-keras-datasets/mnist.npz
11493376/11490434 [==============================] - 0s 0us/step
Epoch 1/1000
60000/60000 [==============================] - 4s 60us/sample - loss: 0.2167 - acc: 0.9354
Epoch 2/1000
60000/60000 [==============================] - 3s 57us/sample - loss: 0.0945 - acc: 0.9714
Epoch 3/1000
60000/60000 [==============================] - 3s 57us/sample - loss: 0.0683 - acc: 0.9792
Epoch 4/1000

...

Epoch 60/1000
60000/60000 [==============================] - 3s 57us/sample - loss: 0.0075 - acc: 0.9980
Epoch 61/1000
60000/60000 [==============================] - 3s 57us/sample - loss: 0.0077 - acc: 0.9980
Epoch 62/1000
60000/60000 [==============================] - 3s 57us/sample - loss: 0.0093 - acc: 0.9980
Epoch 63/1000
60000/60000 [==============================] - 3s 57us/sample - loss: 0.0064 - acc: 0.9983
Epoch 64/1000
60000/60000 [==============================] - 3s 57us/sample - loss: 0.0078 - acc: 0.9981
Epoch 65/1000
60000/60000 [==============================] - 3s 58us/sample - loss: 0.0056 - acc: 0.9985
Epoch 66/1000
20224/60000 [=========>....................] - ETA: 2s - loss: 0.0055 - acc: 0.9981
WARNING: Logging before flag parsing goes to stderr.
W0619 06:56:32.044821 140518287595328 deprecation.py:506] From /usr/local/lib/python3.6/dist-packages/tensorflow/python/ops/init_ops.py:1251: calling VarianceScaling.__init__ (from tensorflow.python.ops.init_ops) with dtype is deprecated and will be removed in a future version.
Instructions for updating:
Call initializer instance with the dtype argument instead of passing it to the constructor

@simplelife2010
Author

This is the console output:

[Extension Host] Info Python Extension: 2019-06-19 08:56:29: Wait for sys info for 9cbe973f-473d-4ae2-bceb-cac267ff7c73 0
workbench.main.js:3183 Extension host terminated unexpectedly. Code:  null  Signal:  SIGABRT
_onExtensionHostCrashed @ workbench.main.js:3183
workbench.main.js:2379 Extension host terminated unexpectedly.
onDidNotificationChange @ workbench.main.js:2379
workbench.main.js:3311 Extension Host
workbench.main.js:3311 FATAL ERROR: CALL_AND_RETRY_LAST Allocation failed - JavaScript heap out of memory
 1: node::Abort() [/Applications/Visual Studio Code.app/Contents/Frameworks/Electron Framework.framework/Versions/A/Libraries/libnode.dylib]
 2: node::FatalError(char const*, char const*) [/Applications/Visual Studio Code.app/Contents/Frameworks/Electron Framework.framework/Versions/A/Libraries/libnode.dylib]
 3: v8::internal::FatalProcessOutOfMemory(char const*) [/Applications/Visual Studio Code.app/Contents/Frameworks/Electron Framework.framework/Versions/A/Libraries/libnode.dylib]
 4: v8::internal::FatalProcessOutOfMemory(char const*) [/Applications/Visual Studio Code.app/Contents/Frameworks/Electron Framework.framework/Versions/A/Libraries/libnode.dylib]
 5: v8::internal::Factory::NewRawOneByteString(int, v8::internal::PretenureFlag) [/Applications/Visual Studio Code.app/Contents/Frameworks/Electron Framework.framework/Versions/A/Libraries/libnode.dylib]
 6: v8::internal::Isolate::random_number_generator() [/Applications/Visual Studio Code.app/Contents/Frameworks/Electron Framework.framework/Versions/A/Libraries/libnode.dylib]
 7: v8::internal::Isolate::random_number_generator() [/Applications/Visual Studio Code.app/Contents/Frameworks/Electron Framework.framework/Versions/A/Libraries/libnode.dylib]
 8: v8::internal::Isolate::random_number_generator() [/Applications/Visual Studio Code.app/Contents/Frameworks/Electron Framework.framework/Versions/A/Libraries/libnode.dylib]
 9: v8::internal::Isolate::random_number_generator() [/Applications/Visual Studio Code.app/Contents/Frameworks/Electron Framework.framework/Versions/A/Libraries/libnode.dylib]
10: v8::internal::Isolate::random_number_generator() [/Applications/Visual Studio Code.app/Contents/Frameworks/Electron Framework.framework/Versions/A/Libraries/libnode.dylib]
11: v8::internal::Isolate::random_number_generator() [/Applications/Visual Studio Code.app/Contents/Frameworks/Electron Framework.framework/Versions/A/Libraries/libnode.dylib]
12: v8::internal::Isolate::random_number_generator() [/Applications/Visual Studio Code.app/Contents/Frameworks/Electron Framework.framework/Versions/A/Libraries/libnode.dylib]
13: v8::internal::Isolate::random_number_generator() [/Applications/Visual Studio Code.app/Contents/Frameworks/Electron Framework.framework/Versions/A/Libraries/libnode.dylib]
14: v8::internal::Isolate::random_number_generator() [/Applications/Visual Studio Code.app/Contents/Frameworks/Electron Framework.framework/Versions/A/Libraries/libnode.dylib]
15: v8::internal::Isolate::random_number_generator() [/Applications/Visual Studio Code.app/Contents/Frameworks/Electron Framework.framework/Versions/A/Libraries/libnode.dylib]
16: v8::internal::Isolate::random_number_generator() [/Applications/Visual Studio Code.app/Contents/Frameworks/Electron Framework.framework/Versions/A/Libraries/libnode.dylib]
17: v8::internal::Isolate::random_number_generator() [/Applications/Visual Studio Code.app/Contents/Frameworks/Electron Framework.framework/Versions/A/Libraries/libnode.dylib]
18: v8::internal::Isolate::random_number_generator() [/Applications/Visual Studio Code.app/Contents/Frameworks/Electron Framework.framework/Versions/A/Libraries/libnode.dylib]
19: v8::internal::Isolate::random_number_generator() [/Applications/Visual Studio Code.app/Contents/Frameworks/Electron Framework.framework/Versions/A/Libraries/libnode.dylib]
20: v8::internal::Isolate::random_number_generator() [/Applications/Visual Studio Code.app/Contents/Frameworks/Electron Framework.framework/Versions/A/Libraries/libnode.dylib]
21: v8::JSON::Parse(v8::Isolate*, v8::Local<v8::String>) [/Applications/Visual Studio Code.app/Contents/Frameworks/Electron Framework.framework/Versions/A/Libraries/libnode.dylib]
22: v8::internal::operator<<(std::__1::basic_ostream<char, std::__1::char_traits<char> >&, v8::internal::Runtime::FunctionId) [/Applications/Visual Studio Code.app/Contents/Frameworks/Electron Framework.framework/Versions/A/Libraries/libnode.dylib]
23: 0x165a2ff8525d

<--- Last few GCs --->

[16461:0x7fd676012e00]   321680 ms: Mark-sweep 1801.6 (2080.6) -> 1801.6 (2080.6) MB, 33.4 / 0.0 ms  allocation failure GC in old space requested
[16461:0x7fd676012e00]   321721 ms: Mark-sweep 1801.6 (2080.6) -> 1801.6 (2079.6) MB, 41.5 / 0.0 ms  last resort GC in old space requested
[16461:0x7fd676012e00]   321756 ms: Mark-sweep 1801.6 (2079.6) -> 1801.6 (2079.6) MB, 34.9 / 0.0 ms  last resort GC in old space requested


<--- JS stacktrace --->

==== JS stack trace =========================================

Security context: 0x1f7e5125709 <JSObject>
    0: builtin exit frame: parse(this=0x1f7e5119ce9 <Object map = 0x1f788604e11>,0x1f76804b0d9 <Very long string[1148114]>,0x1f7e5119ce9 <Object map = 0x1f788604e11>)

    1: postObservableNext [/Users/brunovetter/.vscode/extensions/ms-python.python-2019.5.18875/out/client/extension.js:~83] [pc=0x165a30264157](this=0x1f7a8d18619 <JSObject>,e=0x1f7d44fe779 <String[591]\: #%\nimport tensorflow as ...


6[Violation] Added non-passive event listener to a scroll-blocking <some> event. Consider marking event handler as 'passive' to make the page more responsive. See <URL>

@rchiodo
Contributor

rchiodo commented Jun 20, 2019

Thanks, that helps a lot.

@greazer changed the title from "Remote Jupyter connection with long running command: Extension host terminated" to "Remote Jupyter connection with long running command: Extension host terminated due to memory" on Jun 21, 2019
@IanMatthewHuff self-assigned this on Jul 10, 2019
@IanMatthewHuff
Member

Hey @simplelife2010. I had actually just fixed an issue that looks like it might be similar to yours. The other issue is here:
https://github.com/microsoft/vscode-python/issues/6001

In my last post in that thread I provided a link to our development build, which should have the fix at this point if you want to try it out.

@rchiodo assigned rchiodo and unassigned IanMatthewHuff on Jul 11, 2019
@rchiodo
Contributor

rchiodo commented Jul 11, 2019

This error might be different than the other bug. I'm getting a perf issue.

@rchiodo closed this as completed on Aug 6, 2019
@lock (bot) locked this as resolved and limited conversation to collaborators on Aug 13, 2019
@microsoft unlocked this conversation on Nov 14, 2020
@DonJayamanne transferred this issue from microsoft/vscode-python on Nov 14, 2020
@MariusMeiners

We have this exact same issue - seems not to be solved as of today...

@github-actions (bot) locked this as resolved and limited conversation to collaborators on May 5, 2021