This repository has been archived by the owner on Aug 2, 2023. It is now read-only.

when I debug, I meet: Runtime Error: Already started in multiprocess #1761

Closed
civilman628 opened this issue Sep 11, 2019 · 22 comments

Comments

@civilman628

civilman628 commented Sep 11, 2019

Environment data

  • PTVSD version: 4.3.2
  • OS and version: Ubuntu 18
  • Python version: 3.6.8 (I am using virtual env )
  • Using VS code 1.38

same as #1443

I have already added these two lines to my code:

import multiprocessing
multiprocessing.set_start_method('spawn', True)

Error message:

E00007.569: Exception escaped from start_client
            
            Traceback (most recent call last):
              File "/home/mingming/.vscode/extensions/ms-python.python-2019.9.34911/pythonFiles/lib/python/ptvsd/log.py", line 110, in g
                return f(*args, **kwargs)
              File "/home/mingming/.vscode/extensions/ms-python.python-2019.9.34911/pythonFiles/lib/python/ptvsd/pydevd_hooks.py", line 74, in start_client
                sock, start_session = daemon.start_client((host, port))
              File "/home/mingming/.vscode/extensions/ms-python.python-2019.9.34911/pythonFiles/lib/python/ptvsd/daemon.py", line 214, in start_client
                with self.started():
              File "/usr/lib/python3.6/contextlib.py", line 81, in __enter__
                return next(self.gen)
              File "/home/mingming/.vscode/extensions/ms-python.python-2019.9.34911/pythonFiles/lib/python/ptvsd/daemon.py", line 110, in started
                self.start()
              File "/home/mingming/.vscode/extensions/ms-python.python-2019.9.34911/pythonFiles/lib/python/ptvsd/daemon.py", line 145, in start
                raise RuntimeError('already started')
            RuntimeError: already started
            

Traceback (most recent call last):

my launch.json:

 {
    // Use IntelliSense to learn about possible attributes.
    // Hover to view descriptions of existing attributes.
    // For more information, visit: https://go.microsoft.com/fwlink/?linkid=830387
    "version": "0.2.0",
    "configurations": [
        {
            "name": "Python: Current File (Integrated Terminal)",
            "type": "python",
            "request": "launch",
            "program": "${file}",
            "console": "integratedTerminal"
        },
        {
            "name": "Python: Remote Attach",
            "type": "python",
            "request": "attach",
            "port": 5678,
            "host": "localhost",
            "pathMappings": [
                {
                    "localRoot": "${workspaceFolder}",
                    "remoteRoot": "."
                }
            ]
        },
        {
            "name": "Python: Module",
            "type": "python",
            "request": "launch",
            "module": "enter-your-module-name-here",
            "console": "integratedTerminal"
        },
        {
            "name": "Python: Django",
            "type": "python",
            "request": "launch",
            "program": "${workspaceFolder}/manage.py",
            "console": "integratedTerminal",
            "args": [
                "runserver",
                "--noreload",
                "--nothreading"
            ],
            "django": true
        },
        {
            "name": "Python: Flask",
            "type": "python",
            "request": "launch",
            "module": "flask",
            "env": {
                "FLASK_APP": "app.py"
            },
            "args": [
                "run",
                "--no-debugger",
                "--no-reload"
            ],
            "jinja": true
        },
        {
            "name": "Python: Current File (External Terminal)",
            "type": "python",
            "request": "launch",
            "program": "${file}",
            "console": "externalTerminal"
        }
    ]
}

settings.json

{
    "python.pythonPath": "venv/bin/python"
}

Do I need to modify launch.json? When I debug, I can see that my Python venv is in effect.

@civilman628 civilman628 changed the title when I debug I meet: Runtime Error: Already started in multiprocess when I debug, I meet: Runtime Error: Already started in multiprocess Sep 11, 2019
@karthiknadig
Member

@civilman628 Which config from the launch.json are you using? I don't see "subProcess": true in any of the debug configurations.

@civilman628
Author

@karthiknadig That is my question as well: how do I need to change launch.json in order to use a Python virtual environment with multiprocessing? Currently, only settings.json has my Python path: "python.pythonPath": "venv/bin/python"

@karthiknadig
Member

karthiknadig commented Sep 11, 2019

Ah, I see. For whichever configuration you are using, add the "subProcess": true setting. This will enable sub-process debugging in the Python extension.

        {
            "name": "Python: Current File (Integrated Terminal)",
            "type": "python",
            "request": "launch",
            "program": "${file}",
            "console": "integratedTerminal",
            "subProcess": true
        },

As long as the venv path is set in the settings, this should be good.

In your code, where you add these lines matters. Can you share your code? Are you using ptvsd.enable_attach() in your code?

import multiprocessing
multiprocessing.set_start_method('spawn', True)

@civilman628
Author

I do not use ptvsd.enable_attach()

I can't paste all my code here, but here is the top part of the Python file that I need to run. I am not sure it is enough.

import json
import logging
import os
import uuid
from collections import Counter
from json import JSONDecodeError
from time import time, sleep

import backoff as backoff
import boto3
from core_api_sdk.api import myCoreApi
from core_api_sdk.ds import PredictionRecord, PredictorConstants, PredictionRecordType
from core_api_sdk.events_fetcher import EventsFetcher
from core_api_sdk.real_time_users import RealTimePredictionUsersFetcher
from core_api_sdk.time import get_now_as_milliseconds
from flask import jsonify

from my_ai import settings
from my_ai.ds import MIEvent
from my_ai.log import basic_logging, log_ascii_signature, log_title
from my_ai.metrics import DogStatsdMetrics
from my_ai.predict import get_predictor_for_model_instance
from my_ai.train.params import ModelInstance
from my_ai.utils import mem

There is an __init__.py file in the same folder as the file above that contains the lines below:

import time

import numpy as np
from core_api_sdk.ds import PredictorConstants

from my_ai.fv.base import DefaultLabeledPointGenerator
from my_ai.fv.uac import UserContextValueAsItemLPG
from my_ai.fv.tools.session import cut_session_to_event_horizon
from my_ai.metrics import Metrics

In which file, and where in that file, should I add the two lines below?

import multiprocessing
multiprocessing.set_start_method('spawn', True)

@karthiknadig
Member

I am assuming you are using the following configuration to launch:

        {
            "name": "Python: Current File (Integrated Terminal)",
            "type": "python",
            "request": "launch",
            "program": "${file}",
            "console": "integratedTerminal",
            "subProcess": true
        },

It should go before anything else gets imported. Try adding it before import json.
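
As a minimal sketch of that placement (the json import only stands in for the rest of the script's imports, and the second positional argument of set_start_method is force):

```python
# These two lines come first, before any other import, so that no library
# can create processes or pick a start method before 'spawn' is selected.
import multiprocessing
multiprocessing.set_start_method('spawn', True)  # True is force=True

import json  # the script's usual imports follow from here

# Quick sanity check that the setting took effect.
print(multiprocessing.get_start_method())
```

This only affects code that goes through the multiprocessing module. It does not change what a direct os.fork call does.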

@civilman628
Author

I added "subProcess": true to the JSON file and put these two lines at the very top of the code, but the error is the same.

@civilman628
Author

civilman628 commented Sep 11, 2019

I am running my Python code in an Ubuntu VM hosted on Windows 10. Does this setup affect multiprocessing?

@karthiknadig
Member

No, it should not. What is really happening is that some library somewhere is forking, either by calling os.fork or via native code, or the multiprocessing.set_start_method('spawn', True) call is not happening early enough.

@fabioz Any ideas on how we can get the stack from where the fork is being called? It looks like os star-imports a bunch of functions, including fork, from posix, and then deletes posix. posix appears to be native, so I don't see a way to set a breakpoint there.

@karthiknadig
Member

Note that if os.fork is being called directly, or the fork happens through some native API, then the spawn workaround will not work. We are working on a fix so that the spawn workaround will not be needed.

@civilman628
Author

I just checked: my code does not call os.fork, so the fork must be coming from a native API.

@fabioz
Contributor

fabioz commented Sep 12, 2019

@karthiknadig @civilman628

You can try something like this (at the start of your code):

import os
def dummy_fork(*args, **kwargs):
    raise AssertionError('here')
os.fork = dummy_fork

Then do a regular (non-debug) run and see if that's hit. It's probable that in this case Python multiprocessing is not being used, in which case the usual workaround won't work.

The workaround in this case would be to start without debugging and do a remote attach directly in the subprocess (we're currently working on adding os.fork support, but it's a big task and I'm not sure exactly when it'll be finished). See https://code.visualstudio.com/docs/python/debugging#_remote-debugging for instructions on remote attach.
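
A rough sketch of what the attach in the subprocess could look like, assuming ptvsd 4.x is installed in the venv. ptvsd.enable_attach and ptvsd.wait_for_attach are the ptvsd calls; the helper names and the ATTACH_DEBUGGER environment variable are invented for this example, and port 5678 is taken from the "Python: Remote Attach" configuration earlier in the thread.

```python
import os

ATTACH_PORT = 5678  # same port as the "Python: Remote Attach" launch.json entry


def debug_requested(environ=None):
    """Return True when the (hypothetical) ATTACH_DEBUGGER variable is set to 1."""
    if environ is None:
        environ = os.environ
    return environ.get("ATTACH_DEBUGGER") == "1"


def attach_in_subprocess(port=ATTACH_PORT):
    """Call this at the start of the code that runs in the subprocess."""
    if not debug_requested():
        return
    import ptvsd  # imported lazily so normal runs do not need ptvsd
    ptvsd.enable_attach(address=("localhost", port))
    ptvsd.wait_for_attach()  # block until VS Code attaches
```

The program would then be started without debugging, with ATTACH_DEBUGGER=1 set, and the "Python: Remote Attach" configuration used once the subprocess is waiting. With several subprocesses, each one would need its own port, since only one process can listen on a given port.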

@civilman628
Author

civilman628 commented Sep 12, 2019

My Python code runs without any issue in a terminal outside VS Code, and there I do not need the code above or those two lines. I only get errors when debugging in VS Code.

@civilman628
Author

Is there a rough estimate of when this issue will be fixed?

@naefl

naefl commented Oct 14, 2019

Same issue here. Prior to the October update, this issue would only occur in cell debugging for interactive Python sessions. Now it happens with normal debugging as well.

I'm using sklearn

              File "/root/.vscode-server-insiders/extensions/ms-python.python-2019.10.41019/pythonFiles/lib/python/old_ptvsd/ptvsd/daemon.py", line 110, in started
                self.start()
              File "/root/.vscode-server-insiders/extensions/ms-python.python-2019.10.41019/pythonFiles/lib/python/old_ptvsd/ptvsd/daemon.py", line 145, in start
                raise RuntimeError('already started')
            RuntimeError: already started
            

[Parallel(n_jobs=-1)]: Using backend LokyBackend with 16 concurrent workers.
Traceback (most recent call last):

Here's my launch.json:

{
    // Use IntelliSense to learn about possible attributes.
    // Hover to view descriptions of existing attributes.
    // For more information, visit: https://go.microsoft.com/fwlink/?linkid=830387
    "version": "0.2.0",
    "configurations": [
        {
            "name": "Python: Current File (Integrated Terminal)",
            "type": "python",
            "request": "launch",
            "program": "${file}",
            "console": "integratedTerminal",
            "subProcess": true
        }
    ]
}

@int19h
Contributor

int19h commented Oct 24, 2019

@naefl, can you tell us more about the code that you're hitting this issue on? In particular, are you using os.fork, the subprocess module, the multiprocessing module, or any standard or third-party module that might be wrapping one of those?

@naefl

naefl commented Oct 24, 2019

Hi @int19h, thanks for looking into this - as said, using sklearn which utilizes multiprocessing.

I hit the error when running any code that involves sklearn.

@int19h
Contributor

int19h commented Oct 24, 2019

My apologies, I didn't notice that part.

Did you get a chance to try multiprocessing.set_start_method() as described in the earlier comments? It's not a given that this would help here - I'm not sure whether sklearn uses that, or rolls its own parallelization - but it's worth a try.

Broadly speaking, these fork-related issues will be resolved once we complete #1706, which is in progress right now. Until then, we can only try to find workarounds for any particular library or scenario.

@naefl

naefl commented Oct 24, 2019

I'll give it a try, thanks! Any ideas on order of magnitude of timeline, weeks, months?

I think it's worth highlighting that sklearn is probably the most used machine learning library, and with lots of VS Code's changes targeted at data scientists, I assume that a lot of users will run into this problem and give up before they end up reporting their issue here.

Either way, thanks a lot for all the awesome work you're doing.

@naefl

naefl commented Oct 25, 2019

Same error with

import multiprocessing
multiprocessing.set_start_method('spawn', True)

  File "/root/.vscode-server-insiders/extensions/ms-python.python-2019.10.44104/pythonFiles/lib/python/old_ptvsd/ptvsd/multiproc.py", line 182, in notify_root
    conn.connect(('localhost', options.subprocess_notify))
ConnectionRefusedError: [Errno 111] Connection refused
[1]    2400 terminated  env PYTHONIOENCODING=UTF-8 PYTHONUNBUFFERED=1 /opt/conda/bin/python  --default

@int19h
Contributor

int19h commented Oct 31, 2019

#1878 is the new multiproc implementation that should fix this issue.

@int19h int19h closed this as completed Oct 31, 2019
@civilman628
Author

@int19h Which version will include this fix?

@int19h
Contributor

int19h commented Oct 31, 2019

@civilman628 The next alpha of ptvsd 5.0 will have it.
