-
Notifications
You must be signed in to change notification settings - Fork 6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Debugger] Adding debupgy as the ray debugger #42311
[Debugger] Adding debupgy as the ray debugger #42311
Conversation
The files do matches the diff based on my review, please verify end to end working before merging. |
I manually tested that the task can be paused and it awaits connection from debugger client. Tested tasks running on worker nodes and head node. I also tested post-mortem debugging is working Manually verified it's e2e working |
python/setup.py
Outdated
@@ -342,6 +342,7 @@ def get_packages(self): | |||
"aiosignal", | |||
"frozenlist", | |||
"requests", | |||
"debugpy", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We probably shouldn't install this by default; can we instead do a soft-import and raise a warning/exception if this is not installed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sure! will do
python/ray/util/debugpy.py
Outdated
log.error( | ||
f"Module '{module}' cannot be loaded. " | ||
f"Ray Debugger will not work without '{module}'. " | ||
f"Install this module using 'pip install {module}' " | ||
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I tried throwing an exception when import error occurred, but it doesn't work well. The exception will be captured by Ray and drop the program into post-mortem mode, which will try to import debugpy again and throw another import error again... And end up with multiple levels of exception all saying there is an import error. So I choose to just log the error message and skip debugger logic if the module is not available.
python/ray/util/debugpy.py
Outdated
def _try_import(module): | ||
try: | ||
return importlib.import_module(module) | ||
except ModuleNotFoundError: | ||
log.error( | ||
f"Module '{module}' cannot be loaded. " | ||
f"Ray Debugger will not work without '{module}'. " | ||
f"Install this module using 'pip install {module}' " |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
make this specific, look for debugpy >= 1.8.0, and then raise warning if not found (and suggest to user to install debugpy==1.8.0)
python/ray/util/debugpy.py
Outdated
except ModuleNotFoundError: | ||
log.error( | ||
f"Module '{module}' cannot be loaded. " | ||
f"Ray Debugger will not work without '{module}'. " |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ray Debugpy Debugger
- enabled debugpy as the ray debugger for breakpoint and post_mortem debugging - added flag RAY_DEBUG=1 to enable debugpy. If RAY_DEBUG is not set and RAY_PDB is set, then rpdb will be used. - used state api to save worker debugging port.
- enabled debugpy as the ray debugger for breakpoint and post_mortem debugging - added flag RAY_DEBUG=1 to enable debugpy. If RAY_DEBUG is not set and RAY_PDB is set, then rpdb will be used. - used state api to save worker debugging port.
- enabled debugpy as the ray debugger for breakpoint and post_mortem debugging - added flag RAY_DEBUG=1 to enable debugpy. If RAY_DEBUG is not set and RAY_PDB is set, then rpdb will be used. - used state api to save worker debugging port.
The RAY_DEBUG environment variable name is so confusing that users will try to use the |
Why are these changes needed?
Checks
git commit -s
) in this PR.scripts/format.sh
to lint the changes in this PR.method in Tune, I've added it in
doc/source/tune/api/
under thecorresponding
.rst
file.I manually tested that the task can be paused and it awaits connection from debugger client. Tested tasks running on worker nodes and head node. I also tested post-mortem debugging is working.