Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

debug: provide a way to intercept debug session stop for debuggee to shutdown gracefully #120

Open
ramya-rao-a opened this issue May 28, 2020 · 23 comments
Labels
Debug Issues related to the debugging functionality of the extension. NeedsDecision Feedback is required from experts, contributors, and/or the community before a change can be made.

Comments

@ramya-rao-a
Copy link
Contributor

As discussed in microsoft/vscode-go#2387, there is a lack of support for graceful shutdown when debugging. This issue is to investigate what can be done to improve this experience.

@stamblerre stamblerre changed the title Lack of support for graceful shutdown when debugging debug: shutdown gracefully Jun 3, 2020
@stamblerre stamblerre added Debug Issues related to the debugging functionality of the extension. NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. labels Jun 3, 2020
@hyangah
Copy link
Contributor

hyangah commented Sep 21, 2020

This year, this extension switched to use the tree-kill's kill

const forceCleanup = async () => {

export function killProcessTree(
p: ChildProcess,
logger?: (...args: any[]) => void): Promise<void> {
if (!logger) {
logger = console.log;
}
if (!p || !p.pid || p.exitCode !== null) {
return Promise.resolve();
}
return new Promise((resolve) => {
kill(p.pid, (err) => {
if (err) {
logger(`Error killing process ${p.pid}: ${err}`);
}
resolve();
});
});
}

According to tree-kill's doc, it uses SIGTERM by default.

And as discussed in #601
Currently the shutdown process first sends the halt msg and then triggers the showdown process after a certain timeout.

const timeoutToken: NodeJS.Timer =
isLocalDebugging &&
setTimeout(async () => {
log('Killing debug process manually as we could not halt delve in time');
await forceCleanup();
resolve();
}, 1000);
let haltErrMsg: string;
try {
await this.callPromise('Command', [{ name: 'halt' }]);
} catch (err) {
log('HaltResponse');
haltErrMsg = err ? err.toString() : '';
log(`Failed to halt - ${haltErrMsg}`);
}
clearTimeout(timeoutToken);

I wonder if the current timeout 1000ms is sufficient yet.
Is there any other aspect to be investigated?

@hyangah
Copy link
Contributor

hyangah commented Sep 28, 2020

@suzmue for reproduce the issue. (for both new/old adapters)

@suzmue
Copy link
Contributor

suzmue commented Oct 5, 2020

I am able to reproduce this in the old adapter with the same repro case from microsoft/vscode-go#2387 on darwin:

main.go:

package main

import (
	"fmt"
	"os"
	"os/signal"
	"syscall"
)

func main() {
	sigs := make(chan os.Signal, 1)
	signal.Notify(sigs, syscall.SIGINT, syscall.SIGTERM)
	select {
	case sig := <-sigs:
		fmt.Printf("Shutdown %v\n", sig)
	}
}

launch.json:

{
    "version": "0.2.0",
    "configurations": [
      {
        "name": "Launch Package",
        "type": "go",
        "request": "launch",
        "program": "${workspaceFolder}",
        "trace": "verbose",
      },
    ]
  }

Log output:

2020-10-5, 18:45:45.777 UTC
[18:45:45.777 UTC] From client: initialize({"clientID":"vscode","clientName":"Visual Studio Code","adapterID":"go","pathFormat":"path","linesStartAt1":true,"columnsStartAt1":true,"supportsVariableType":true,"supportsVariablePaging":true,"supportsRunInTerminalRequest":true,"locale":"en-us","supportsProgressReporting":true})
[18:45:45.778 UTC] InitializeRequest
[18:45:45.778 UTC] To client: {"seq":0,"type":"response","request_seq":1,"command":"initialize","success":true,"body":{"supportsConfigurationDoneRequest":true,"supportsSetVariable":true}}
[18:45:45.778 UTC] InitializeResponse
[18:45:45.778 UTC] From client: launch({"name":"Launch Package","type":"go","request":"launch","program":"/Users/suzmue/simpleproj","trace":"verbose","debugServer":4711,"packagePathToGoModPathMap":{"/Users/suzmue/simpleproj":"/Users/suzmue/simpleproj",".":"","/Users/suzmue/simpleproj/.vscode":"/Users/suzmue/simpleproj"},"apiVersion":2,"dlvLoadConfig":{"followPointers":true,"maxVariableRecurse":1,"maxStringLen":64,"maxArrayValues":64,"maxStructFields":-1},"showGlobalVariables":false,"dlvToolPath":"/Users/suzmue/go/bin/dlv","env":{"ELECTRON_RUN_AS_NODE":"1","USER":"suzmue","PATH":"/Users/suzmue/sdk/go1.15.2/bin:/usr/local/git/current/bin:/usr/local/bin:/usr/bin:/bin:/usr/local/sbin:/usr/sbin:/sbin:/usr/local/go/bin:/Users/suzmue/go/bin","LOGNAME":"suzmue","SSH_AUTH_SOCK":"/private/tmp/com.apple.launchd.Ku9W1KLlQA/Listeners","HOME":"/Users/suzmue","SHELL":"/bin/bash","__CF_USER_TEXT_ENCODING":"0x74A6A:0x0:0x0","TMPDIR":"/var/folders/2x/hlwbwjzd1vd4brlfs80676fw00fkl_/T/","XPC_SERVICE_NAME":"com.microsoft.VSCode.4936","XPC_FLAGS":"0x0","ORIGINAL_XDG_CURRENT_DESKTOP":"undefined","VSCODE_NLS_CONFIG":"{\"locale\":\"en-us\",\"availableLanguages\":{},\"_languagePackSupport\":true}","VSCODE_NODE_CACHED_DATA_DIR":"/Users/suzmue/Library/Application Support/Code/CachedData/58bb7b2331731bf72587010e943852e13e6fd3cf","VSCODE_LOGS":"/Users/suzmue/Library/Application Support/Code/logs/20201002T125751","VSCODE_IPC_HOOK":"/Users/suzmue/Library/Application Support/Code/1.49.1-main.sock","VSCODE_PID":"39138","SK_SIGNING_PLUGIN":"gnubbyagent","PWD":"/","SHLVL":"1","GITHUB_TOKEN":"ddbc999f4ed9821cac55486819352b1df81b89e4","GOPATH":"/Users/suzmue/go","_":"/Applications/Visual Studio Code.app/Contents/MacOS/Electron","AMD_ENTRYPOINT":"vs/workbench/services/extensions/node/extensionHostProcess","PIPE_LOGGING":"true","VERBOSE_LOGGING":"true","VSCODE_IPC_HOOK_EXTHOST":"/var/folders/2x/hlwbwjzd1vd4brlfs80676fw00fkl_/T/vscode-ipc-80a39026-6819-4464-b434-b8425a011558.sock","VSCODE_HANDLES_UNCAUGHT_ERRORS":"true","VSCODE_LOG_STACK":"true","APPLICATION_INSIGHTS_NO_DIAGNOSTIC_CHANNEL":"true","GOPROXY":"https://proxy.golang.org,direct","GOMODCACHE":"/Users/suzmue/go/pkg/mod"},"__sessionId":"50cf2401-fcca-4d1b-99b3-912192d27d28"})
[18:45:45.778 UTC] LaunchRequest
[18:45:45.788 UTC] Using GOPATH: /Users/suzmue/go
[18:45:45.789 UTC] Using GOROOT: /Users/suzmue/sdk/go1.15.2
[18:45:45.789 UTC] Using PATH: /Users/suzmue/sdk/go1.15.2/bin:/usr/local/git/current/bin:/usr/local/bin:/usr/bin:/bin:/usr/local/sbin:/usr/sbin:/sbin:/usr/local/go/bin:/Users/suzmue/go/bin
[18:45:45.790 UTC] Current working directory: /Users/suzmue/simpleproj
[18:45:45.790 UTC] Running: /Users/suzmue/go/bin/dlv debug --headless=true --listen=127.0.0.1:2252 --api-version=2
[18:45:46.374 UTC] To client: {"seq":0,"type":"event","event":"output","body":{"category":"stdout","output":"API server listening at: 127.0.0.1:2252\n"}}
[18:45:46.578 UTC] To client: {"seq":0,"type":"event","event":"initialized"}
[18:45:46.578 UTC] InitializeEvent
[18:45:46.578 UTC] To client: {"seq":0,"type":"response","request_seq":2,"command":"launch","success":true}
[18:45:46.581 UTC] From client: configurationDone(undefined)
[18:45:46.581 UTC] ConfigurationDoneRequest
[18:45:46.801 UTC] Changing DebugState from Halted to Running
[18:45:46.801 UTC] To client: {"seq":0,"type":"response","request_seq":3,"command":"configurationDone","success":true}
[18:45:46.801 UTC] ConfigurationDoneResponse {"seq":27,"type":"response","request_seq":3,"command":"configurationDone","success":true}
[18:45:46.804 UTC] From client: threads(undefined)
[18:45:46.804 UTC] To client: {"seq":0,"type":"response","request_seq":4,"command":"threads","success":true,"body":{"threads":[{"id":1,"name":"Dummy"}]}}

\\ PRESS STOP BUTTON

[18:45:56.536 UTC] From client: disconnect({"restart":false})
[18:45:56.537 UTC] DisconnectRequest
[18:45:56.537 UTC] HaltRequest
[18:45:56.543 UTC] continue state {"Running":false,"Recording":false,"currentThread":{"id":10228455,"pc":140735123519618,"file":"","line":0,"goroutineID":0,"ReturnValues":null},"Threads":[{"id":10228497,"pc":140735123519618,"file":"","line":0,"goroutineID":0,"ReturnValues":null},{"id":10228498,"pc":140735123519618,"file":"","line":0,"goroutineID":0,"ReturnValues":null},{"id":10228499,"pc":140735123519618,"file":"","line":0,"goroutineID":0,"ReturnValues":null},{"id":10228500,"pc":140735123519618,"file":"","line":0,"goroutineID":0,"ReturnValues":null},{"id":10228455,"pc":140735123519618,"file":"","line":0,"goroutineID":0,"ReturnValues":null},{"id":10228495,"pc":140735123519618,"file":"","line":0,"goroutineID":0,"ReturnValues":null},{"id":10228496,"pc":140735123511326,"file":"","line":0,"goroutineID":0,"ReturnValues":null}],"NextInProgress":false,"exited":false,"exitStatus":0,"When":""}
[18:45:56.544 UTC] DetachRequest
[18:45:56.557 UTC] DisconnectRequest to parent
[18:45:56.557 UTC] To client: {"seq":0,"type":"response","request_seq":5,"command":"disconnect","success":true}
[18:45:56.557 UTC] DisconnectResponse
[18:45:56.560 UTC] [Error] Process exiting with code: 0
[18:45:56.562 UTC] Sending TerminatedEvent as delve is closed
[18:45:56.562 UTC] To client: {"seq":0,"type":"event","event":"terminated"}

@jimbobmcgee
Copy link

I posited the idea of a dedicated button for sending specific signals over at microsoft/vscode#110203, so there might be a standardised distinction between stopping debugging and sending SIGINT/SIGTERM, but I might not be explaining myself well-enough over there (and I'm not qualified enough in VSCode internals to validate the advice I am given there).

If VSCode offered a different event to indicate an arbitrary signal, and tied the sending of that event to the pressing of a new button on the debug toolbar, could the Go debug provider use that new event to (posix) send SIGINT/SIGTERM, or (win32) invoke GenerateConsoleCtrlEvent and, if so, would that be an reasonable solution to the problem?

If so, perhaps if that request could gain some traction?

@lucky-wolf
Copy link

lucky-wolf commented Dec 5, 2020

FYI -- I added a capture all signals listener to my app - and from VSCode - running through debugger - delve clearly never sends any signals to its debugee ever. Never. Of any kind. There is no solution to this but for that software to be improved or replaced.


I'm am very confused reading all of this...

Fundamentally, my goal is to respond to the shutdown request - and clean up gracefully - in a golang executable on AMD 64 / Linux hosts.

I am getting SIGINT when run from console - but I cannot find any signal that gets sent when I try to "stop" a debug session launched from VSCode (see below).

I've taken a few stabs at this - but in the end, my process exists immediately before I ever have a chance to respond to this event (sigterm).

Not sure how to improve on this situation?


code:

var signals = []os.Signal{
	syscall.SIGINT,
	syscall.SIGQUIT,
	syscall.SIGABRT,
	syscall.SIGKILL,
	syscall.SIGTERM,
	syscall.SIGSTOP,
}
	osNotify = make(chan os.Signal, 1)
	signal.Notify(osNotify, signals...)
	sig := <-osNotify
	// we never reach here if this is launched from VSCode using debug - and then using the stop button...
	signal.Stop(osNotify)

Version: 1.52.0-insider
Commit: 6026ab576dc6a4fbb8e255241a364816f42464c5
Date: 2020-11-23T05:43:29.805Z
Electron: 9.3.3
Chrome: 83.0.4103.122
Node.js: 12.14.1
V8: 8.3.110.13-electron.0
OS: Linux x64 5.4.0-56-generic snap

@gopherbot gopherbot added this to the Untriaged milestone Apr 8, 2021
@hyangah hyangah modified the milestones: Untriaged, Backlog Apr 14, 2021
@gopherbot
Copy link
Collaborator

Change https://golang.org/cl/315151 mentions this issue: src/goDebugFactory: send SIGINT to delve and avoid treekill

gopherbot pushed a commit that referenced this issue Apr 30, 2021
dlv handles SIGINT better than SIGTERM.

And, stop using treekill. Treekill sends signal to dlv and its children
(debugee and their children, and debugserver). It doesn't look like
the debugserver on the mac doesn't seem to finish cleaning up the
target before exiting as a response of SIGINT. The target becomes a
child of the launchd and remains a zombie. Even when debugserver
could handle this gracefully, sending signals to all the children
processes may interfere with debug target's attempt to gracefully
terminate and doesn't seem like a good idea. Trust dlv's termination
handling logic instead.

Testing this beyond manual testing is hard. Instead, just check
whether the temporary binary is removed that is an indirect sign
that delve responded to the signal.

Changed DelveDAPOutputAdapter's dispose to take an optional timeout
parameter - to use it in testing.
And this CL fixes some places in goDebugFactory where null logger object
can cause to crash.

For go-delve/delve#2445
Updates #120

Change-Id: I72de903b27b12ed51079d1c21be89c41d8189d8b
Reviewed-on: https://go-review.googlesource.com/c/vscode-go/+/315151
Trust: Hyang-Ah Hana Kim <hyangah@gmail.com>
Trust: Suzy Mueller <suzmue@golang.org>
Run-TryBot: Hyang-Ah Hana Kim <hyangah@gmail.com>
TryBot-Result: kokoro <noreply+kokoro@google.com>
Reviewed-by: Suzy Mueller <suzmue@golang.org>
lggomez pushed a commit to lggomez/vscode-go-1 that referenced this issue May 6, 2021
dlv handles SIGINT better than SIGTERM.

And, stop using treekill. Treekill sends signal to dlv and its children
(debugee and their children, and debugserver). It doesn't look like
the debugserver on the mac doesn't seem to finish cleaning up the
target before exiting as a response of SIGINT. The target becomes a
child of the launchd and remains a zombie. Even when debugserver
could handle this gracefully, sending signals to all the children
processes may interfere with debug target's attempt to gracefully
terminate and doesn't seem like a good idea. Trust dlv's termination
handling logic instead.

Testing this beyond manual testing is hard. Instead, just check
whether the temporary binary is removed that is an indirect sign
that delve responded to the signal.

Changed DelveDAPOutputAdapter's dispose to take an optional timeout
parameter - to use it in testing.
And this CL fixes some places in goDebugFactory where null logger object
can cause to crash.

For go-delve/delve#2445
Updates golang#120

Change-Id: I72de903b27b12ed51079d1c21be89c41d8189d8b
Reviewed-on: https://go-review.googlesource.com/c/vscode-go/+/315151
Trust: Hyang-Ah Hana Kim <hyangah@gmail.com>
Trust: Suzy Mueller <suzmue@golang.org>
Run-TryBot: Hyang-Ah Hana Kim <hyangah@gmail.com>
TryBot-Result: kokoro <noreply+kokoro@google.com>
Reviewed-by: Suzy Mueller <suzmue@golang.org>
@hyangah
Copy link
Contributor

hyangah commented Apr 4, 2022

As #120 (comment) and microsoft/vscode#110203 (comment) described, it would be nice if the delve dap implements the Terminate request

  1. User clicks the stop button
  2. VS Code sends Terminate request to dlv dap
  3. dlv dap sends a SIGINT to the debugee - unlike Ctrl+C through headless server in terminal, dlv will have more explicit control on which process the signal is sent to even when there is a gdb or debugserver in between dlv and the debugged program.
    • Q. what should it do if debugee is paused due to breakpoint?
  4. The debugee exercises its signal handling
  5. If the debugee terminates itself, Terminate Event, and the usual termination process follows.
  6. If the debugee doesn't terminate, user may request one more Stop, which will terminate the debugee forcefully.
    • Q. will there be an option for users to send another signal?

That way, in step 4, debugee will have a chance to run the cleanup code, or even users can debug the clean up process.

Optionally, it would be nice if users can specify the signal to send to the debugee.

@polinasok
Copy link
Contributor

polinasok commented Apr 7, 2022

@aarzilli Please see comment on Dec 5, 2020 Is there a way for a user to intercept when debugger detaches and kills debuggee to execute some shutdown/cleanup sequence? Is it always SIGKILL?

@polinasok polinasok changed the title debug: shutdown gracefully debug: provide a way to intercept debug session stop for debuggee to shutdown gracefully Apr 7, 2022
@aarzilli
Copy link
Contributor

aarzilli commented Apr 8, 2022

Yes, it's always SIGKILL. Note that the feature requested here is to make the "stop" button send a SIGINT (or SIGTERM, or whatever catchable signal) and then resume the target process and wait for it to finish. Basically to make "stop" into another continue button. At that point you will have the opposite problem: a misbehaving program can not be terminated because it discards the signal we send (or gets stuck cleaning up).

@hyangah
Copy link
Contributor

hyangah commented Apr 8, 2022

At that point you will have the opposite problem: a misbehaving program can not be terminated because it discards the signal we send (or gets stuck cleaning up).

@aarzilli can you clarify this further? I think DAP clients will send a different request (Terminate) before Disconnect request if the server claims it supports Terminate requests. So, in my sketch, DAP will simply send a signal to the debuggee directly. (if the debuggee was paused already, maybe continue before sending the signal, I don't know). There is no need of intercepting debugger's detaching/killing of the debuggee.

If the debuggee doesn't handle the signal, that's ok - the DAP client will send the final Disconnect request and terminate it with SIGKILL. If the debuggee handles the signal, that's great.

Typically a debug adapter implements ‘terminate’ by sending a software signal which the debuggee intercepts in order to clean things up properly before terminating itself.

Please note that this request does not directly affect the state of the debug session: if the debuggee decides to veto the graceful shutdown for any reason by not terminating itself, then the debug session will just continue.

I think it's almost equivalent of resuming the target and running kill -s TERM <target_pid>, but asking DAP to do so.

Am I missing something?

@aarzilli
Copy link
Contributor

aarzilli commented Apr 8, 2022

Interesting, it does sound like the terminate request should not terminate the debuggee. Seeing that I think the dap server should be changed to respond to a terminate request by doing approximately the same thing it does for a continue request, plus sending a signal. And then it's up to the client to decide what to do if the debuggee does not terminate.
I'm not sure what's supposed to happen on windows, the specification talks about signals but they don't exist on windows.

@polinasok
Copy link
Contributor

I checked what actually happens in vscode. With supportsTerminateRequest, the first time you click stop, it sends a terminate request. If that doesn't result in termination, debug session continues. The user must then click stop one more time to actually kill things.

### Click Stop

[Trace - 19:52:51.528] client -> {"command":"terminate","arguments":{"restart":false},"type":"request","seq":13}

[Trace - 19:52:51.529] client  <- {"seq":0,"type":"response","request_seq":13,"success":true,"command":"terminate"}

### Click Stop again

[Trace - 19:53:03.434] client -> {"command":"disconnect","arguments":{"restart":false},"type":"request","seq":14}

Should users be able to specify what signal to send with the TerminateRequest (e.g. SIGTERM or SIGINT)?
Should this be opt-in so those users who don't have signal handling logic continue relying on dlv killing the process?
Users who don't have signal handling needs should never need click Stop twice.

@polinasok polinasok added DA: DlvDAP NeedsDecision Feedback is required from experts, contributors, and/or the community before a change can be made. and removed NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. labels Apr 9, 2022
@hyangah
Copy link
Contributor

hyangah commented Apr 9, 2022

Should users be able to specify what signal to send with the TerminateRequest (e.g. SIGTERM or SIGINT)?
Should this be opt-in so those users who don't have signal handling logic continue relying on dlv killing the process?
Users who don't have signal handling needs should never need click Stop twice.

@polinasok I agree that defaulting to this behavior is not great. How about introducing a new property e.g. TerminateSignal and enabling supportsTerminateRequest only if the property is set?

Another approach I thought about was just default to SIGKILL, so effectively the first TerminateRequest triggers killing of the target process. However, I am afraid there will be subtlety that's difficult to handle correctly (e.g. what if the target process is already suspended, etc..)

I'm not sure what's supposed to happen on windows, the specification talks about signals but they don't exist on windows.

@aarzilli I don't know what to do about Windows either. I think for windows users we ask them to start the process in a separate console, attach to it (attach configuration), and utilize Window's Ctrl+C\Ctrl+Break support for console processes. I initially wished the new console attribute helped to automate this setup, but I am not sure if the headless delve server also detaches itself from the console when starting the target process.

@aarzilli
Copy link
Contributor

I initially wished the new console attribute helped to automate this setup, but I am not sure if the headless delve server also detaches itself from the console when starting the target process.

No. We don't do anything but we probably should do something. I don't think Ctrl-C works correctly on a non-headless instance of delve on windows, for example. I'm not familiar enough, however.

@aronchick
Copy link

aronchick commented Apr 12, 2022

My dream user scenario - the "debug" bar had a button/key combo to mimic the ctrl-c functionality.

CleanShot 2022-04-12 at 17 21 17@2x

You can mimic the expected behavior with a simple command:

pkill -SIGINT __debug_bin

@jimbobmcgee
Copy link

@aronchick -- this is kind of what I was trying to gain traction for with microsoft/vscode#110203, but I was thinker wider than just *nix+SIGINT.

The debug bar should offer a unified way to send signals (or other OS-specific messages) to the code being debugged.

Debugger implementations should then be able to register the signals it supports (so the debug bar's dropdown/splitbutton could be populated); and then send the signal on to the debugged code when the button is clicked.

But I couldn't make my point well enough over there, so it was auto-closed.

@hyangah hyangah removed the DA: DlvDAP label Jun 7, 2022
@stefanbschneider
Copy link

Not sure if it belongs here, but it seems related:
I have a Python test case that starts a visualization web server in the background, which blocks the test from finishing. When I want to stop the visualization and the test, I click on VSCode's "Cancel Test Run" under "Testing", which stops the test but not the visualization web server. So the next time I run the test, the web server is still running and the address is still in use, causing an error.

Doing the same thing with PyCharm works: The test also blocks, but stopping all tests also stops the Python web server that was started in the background as part of the test.

My workaround is now to manually look for all Python processes and kill them each time I want to restart the test. A bit cumbersome.

@stiller-leser
Copy link

I just lost more time to this than I care to admit (before coming across this issue). It seems that other implementations have found a way: microsoft/vscode-node-debug2#101

It would be great to see something like this for Go!

@mattjohnsonpint
Copy link

Almost a year since the last comment. Any update on this?

@raylillywhite
Copy link

Looks like supportsTerminateRequest needs to be enabled for non-windows machines. microsoft/vscode-node-debug2@82b0830

@heiningair
Copy link

This is still needed! Any idea if it gets on the feature list some time soon?

@RelicOfTesla
Copy link

RelicOfTesla commented Sep 18, 2024

Perhaps we can use custom options in settings.json/deveConfig/gopls?

@ZhouShuren1881
Copy link

We need it!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Debug Issues related to the debugging functionality of the extension. NeedsDecision Feedback is required from experts, contributors, and/or the community before a change can be made.
Projects
None yet
Development

No branches or pull requests