Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

runtime: Windows service lifecycle events behave incorrectly when called within a golang environment #40167

Closed
JohanKnutzen opened this issue Jul 11, 2020 · 66 comments
Labels
FrozenDueToAge NeedsFix The path to resolution is known, but the work has not been done. OS-Windows
Milestone

Comments

@JohanKnutzen
Copy link

JohanKnutzen commented Jul 11, 2020

What version of Go are you using (go version)?

$ go version
go version go1.14.3 windows/amd64

Does this issue reproduce with the latest release?

Yes.

What operating system and processor architecture are you using (go env)?

go env Output
$ go env
set GO111MODULE=
set GOARCH=amd64
set GOBIN=
set GOCACHE=C:\Users\knutz\AppData\Local\go-build
set GOENV=C:\Users\knutz\AppData\Roaming\go\env
set GOEXE=.exe
set GOFLAGS=
set GOHOSTARCH=amd64
set GOHOSTOS=windows
set GOINSECURE=
set GONOPROXY=
set GONOSUMDB=
set GOOS=windows
set GOPATH=C:\Users\knutz\go
set GOPRIVATE=
set GOPROXY=https://proxy.golang.org,direct
set GOROOT=c:\go
set GOSUMDB=sum.golang.org
set GOTMPDIR=
set GOTOOLDIR=c:\go\pkg\tool\windows_amd64
set GCCGO=gccgo
set AR=ar
set CC=gcc
set CXX=g++
set CGO_ENABLED=1
set GOMOD=C:\tmp\test2\svcshutdownbug\go.mod
set CGO_CFLAGS=-g -O2
set CGO_CPPFLAGS=
set CGO_CXXFLAGS=-g -O2
set CGO_FFLAGS=-g -O2
set CGO_LDFLAGS=-g -O2
set PKG_CONFIG=pkg-config
set GOGCCFLAGS=-m64 -mthreads -fno-caret-diagnostics -Qunused-arguments -fmessage-length=0 -fdebug-prefix-map=C:\Users\knutz\AppData\Local\Temp\go-build509631438=/tmp/go-build -gno-record-gcc-switches

What did you do?

Upon further narrowing of a minimal example that reproduces 40157, it seems that the golang runtime is preventing emitting of SERVICE_CONTROL_SHUTDOWN events.

I've written a program in c as well as in golang that logs this event. The order of syscalls are identical.

Source code

main.go
package main

import (
	"fmt"
	"os"
	"syscall"
	"unsafe"
)

const ()

const (
	SERVICE_WIN32_OWN_PROCESS = uint32(0x00000010)

	SERVICE_CONTROL_STOP     = uint32(0x00000001)
	SERVICE_CONTROL_SHUTDOWN = uint32(0x00000005)

	SERVICE_ACCEPT_STOP     = uint32(0x00000001)
	SERVICE_ACCEPT_SHUTDOWN = uint32(0x00000004)

	SERVICE_STOPPED       = uint32(0x00000001)
	SERVICE_START_PENDING = uint32(0x00000002)
	SERVICE_STOP_PENDING  = uint32(0x00000003)
	SERVICE_RUNNING       = uint32(0x00000004)
)

type SERVICE_TABLE_ENTRY struct {
	name *uint16
	proc uintptr
}

type SERVICE_STATUS struct {
	dwServiceType             uint32
	dwCurrentState            uint32
	dwControlsAccepted        uint32
	dwWin32ExitCode           uint32
	dwServiceSpecificExitCode uint32
	dwCheckPoint              uint32
	dwWaitHint                uint32
}

var (
	advapi32                         = syscall.NewLazyDLL("Advapi32.dll")
	procStartServiceCtrlDispatcher   = advapi32.NewProc("StartServiceCtrlDispatcherW")
	procRegisterServiceCtrlHandlerEx = advapi32.NewProc("RegisterServiceCtrlHandlerExW")
	procSetServiceStatus             = advapi32.NewProc("SetServiceStatus")
)

const svcName = "svctest"

var stopped = make(chan bool)

func log(text string) {
	filename := os.TempDir() + "\\" + "svctest.log"
	f, err := os.OpenFile(filename, os.O_APPEND|os.O_CREATE|os.O_WRONLY, 0600)
	if err != nil {
		panic(err)
	}
	defer f.Close()

	if _, err = f.WriteString(text); err != nil {
		panic(err)
	}
}

func ctrlHandlerEx(ctrlCode uint32, eventType uint32, eventData uintptr, context uintptr) uintptr {
	switch ctrlCode {
	case SERVICE_CONTROL_STOP:
		log(fmt.Sprintf("SERVICE_CONTROL_STOP\n"))
		stopped <- true
	case SERVICE_CONTROL_SHUTDOWN:
		log(fmt.Sprintf("SERVICE_CONTROL_SHUTDOWN\n"))
		stopped <- true
	}

	return 0
}

func serviceMain(argc uint32, argv **uint16) uintptr {
	statusHandle, _, err := procRegisterServiceCtrlHandlerEx.Call(uintptr(unsafe.Pointer(syscall.StringToUTF16Ptr(svcName))),
		syscall.NewCallback(ctrlHandlerEx),
		uintptr(0))

	if 0 == statusHandle {
		log(fmt.Sprintf("Error calling StartServiceCtrlDispatcherW: %v\n", err))
		return 0
	}
	status := SERVICE_STATUS{
		dwServiceType:      SERVICE_WIN32_OWN_PROCESS,
		dwControlsAccepted: (SERVICE_ACCEPT_STOP | SERVICE_ACCEPT_SHUTDOWN),
		dwCurrentState:     SERVICE_START_PENDING,
	}

	ret, _, err := procSetServiceStatus.Call(statusHandle, uintptr(unsafe.Pointer(&status)))
	if ret == 0 {
		log(fmt.Sprintf("Error calling SetServiceStatus: %v\n", err))
	}

	status.dwCurrentState = SERVICE_RUNNING
	ret, _, err = procSetServiceStatus.Call(statusHandle, uintptr(unsafe.Pointer(&status)))
	if ret == 0 {
		log(fmt.Sprintf("Error calling SetServiceStatus: %v\n", err))
	}

	<-stopped

	status.dwCurrentState = SERVICE_STOPPED
	ret, _, err = procSetServiceStatus.Call(statusHandle, uintptr(unsafe.Pointer(&status)))
	if ret == 0 {
		log(fmt.Sprintf("Error calling SetServiceStatus: %v\n", err))
	}

	return 0
}

func main() {
	t := []SERVICE_TABLE_ENTRY{
		{
			name: syscall.StringToUTF16Ptr(svcName),
			proc: syscall.NewCallback(serviceMain),
		},
		{name: nil, proc: 0},
	}

	ret, _, err := procStartServiceCtrlDispatcher.Call(uintptr(unsafe.Pointer(&t[0])))
	if ret == 0 {
		log(fmt.Sprintf("Error calling StartServiceCtrlDispatcherW: %v\n", err))
	}
}
main.c
#include <Windows.h>
#include <tchar.h>
#include <stdio.h>


SERVICE_STATUS g_ServiceStatus = { 0 };
SERVICE_STATUS_HANDLE g_StatusHandle = NULL;
HANDLE g_ServiceStopEvent = INVALID_HANDLE_VALUE;

VOID WINAPI ServiceMain(DWORD argc, LPTSTR *argv);
DWORD WINAPI ServiceWorkerThread(LPVOID lpParam);
DWORD WINAPI ctrlHandlerEx(DWORD CtrlCode, DWORD eventType, LPVOID eventData, LPVOID context);

#define SERVICE_NAME _T("svctest")

VOID WINAPI ServiceMain(DWORD argc, LPTSTR *argv) {

  g_StatusHandle = RegisterServiceCtrlHandlerEx(SERVICE_NAME, ctrlHandlerEx,
    NULL);

  if (g_StatusHandle == NULL) {
    goto EXIT;
  }

  // Tell the service controller we are starting
  ZeroMemory( & g_ServiceStatus, sizeof(g_ServiceStatus));

  g_ServiceStatus.dwServiceType = SERVICE_WIN32_OWN_PROCESS;
  g_ServiceStatus.dwControlsAccepted = SERVICE_ACCEPT_STOP
    | SERVICE_ACCEPT_POWEREVENT | SERVICE_ACCEPT_SESSIONCHANGE
    | SERVICE_ACCEPT_SHUTDOWN;
  g_ServiceStatus.dwCurrentState = SERVICE_START_PENDING;
  g_ServiceStatus.dwWin32ExitCode = 0;
  g_ServiceStatus.dwServiceSpecificExitCode = 0;
  g_ServiceStatus.dwCheckPoint = 0;

  if (SetServiceStatus(g_StatusHandle, &g_ServiceStatus) == FALSE) {
    //...
  }

  g_ServiceStopEvent = CreateEvent(NULL, TRUE, FALSE, NULL);

  if (g_ServiceStopEvent == NULL) {
    //...
    goto EXIT;
  }

  g_ServiceStatus.dwCurrentState = SERVICE_RUNNING;
  g_ServiceStatus.dwWin32ExitCode = 0;
  g_ServiceStatus.dwCheckPoint = 0;

  if (SetServiceStatus(g_StatusHandle, &g_ServiceStatus) == FALSE) {
    //...
  }

  HANDLE hThread = CreateThread(NULL, 0, ServiceWorkerThread, NULL, 0, NULL);

  WaitForSingleObject(hThread, INFINITE);

  CloseHandle(g_ServiceStopEvent);

  g_ServiceStatus.dwCurrentState = SERVICE_STOPPED;
  g_ServiceStatus.dwWin32ExitCode = 0;
  g_ServiceStatus.dwCheckPoint = 3;

  if (SetServiceStatus(g_StatusHandle, &g_ServiceStatus) == FALSE) {

  }

EXIT: return;
}

void log(const char* str) 
{
  FILE *file = fopen("C:\\Windows\\Temp\\svctest.log", "a+");
  if(0 == file) {
    return;
  }
  fprintf(file, str);
  fclose(file);
}

DWORD WINAPI ctrlHandlerEx(DWORD CtrlCode, DWORD eventType, LPVOID eventData,
  LPVOID context) {
  switch (CtrlCode) {
  case SERVICE_CONTROL_STOP: 
    log("SERVICE_CONTROL_STOP\n\r");
    SetEvent(g_ServiceStopEvent);
    break;
  case SERVICE_CONTROL_SHUTDOWN:
    log("SERVICE_CONTROL_SHUTDOWN\n\r");
    SetEvent(g_ServiceStopEvent);
    break;
  default:
    break;
  }

  return 0;
}

DWORD WINAPI ServiceWorkerThread(LPVOID lpParam) {

  while (WaitForSingleObject(g_ServiceStopEvent, 0) != WAIT_OBJECT_0) {

    Sleep(3000);
  }

  return 0;
}

DWORD RunService() {
  SERVICE_TABLE_ENTRY serviceTable[] = { { SERVICE_NAME, ServiceMain },
    { 0, 0 } };

  if (StartServiceCtrlDispatcher(serviceTable)) {
    return 0;
  } else {
    DWORD erro = GetLastError();

    return 1;
  }
}

int _tmain(int argc, TCHAR *argv[]) {
  RunService();
  return 0;
}

Building
golang: go build
C: cl main.c /link Advapi32.lib.

Running

  1. Run sc create svctest binpath= _path_ in an administrator cmd.exe, path should be an absolute path pointing to either the c executable or the golang executable.
  2. Run sc start svctest to start the service
  3. Run shutdown /s. This will shut down the computer. Do not shutdown using the start-menu.

Cleanup
Run sc delete svctest to remove the service entry in Windows

What did you expect to see?

Both executables log SERVICE_CONTROL_STOP and SERVICE_CONTROL_SHUTDOWN to C:\Windows\Temp\svctest.log. The above steps should log a SERVICE_CONTROL_SHUTDOWN during shutdown of the computer.

What did you see instead?

Only the C program appropriately logs SERVICE_CONTROL_SHUTDOWN. You can test that both programs log SERVICE_CONTROL_STOP by right clicking on the service in the Task Manager and selecting Restart.

@JohanKnutzen
Copy link
Author

JohanKnutzen commented Jul 11, 2020

@zx2c4 Do you have any ideas? Although unrelated, I remember that you had some insight regarding 31685

@alexbrainman I remember that you also had insight into the runtime

@alexbrainman
Copy link
Member

@yohan1234

Thank you for creating this issue.

Is your issue dup of #40074 ?

Alex

@JohanKnutzen
Copy link
Author

@alexbrainman

Thanks for the response.

It's not really a dupe since my repro doesn't use x/sys/windows/svc but makes direct syscalls. I would assume that the fix to this would solve both tho!

Johan

@networkimprov
Copy link

cc @aclements

@andybons andybons added NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. OS-Windows labels Jul 13, 2020
@andybons andybons added this to the Unplanned milestone Jul 13, 2020
@JohanKnutzen
Copy link
Author

@alexbrainman

I've found a workaround.

It seems that the runtime hooks SetConsoleCtrl in windows which leads to the termination of the program when the CTRL_SHUTDOWN_EVENT is received. The runtime does this here:

os_windows.go

By overriding the console ctrl handler and the CTRL_SHUTDOWN_EVENT event, you can disable the golang handler for this particular event. Here is a modifed example that does that:

main.go
package main

import (
	"fmt"
	"os"
	"syscall"
	"unsafe"
)

const ()

const (
	SERVICE_WIN32_OWN_PROCESS = uint32(0x00000010)

	SERVICE_CONTROL_STOP     = uint32(0x00000001)
	SERVICE_CONTROL_SHUTDOWN = uint32(0x00000005)

	SERVICE_ACCEPT_STOP     = uint32(0x00000001)
	SERVICE_ACCEPT_SHUTDOWN = uint32(0x00000004)

	SERVICE_STOPPED       = uint32(0x00000001)
	SERVICE_START_PENDING = uint32(0x00000002)
	SERVICE_STOP_PENDING  = uint32(0x00000003)
	SERVICE_RUNNING       = uint32(0x00000004)
)

type SERVICE_TABLE_ENTRY struct {
	name *uint16
	proc uintptr
}

type SERVICE_STATUS struct {
	dwServiceType             uint32
	dwCurrentState            uint32
	dwControlsAccepted        uint32
	dwWin32ExitCode           uint32
	dwServiceSpecificExitCode uint32
	dwCheckPoint              uint32
	dwWaitHint                uint32
}

var (
	advapi32                         = syscall.NewLazyDLL("Advapi32.dll")
	procStartServiceCtrlDispatcher   = advapi32.NewProc("StartServiceCtrlDispatcherW")
	procRegisterServiceCtrlHandlerEx = advapi32.NewProc("RegisterServiceCtrlHandlerExW")
	procSetServiceStatus             = advapi32.NewProc("SetServiceStatus")
)

const (
	CTRL_SHUTDOWN_EVENT = uint32(6)
)

var (
	kernel32                  = syscall.NewLazyDLL("kernel32.dll")
	procSetConsoleCtrlHandler = kernel32.NewProc("SetConsoleCtrlHandler")
)

func overrideCtrlShutdownEvent() error {
	n, _, err := procSetConsoleCtrlHandler.Call(
		syscall.NewCallback(func(controlType uint32) uint {
			if controlType == CTRL_SHUTDOWN_EVENT {
				log("CTRL_SHUTDOWN_EVENT\n")
				return 1
			}
			return 0
		}),
		1)

	if n == 0 {
		return err
	}
	return nil
}

const svcName = "svctest"

var stopped = make(chan bool)
var ctrlShutdownEvent = make(chan bool)

func log(text string) {
	filename := os.TempDir() + "\\" + "svctest.log"
	f, err := os.OpenFile(filename, os.O_APPEND|os.O_CREATE|os.O_WRONLY, 0600)
	if err != nil {
		panic(err)
	}
	defer f.Close()

	if _, err = f.WriteString(text); err != nil {
		panic(err)
	}
}

func ctrlHandlerEx(ctrlCode uint32, eventType uint32, eventData uintptr, context uintptr) uintptr {
	switch ctrlCode {
	case SERVICE_CONTROL_STOP:
		log(fmt.Sprintf("SERVICE_CONTROL_STOP\n"))
		stopped <- true
	case SERVICE_CONTROL_SHUTDOWN:
		log(fmt.Sprintf("SERVICE_CONTROL_SHUTDOWN\n"))
		stopped <- true
	}

	return 0
}

func serviceMain(argc uint32, argv **uint16) uintptr {
	statusHandle, _, err := procRegisterServiceCtrlHandlerEx.Call(uintptr(unsafe.Pointer(syscall.StringToUTF16Ptr(svcName))),
		syscall.NewCallback(ctrlHandlerEx),
		uintptr(0))

	if 0 == statusHandle {
		log(fmt.Sprintf("Error calling StartServiceCtrlDispatcherW: %v\n", err))
		return 0
	}
	status := SERVICE_STATUS{
		dwServiceType:      SERVICE_WIN32_OWN_PROCESS,
		dwControlsAccepted: (SERVICE_ACCEPT_STOP | SERVICE_ACCEPT_SHUTDOWN),
		dwCurrentState:     SERVICE_START_PENDING,
	}

	ret, _, err := procSetServiceStatus.Call(statusHandle, uintptr(unsafe.Pointer(&status)))
	if ret == 0 {
		log(fmt.Sprintf("Error calling SetServiceStatus: %v\n", err))
	}

	status.dwCurrentState = SERVICE_RUNNING
	ret, _, err = procSetServiceStatus.Call(statusHandle, uintptr(unsafe.Pointer(&status)))
	if ret == 0 {
		log(fmt.Sprintf("Error calling SetServiceStatus: %v\n", err))
	}

	overrideCtrlShutdownEvent()

	<-stopped

	status.dwCurrentState = SERVICE_STOPPED
	ret, _, err = procSetServiceStatus.Call(statusHandle, uintptr(unsafe.Pointer(&status)))
	if ret == 0 {
		log(fmt.Sprintf("Error calling SetServiceStatus: %v\n", err))
	}

	return 0
}

func main() {
	t := []SERVICE_TABLE_ENTRY{
		{
			name: syscall.StringToUTF16Ptr(svcName),
			proc: syscall.NewCallback(serviceMain),
		},
		{name: nil, proc: 0},
	}

	ret, _, err := procStartServiceCtrlDispatcher.Call(uintptr(unsafe.Pointer(&t[0])))
	if ret == 0 {
		log(fmt.Sprintf("Error calling StartServiceCtrlDispatcherW: %v\n", err))
	}
}

Specifically this bit:

const (
	CTRL_SHUTDOWN_EVENT = uint32(6)
)

var (
	kernel32                  = syscall.NewLazyDLL("kernel32.dll")
	procSetConsoleCtrlHandler = kernel32.NewProc("SetConsoleCtrlHandler")
)

func overrideCtrlShutdownEvent() error {
	n, _, err := procSetConsoleCtrlHandler.Call(
		syscall.NewCallback(func(controlType uint32) uint {
			if controlType == CTRL_SHUTDOWN_EVENT {
				log("CTRL_SHUTDOWN_EVENT\n")
				return 1
			}
			return 0
		}),
		1)

	if n == 0 {
		return err
	}
	return nil
}

If you run the above code using the instructions in the original post, the output should be the following:
CTRL_SHUTDOWN_EVENT
SERVICE_CONTROL_SHUTDOWN

I would suggest that the golang runtime is modified to not implicitly call SetConsoleCtrlHandler. This is because the documented "default handling" of these kinds of events in windows is unique depending on the application. You have a different behavior if you've supplied a ServiceMain, use gdi32/user32, running different versions of windows and of course if you're a console program at all. Using this function should unfortunately probably be an explicit action when writing go programs for Windows.

SetConsoleCtrlHandler

@zx2c4
Copy link
Contributor

zx2c4 commented Jul 13, 2020

Or the service code can just call SetConsoleCtrlHandler(NULL, FALSE), to reset to the default handler:

image

@gopherbot
Copy link
Contributor

Change https://golang.org/cl/242378 mentions this issue: windows/svc: restore default console shutdown notifier when starting service

@zx2c4
Copy link
Contributor

zx2c4 commented Jul 13, 2020

I just posted a patch to maybe fix this, but I haven't tested.

@yohan1234 - can you test and let me know whether or not this works?

@JohanKnutzen
Copy link
Author

@zx2c4 - I've tried that because I thought that the docs were incorrect on msdn. Unfortunately, it only affects "Ctrl+C input" and not CTRL_SHUTDOWN_EVENT.

Here's a some modified code that tests that:

main.go
package main

import (
	"fmt"
	"os"
	"syscall"
	"unsafe"
)

const ()

const (
	SERVICE_WIN32_OWN_PROCESS = uint32(0x00000010)

	SERVICE_CONTROL_STOP     = uint32(0x00000001)
	SERVICE_CONTROL_SHUTDOWN = uint32(0x00000005)

	SERVICE_ACCEPT_STOP     = uint32(0x00000001)
	SERVICE_ACCEPT_SHUTDOWN = uint32(0x00000004)

	SERVICE_STOPPED       = uint32(0x00000001)
	SERVICE_START_PENDING = uint32(0x00000002)
	SERVICE_STOP_PENDING  = uint32(0x00000003)
	SERVICE_RUNNING       = uint32(0x00000004)
)

type SERVICE_TABLE_ENTRY struct {
	name *uint16
	proc uintptr
}

type SERVICE_STATUS struct {
	dwServiceType             uint32
	dwCurrentState            uint32
	dwControlsAccepted        uint32
	dwWin32ExitCode           uint32
	dwServiceSpecificExitCode uint32
	dwCheckPoint              uint32
	dwWaitHint                uint32
}

var (
	advapi32                         = syscall.NewLazyDLL("Advapi32.dll")
	procStartServiceCtrlDispatcher   = advapi32.NewProc("StartServiceCtrlDispatcherW")
	procRegisterServiceCtrlHandlerEx = advapi32.NewProc("RegisterServiceCtrlHandlerExW")
	procSetServiceStatus             = advapi32.NewProc("SetServiceStatus")
)

const (
	CTRL_SHUTDOWN_EVENT = uint32(6)
)

var (
	kernel32                  = syscall.NewLazyDLL("kernel32.dll")
	procSetConsoleCtrlHandler = kernel32.NewProc("SetConsoleCtrlHandler")
)

func overrideCtrlShutdownEvent() error {
	n, _, err := procSetConsoleCtrlHandler.Call(
		0, /* NULL */
		0 /* FALSE */)

	if n == 0 {
		return err
	}
	return nil
}

const svcName = "svctest"

var stopped = make(chan bool)
var ctrlShutdownEvent = make(chan bool)

func log(text string) {
	filename := os.TempDir() + "\\" + "svctest.log"
	f, err := os.OpenFile(filename, os.O_APPEND|os.O_CREATE|os.O_WRONLY, 0600)
	if err != nil {
		panic(err)
	}
	defer f.Close()

	if _, err = f.WriteString(text); err != nil {
		panic(err)
	}
}

func ctrlHandlerEx(ctrlCode uint32, eventType uint32, eventData uintptr, context uintptr) uintptr {
	switch ctrlCode {
	case SERVICE_CONTROL_STOP:
		log(fmt.Sprintf("SERVICE_CONTROL_STOP\n"))
		stopped <- true
	case SERVICE_CONTROL_SHUTDOWN:
		log(fmt.Sprintf("SERVICE_CONTROL_SHUTDOWN\n"))
		stopped <- true
	}

	return 0
}

func serviceMain(argc uint32, argv **uint16) uintptr {
	statusHandle, _, err := procRegisterServiceCtrlHandlerEx.Call(uintptr(unsafe.Pointer(syscall.StringToUTF16Ptr(svcName))),
		syscall.NewCallback(ctrlHandlerEx),
		uintptr(0))

	if 0 == statusHandle {
		log(fmt.Sprintf("Error calling StartServiceCtrlDispatcherW: %v\n", err))
		return 0
	}
	status := SERVICE_STATUS{
		dwServiceType:      SERVICE_WIN32_OWN_PROCESS,
		dwControlsAccepted: (SERVICE_ACCEPT_STOP | SERVICE_ACCEPT_SHUTDOWN),
		dwCurrentState:     SERVICE_START_PENDING,
	}

	ret, _, err := procSetServiceStatus.Call(statusHandle, uintptr(unsafe.Pointer(&status)))
	if ret == 0 {
		log(fmt.Sprintf("Error calling SetServiceStatus: %v\n", err))
	}

	status.dwCurrentState = SERVICE_RUNNING
	ret, _, err = procSetServiceStatus.Call(statusHandle, uintptr(unsafe.Pointer(&status)))
	if ret == 0 {
		log(fmt.Sprintf("Error calling SetServiceStatus: %v\n", err))
	}

	overrideCtrlShutdownEvent()

	<-stopped

	status.dwCurrentState = SERVICE_STOPPED
	ret, _, err = procSetServiceStatus.Call(statusHandle, uintptr(unsafe.Pointer(&status)))
	if ret == 0 {
		log(fmt.Sprintf("Error calling SetServiceStatus: %v\n", err))
	}

	return 0
}

func main() {
	t := []SERVICE_TABLE_ENTRY{
		{
			name: syscall.StringToUTF16Ptr(svcName),
			proc: syscall.NewCallback(serviceMain),
		},
		{name: nil, proc: 0},
	}

	ret, _, err := procStartServiceCtrlDispatcher.Call(uintptr(unsafe.Pointer(&t[0])))
	if ret == 0 {
		log(fmt.Sprintf("Error calling StartServiceCtrlDispatcherW: %v\n", err))
	}
}

It fails to log any SERVICE_CONTROL_SHUTDOWN

The only way to properly nuke a handler is to call SetConsoleCtrlHandler with the original pointer used as the first argument and then FALSE as the second argument.

I don't really know what a good solution is here. You could save the original callback pointer and then have it accessible to allow some users to remove it. This would include people that use user32/gdi or writes user services. It seems unreasonable to expect people to know this though. I would suggest that the runtime avoids touching SetConsoleCtrlHandler at all. There's no general way of mapping Windows voodoo to unix signals.

@zx2c4
Copy link
Contributor

zx2c4 commented Jul 14, 2020

I would suggest that the runtime avoids touching SetConsoleCtrlHandler at all. There's no general way of mapping Windows voodoo to unix signals.

I would have preferred that too, but perhaps that boat has already sailed, since we'd break API if we removed it.

However maybe there's a way to only call SetConsoleCtrlHandler lazily when an API that references it is used. I'll look into that. If not, then I'll see if I can play some games to recover that pointer.

@zx2c4
Copy link
Contributor

zx2c4 commented Jul 14, 2020

However maybe there's a way to only call SetConsoleCtrlHandler lazily when an API that references it is used.

Nevermind, it's always used, as sigsend is generally useful within the runtime. However, there's this tidbit:


        if sigsend(s) {
                return 1
        }
        if !islibrary && !isarchive {
                // Only exit the program if we don't have a DLL.
                // See https://golang.org/issues/35965.
                exit(2) // SIGINT, SIGTERM, etc
        }
        return 0

Is your code hitting that exit(2) there? Or the return 1? If the exit(2), maybe we just need to add an additional condition.

@alexbrainman
Copy link
Member

I was thinking about adjusting runtime itself not to call SetConsoleCtrlHandler, if the process is running as service. What do you think?

Alex

@alexbrainman
Copy link
Member

This broke when we added CTRL_CLOSE_EVENT, CTRL_LOGOFF_EVENT, CTRL_SHUTDOWN_EVENT to Go SetConsoleCtrlHandler handler. But maybe we should not be calling SetConsoleCtrlHandler from the service at all?

Alex

@zx2c4
Copy link
Contributor

zx2c4 commented Jul 14, 2020

I was thinking about adjusting runtime itself not to call SetConsoleCtrlHandler, if the process is running as service. What do you think?

Alex

That's what I initially tried doing. But this will break other things still where the signal handling is actually useful (crashes). I have another solution I'm working on that avoids bailing when in a service. I'm writing the code now to detect if we're running as a service. Tricky business.

@alexbrainman
Copy link
Member

I'm writing the code now to detect if we're running as a service.

Are you looking for golang.org/x/sys/windows/svc.IsAnInteractiveSession ?

Alex

@zx2c4
Copy link
Contributor

zx2c4 commented Jul 14, 2020

Are you looking for golang.org/x/sys/windows/svc.IsAnInteractiveSession ?

Nope, not what we want, I don't think. The solution I'm writing is uglier, unfortunately. We'll see if it works. Halfway done now.

@gopherbot
Copy link
Contributor

Change https://golang.org/cl/242280 mentions this issue: runtime: detect services in signal handler

@zx2c4
Copy link
Contributor

zx2c4 commented Jul 14, 2020

@yohan1234 Alright, try that latest CL. I haven't tested it for your case yet, but maybe it's correct.

@JohanKnutzen
Copy link
Author

Thanks a lot for the help.

Is your code hitting that exit(2) there? Or the return 1? If the exit(2), maybe we just need to add an additional condition.

I believe that it is hitting the exit(2). I don't understand why the exit(2) exists though. That's not the proper way of using that function. If the handler does not handle the signal it should return 0. Even if you wish the program to terminate, it's preferred to exit gracefully via the entry point. Windows takes care of killing of the process automatically if required. This happens after a grace period of 10 seconds after the function returns. This is dependent upon the event.

To test your patch, do I need to rebuild go or can I just pull in a local runtime package with that patch somehow? Sorry I'm not sure what the steps are.

Also, if anyone is using gdi/user32 with hwnds, I think that the program still crashes needlessly rather than unwinding via hwnds. You would have to write a similar snippet to ignore events in case hwnds have been created. Just something to consider for the future.

I would say that lazily loading it when the signals package is pulled in would be a nice solution. Pulling in the signals package would impose some kind of lifecycle management anyways.

@networkimprov
Copy link

networkimprov commented Jul 14, 2020

You can patch your local install in .../go/src/runtime/..., and simply build your app. The runtime and stdlib get linked into the executable.

Create a file with the patch in your installed .../go, and run git apply <patch>.

EDIT: you can download the patch file from the CL link above.

@zx2c4
Copy link
Contributor

zx2c4 commented Jul 14, 2020

To test your patch, do I need to rebuild go or can I just pull in a local runtime package with that patch somehow? Sorry I'm not sure what the steps are.

You have to rebuild, but I can build you a .zip or .tar or something if you tell me what GOOS/GOARCH you want. But you're probably better off doing this yourself. Clone the repo and run ./make.bash or make.bat.

Also, if anyone is using gdi/user32 with hwnds, I think that the program still crashes needlessly rather than unwinding via hwnds. You would have to write a similar snippet to ignore events in case hwnds have been created. Just something to consider for the future.

I'm not sure I grok what you mean here. I maintain a large gdi/user32 codebase and haven't encountered problems. What weird behavior do you suspect I should be seeing?

@networkimprov
Copy link

networkimprov commented Jul 14, 2020

@zx2c4 the CL immediately above is in .../go/src/runtime/. That doesn't require a rebuild of Go.

@yohan1234, see my instructions above.

@zx2c4
Copy link
Contributor

zx2c4 commented Jul 14, 2020

@zx2c4 the CL immediately above is in .../go/src/runtime/. That doesn't require a rebuild of Go.

True.

@JohanKnutzen
Copy link
Author

JohanKnutzen commented Jul 14, 2020

Thanks for the info @networkimprov
Thanks for the patch @zx2c4

The patch wasn't compatible with 1.14.5 or 1.14.4 so I rebuilt golang with the patch applied from the golang repo using make.bat.

Unfortunately it did not work. Either isWindowsService() segfaults, returns the wrong answer or the program crashes somewhere else. I rebuilt the main.go program in my original post which logs the service event if it works.

I'm not sure I grok what you mean here. I maintain a large gdi/user32 codebase and haven't encountered problems. What weird behavior do you suspect I should be seeing?

As stated in the docs:

If a console application loads the gdi32.dll or user32.dll library, the HandlerRoutine function that you specify when you call SetConsoleCtrlHandler does not get called for the CTRL_LOGOFF_EVENT and CTRL_SHUTDOWN_EVENT events. The operating system recognizes processes that load gdi32.dll or user32.dll as Windows applications rather than console applications.

If you depend on the signals package to process SIGTERM via SetConsoleCtrlHandler, that's just gonna stop working if you load gdi32/user32

If you don't depend on those events, you wouldn't have any problems. Just saying that it's probably not possible to walk this line and implement SIGTERM on windows via the signals package. It's just going to function properly if you write console programs.

@alexbrainman
Copy link
Member

Unfortunately it did not work.

I am not convinced your example program does not have bugs. For example, you should only call SetServiceStatus from your main service thread. You Go program is not doing that. Perhaps there are other issues with your program.

Alex

@networkimprov
Copy link

networkimprov commented Jul 23, 2020

I'm not opposed to the patch; I didn't say I was. I'm opposed to dropping code into the runtime on the eve of a release. What is the hurry? How does attacking me make this an urgent fix? (And please stop being abusive when you hear disagreement; this is not LKML.)

A previous late-cycle Windows runtime patch produced #35447, and others.

The above is a policy matter, not a technical one. Re the latter, I offered some insight on the reason for exit(2), which you haven't addressed.

@networkimprov
Copy link

And re tests, I meant that the CL currently has none. It would be nice to know that Win-service apps work in subsequent releases.

Finding false positives obviously requires trials in a wide variety of Windows configurations, which we can't get for 1.15.

@JohanKnutzen
Copy link
Author

JohanKnutzen commented Jul 23, 2020

Thanks a lot guys for helping me out. These kinds of issues are really tough to resolve.

@networkimprov @zx2c4 All windows services run in session 0 since Vista. Golang requires Windows 7.

Have fun injecting into windows sessionId==0. :)

I'm probably misunderstanding you here but it's trivial to start a process in session 0. We do this all the time. In fact, it's encouraged by Microsoft because it's a sandbox with lots of wings clipped. Just run it from a service running in session 0. We run a bunch of powershell stuff from there.

It is much harder to start a process from session 0 in another session though. We do this too though, it just requires a bit of extra yanking of the Windows chain.

The lifecycle of a Windows program is hard to generalize. All golang programs ship with the console ctrl handler set. This is unexpected when writing services or user32/gdi32 programs. It raises a lot of ambiguity in how those programs react to lifecycle events. A service written in golang crashes upon shutdown, but not if user32 has been touched. This is not something that's obviously understood when interacting with the api. @zx2c4 patches solve this issue which is great.

If things were written from scratch though, I would argue to either a) remove the console ctrl handler entirely in the runtime or b) inject a hidden window into all golang programs that use the signals package to map WM_ENDSESSION/WM_CLOSE to whatever they need to do as these events are more predictable and remove ambiguity from having two paths for handling ctrl-c events.

@ianlancetaylor
Copy link
Member

I think the CL (242280) is safe, but I don't know how important it is to land at this late stage of the release cycle. I gave the CL a +1 and I'll let Alex decide whether to put it in for 1.15. Thanks.

@zx2c4
Copy link
Contributor

zx2c4 commented Jul 23, 2020

Thanks a lot guys for helping me out. These kinds of issues are really tough to resolve.

@networkimprov @zx2c4 All windows services run in session 0 since Vista. Golang requires Windows 7.

Have fun injecting into windows sessionId==0. :)

I'm probably misunderstanding you here but it's trivial to start a process in session 0. We do this all the time. In fact, it's encouraged by Microsoft because it's a sandbox with lots of wings clipped. Just run it from a service running in session 0. We run a bunch of powershell stuff from there.

It is much harder to start a process from session 0 in another session though. We do this too though, it just requires a bit of extra yanking of the Windows chain.

Right, so the point is: when you do that, and the parent is a legitimate service, isWindowsService is telling the truth. And when you inject a foreign process without a service parent, but title it services.exe, and then start a child from there, what then can you say? Not much more than: you've just gone through great lengths to do something ridiculous that no real setup does, and the status of that process is mostly irrelevant. IOW, the false positive potential seems nil in cases that matter.

If things were written from scratch though, I would argue to either a) remove the console ctrl handler entirely in the runtime or b) inject a hidden window into all golang programs that use the signals package to map WM_ENDSESSION/WM_CLOSE to whatever they need to do as these events are more predictable and remove ambiguity from having two paths for handling ctrl-c events.

Not sure about hidden window injection; that seems a bit heavy, and also different. The console ctrl handler works by connecting as a CSR client, and passing its callback in to that machinery, and it's something the runtime always does even for the default handler (which just calls RtlExitUserProcess). But what might make sense instead could be just not installing it if the signals package isn't touched, or maybe only installing it in the case of the linker setting the console flag on the PE. Or some combination thereof. On the other hand, simply returning zero (the 1.16 CL) seems like the simplest fix. So after the 1.15 fix lands and 1.16 opens, maybe we can start to play with that a bit.

@JohanKnutzen
Copy link
Author

@zx2c4

Right, so the point is: when you do that, and the parent is a legitimate service, isWindowsService is telling the truth. And when you inject a foreign process without a service parent, but title it services.exe, and then start a child from there, what then can you say? Not much more than: you've just gone through great lengths to do something ridiculous that no real setup does, and the status of that process is mostly irrelevant. IOW, the false positive potential seems nil in cases that matter.

Right. The only other non Windows processes/services that I know of that run in session 0 are user mode drivers.

But what might make sense instead could be just not installing it if the signals package isn't touched, or maybe only installing it in the case of the linker setting the console flag on the PE. Or some combination thereof. On the other hand, simply returning zero (the 1.16 CL) seems like the simplest fix. So after the 1.15 fix lands and 1.16 opens, maybe we can start to play with that a bit.

That seems reasonable to me.

@alexbrainman
Copy link
Member

I think the CL (242280) is safe,

Thank you for checking, @ianlancetaylor

but I don't know how important it is to land at this late stage of the release cycle. I gave the CL a +1 and I'll let Alex decide whether to put it in for 1.15.

I think CL 242280 is important enough so it is available to Go 1.14 and 1.15 users, but is not important enough so it delays imminent Go 1.15 release. So, I propose, we merge CL 242280 on release-branch.go1.14 and release-branch.go1.15 (once Go 1.15 is released) branches. And submit CL 243597 on master once Go 1.15 is released.

@gopherbot, please backport to Go 1.14 and Go 1.15. This is serious problem where Windows service does not receives stop / shutdown event.

@zx2c4 would you, please, copy CL 242280 onto release-branch.go1.14, so we can have it available as soon as possible. Or I can do it myself, if you like.

Thank you.

Alex

@gopherbot
Copy link
Contributor

Backport issue(s) opened: #40411 (for 1.14), #40412 (for 1.15).

Remember to create the cherry-pick CL(s) as soon as the patch is submitted to master, according to https://golang.org/wiki/MinorReleases.

@zx2c4
Copy link
Contributor

zx2c4 commented Jul 27, 2020

@zx2c4 would you, please, copy CL 242280 onto release-branch.go1.14, so we can have it available as soon as possible. Or I can do it myself, if you like.

Will do.

@gopherbot
Copy link
Contributor

Change https://golang.org/cl/244957 mentions this issue: [release-branch.go1.14] runtime: detect services in signal handler

@gopherbot
Copy link
Contributor

Change https://golang.org/cl/244958 mentions this issue: [release-branch.go1.14] runtime: detect services in signal handler

@gopherbot
Copy link
Contributor

Change https://golang.org/cl/244959 mentions this issue: [release-branch.go1.15] runtime: detect services in signal handler

@zx2c4
Copy link
Contributor

zx2c4 commented Jul 27, 2020

Sorry for the churn. The two backports are https://golang.org/cl/244958 and https://golang.org/cl/244959.

@dmitshur
Copy link
Contributor

dmitshur commented Aug 1, 2020

Does this issue affect Go 1.13 in addition to 1.14, or was it introduced in 1.14? We need to know this because the minor release policy requires an issue is fixed in all affected supported releases (see #34536).

@alexbrainman
Copy link
Member

Does this issue affect Go 1.13 in addition to 1.14, or was it introduced in 1.14?

Go 1.13 is good. We introduced this bug in Go 1.14. See #40074 (comment) for details.

Alex

@dmitshur dmitshur modified the milestones: Unplanned, Go1.16 Aug 22, 2020
@dmitshur dmitshur added NeedsFix The path to resolution is known, but the work has not been done. and removed NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. labels Aug 22, 2020
@dmitshur
Copy link
Contributor

dmitshur commented Aug 22, 2020

@alexbrainman I see that the fix for this issue intended for Go 1.16 (CL 243597) involves removing the exit(2) case completely, while the backport CLs for 1.15 and 1.14 both add a && !isWindowsService() case and preserve it. The commit message of CL 243597 says:

In this CL, contrary to CL 244959 [1.15 backport CL], we do not need
to special case services with expensive detection algorithms, or rely
on hard-coded library/archive flags.

It says it does that, but it doesn't seem to explain why.

I'm not opposed to this approach, just looking to understand this better. This is a long thread and from reading it, my brief understanding so far is that the Go 1.16 fix is better but may need additional work/testing, while the Go 1.15+1.14 backports are larger code changes but deemed safer/smaller difference in behavior. Or is it something else? Thank you.

Edit: Comments #40167 (comment) and #40167 (comment) confirm my understanding is accurate.

@alexbrainman
Copy link
Member

I'm not opposed to this approach, just looking to understand this better. This is a long thread and from reading it, my brief understanding so far is that the Go 1.16 fix is better but may need additional work/testing, while the Go 1.15+1.14 backports are larger code changes but deemed safer/smaller difference in behavior. Or is it something else? Thank you.

That is exactly correct.

Already submitted fix on current master has not many lines of code, but is riskier - it removes exit(2) and assumes that Windows will do the right thing on its own. And it affects all Go programs. This fix is more "logically correct", but is new, and needs more time to be tested / verified.

Backports to go1.14 and go1.15 have more lines of code, but they only affects Windows service programs (limited set of all Windows programs - these programs are specifically designed to be started when Windows boots, and they have to comply with a specific protocol, because of that). And the change just removes code that was not intended to run in Windows service #40074 (comment)

I hope it explains.

Alex

@dmitshur
Copy link
Contributor

dmitshur commented Aug 22, 2020

Thanks very much. The backport issues have been approved, so it's a matter of getting the cherry pick CLs +2ed, run trybots, and then a release manager can submit them. We are targeting September 1st or so for the next minor releases.

gopherbot pushed a commit that referenced this issue Aug 22, 2020
The service handler needs to handle CTRL+C-like events -- including
those sent by the service manager itself -- using the default Windows
implementation if no signal handler from Go is already listening to
those events. Ordinarily, the signal handler would call exit(2), but we
actually need to allow this to be passed onward to the service handler.
So, we detect if we're in a service and skip calling exit(2) in that
case, just like we do for shared libraries.

Updates #40167.
Updates #40074.
Fixes #40412.

Change-Id: Ia77871737a80e1e94f85b02d26af1fd2f646af96
Reviewed-on: https://go-review.googlesource.com/c/go/+/244959
Run-TryBot: Jason A. Donenfeld <Jason@zx2c4.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Alex Brainman <alex.brainman@gmail.com>
gopherbot pushed a commit that referenced this issue Aug 22, 2020
The service handler needs to handle CTRL+C-like events -- including
those sent by the service manager itself -- using the default Windows
implementation if no signal handler from Go is already listening to
those events. Ordinarily, the signal handler would call exit(2), but we
actually need to allow this to be passed onward to the service handler.
So, we detect if we're in a service and skip calling exit(2) in that
case, just like we do for shared libraries.

Updates #40167.
Updates #40074.
Fixes #40411.

Change-Id: Ia77871737a80e1e94f85b02d26af1fd2f646af96
Reviewed-on: https://go-review.googlesource.com/c/go/+/244958
Run-TryBot: Jason A. Donenfeld <Jason@zx2c4.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Alex Brainman <alex.brainman@gmail.com>
@golang golang locked and limited conversation to collaborators Aug 22, 2021
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
FrozenDueToAge NeedsFix The path to resolution is known, but the work has not been done. OS-Windows
Projects
None yet
Development

No branches or pull requests

8 participants