Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

x/sys/windows/svc/example: does not start #40160

Closed
macklin-10x opened this issue Jul 10, 2020 · 18 comments
Closed

x/sys/windows/svc/example: does not start #40160

macklin-10x opened this issue Jul 10, 2020 · 18 comments
Labels
FrozenDueToAge NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one.
Milestone

Comments

@macklin-10x
Copy link

What version of Go are you using (go version)?

$ go version
go version go1.14.4 windows/amd64

Does this issue reproduce with the latest release?

Yes

What operating system and processor architecture are you using (go env)?

go env Output
$ go env
set GO111MODULE=
set GOARCH=amd64
set GOBIN=
set GOCACHE=C:\Users\chris.macklin\AppData\Local\go-build
set GOENV=C:\Users\chris.macklin\AppData\Roaming\go\env
set GOEXE=.exe
set GOFLAGS=
set GOHOSTARCH=amd64
set GOHOSTOS=windows
set GOINSECURE=
set GONOPROXY=
set GONOSUMDB=
set GOOS=windows
set GOPATH=C:\Users\chris.macklin\go
set GOPRIVATE=
set GOPROXY=https://proxy.golang.org,direct
set GOROOT=c:\go
set GOSUMDB=sum.golang.org
set GOTMPDIR=
set GOTOOLDIR=c:\go\pkg\tool\windows_amd64
set GCCGO=gccgo
set AR=ar
set CC=gcc
set CXX=g++
set CGO_ENABLED=1
set GOMOD=C:\Users\chris.macklin\Downloads\sys\go.mod
set CGO_CFLAGS=-g -O2
set CGO_CPPFLAGS=
set CGO_CXXFLAGS=-g -O2
set CGO_FFLAGS=-g -O2
set CGO_LDFLAGS=-g -O2
set PKG_CONFIG=pkg-config
set GOGCCFLAGS=-m64 -mthreads -fno-caret-diagnostics -Qunused-arguments -fmessage-length=0 -fdebug-prefix-map=C:\Users\CHRIS~1.MAC\AppData\Local\Temp\go-build041286302=/tmp/go-build -gno-record-gcc-switches

What did you do?

What did you expect to see?

From the debug invocation, beeping and logging to the command line.

From the start invocation, the service starts, beeps periodically, and logs "beep" at info level to the event log.

What did you see instead?

The debug invocation runs correctly, logs to the command line, and beeps.

The start invocation returns no error, but the service fails to start.

"starting myservice service" is logged to the Application event log at info level.
"The my service service terminated unexpectedly." is logged to the System event log at error level.

Trying to start the service from the GUI service manager provides the same error.

@gopherbot gopherbot added this to the Unreleased milestone Jul 10, 2020
@dmitshur dmitshur changed the title x/sys/windows/svc/example does not start x/sys/windows/svc/example: does not start Jul 11, 2020
@dmitshur
Copy link
Contributor

/cc @alexbrainman @bradfitz per owners.

@dmitshur dmitshur added the NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. label Jul 11, 2020
@alexbrainman
Copy link
Member

@macklin-10x thank you for reporting this issue, but I cannot reproduce it here.

I am using this commit

commit ddb9806d33aed8dbaac1cd6f1cba58952e87f933 (HEAD, origin/master, origin/HEAD, master)
Author: Tobias Klauser <tklauser@distanz.ch>
Date:   Thu Jun 25 20:26:03 2020 +0200

    unix: add FS_IOC_SETFLAGS on linux

    Change-Id: I455c008a114a017a6ecedb134f9af17d14dd249b
    Reviewed-on: https://go-review.googlesource.com/c/sys/+/240037
    Run-TryBot: Tobias Klauser <tobias.klauser@gmail.com>
    TryBot-Result: Gobot Gobot <gobot@golang.org>
    Reviewed-by: Matt Layher <mdlayher@gmail.com>

with this go version

go version devel +3a43226 Fri Jul 10 11:32:36 2020 +0000 linux/amd64

and I use this version of Windows

image

What do you use?

Thank you.

Alex

@macklin-10x
Copy link
Author

I've reproduced this on a second machine. I'm using the same commit from sys (current tip of master, same commit as you listed). Here are system specs:

Screen Shot 2020-07-11 at 1 01 15 PM

@macklin-10x
Copy link
Author

Well, to make the issue even stranger: if I configure the service to start automatically and reboot the machine, the service starts on boot and runs correctly. If I then stop the service and start it again, it fails to start with the aforementioned error. I'm beginning to think this is some obscure issue with Windows rather than anything in the go source.

@alexbrainman
Copy link
Member

I'm beginning to think this is some obscure issue with Windows rather than anything in the go source.

I am not convinced it is some Windows problem. For example, we have #23479 where people report random problems. It could be bug in Go code.

If you can reproduce the problem, please, try to understand where the problem is.

Unfortunately, it is not easy to debug Go service. The only way I know is by adding source lines to log into Event Log. I doubt you could use debugger, because service is meant to act on events within a 30 seconds or something. Also, if service process crashes, I do not know of a away to tell service manager about it, so you would have to reboot computer.

There is also #40074 - but that bug is about shutdown event - which is different from what you describe.

Sorry I could not be more helpfull.

Alex

@macklin-10x
Copy link
Author

Since I've already got the example building, I'll try adding a bunch of debug print statements and see if I can uncover a root cause. It's rather confusing as I already adding at least a top-level panic handler and it isn't catching a panic.

@macklin-10x
Copy link
Author

I logged the execution path all the way down to a call to syscall.Syscall which seems to never return; the call originates in svc.service.run, s.goWaits.Wait()

goroutine 6 [running]:
runtime/debug.Stack(0xc000042208, 0x1, 0x508e99)
	/usr/local/go/src/runtime/debug/stack.go:24 +0xa4
golang.org/x/sys/windows.WaitForSingleObject(0x16c, 0xffffffff, 0xc00008c020, 0x1b, 0x0)
	/Users/chris.macklin/go/github.com/macklin-10x/sys/windows/zsyscall_windows.go:1317 +0x70
golang.org/x/sys/windows/svc.(*event).Wait(0xc000042248, 0x1, 0x5069f8)
	/Users/chris.macklin/go/github.com/macklin-10x/sys/windows/svc/event.go:40 +0xd9
golang.org/x/sys/windows/svc.(*service).run(0xc00005a080)
	/Users/chris.macklin/go/github.com/macklin-10x/sys/windows/svc/service.go:237 +0x83
created by golang.org/x/sys/windows/svc.Run
	/Users/chris.macklin/go/github.com/macklin-10x/sys/windows/svc/service.go:381 +0x392

The arguments passed to syscall.Syscall are
sys.Syscall(%!s(uintptr=140708308983808), %!s(int=2), %!s(uintptr=364), %!s(uintptr=4294967295), %!s(int=0))

Just below the call to svc.service.run in svc.Run, the call to windows.StartServiceCtrlDispatcher(&t[0]) also seems to never return.

I'm not sure if this is because the syscalls are never returning or for some reason the whole process is getting torn down before they have a chance to return. I added panic loggers to both goroutines but neither seem to be panicking.

@alexbrainman
Copy link
Member

Since I've already got the example building, I'll try adding a bunch of debug print statements and see if I can uncover a root cause.

Please, try do that. Maybe we can solve #40074 .

It's rather confusing as I already adding at least a top-level panic handler and it isn't catching a panic.

I am not sure what do you mean by that. But I would not be surprised that panic does not work in some situations - we are too low level. Maybe just add log statements, and log to either file or Windows Event Log.

I logged the execution path all the way down to a call to syscall.Syscall which seems to never return; the call originates in svc.service.run, s.goWaits.Wait()

goroutine 6 [running]:
runtime/debug.Stack(0xc000042208, 0x1, 0x508e99)
	/usr/local/go/src/runtime/debug/stack.go:24 +0xa4
golang.org/x/sys/windows.WaitForSingleObject(0x16c, 0xffffffff, 0xc00008c020, 0x1b, 0x0)
	/Users/chris.macklin/go/github.com/macklin-10x/sys/windows/zsyscall_windows.go:1317 +0x70
golang.org/x/sys/windows/svc.(*event).Wait(0xc000042248, 0x1, 0x5069f8)
	/Users/chris.macklin/go/github.com/macklin-10x/sys/windows/svc/event.go:40 +0xd9
golang.org/x/sys/windows/svc.(*service).run(0xc00005a080)
	/Users/chris.macklin/go/github.com/macklin-10x/sys/windows/svc/service.go:237 +0x83
created by golang.org/x/sys/windows/svc.Run
	/Users/chris.macklin/go/github.com/macklin-10x/sys/windows/svc/service.go:381 +0x392

This is first line in svc.service.run, It waits for s.goWaits to be signalled. s.goWaits is signaled in servicemain in sys_amd64.s. And servicemain is called by Windows via windows.StartServiceCtrlDispatcher. windows.StartServiceCtrlDispatcher is passed servicemain address.

Just below the call to svc.service.run in svc.Run, the call to windows.StartServiceCtrlDispatcher(&t[0]) also seems to never return.

StartServiceCtrlDispatcher is not meant to return. It will return when service ends, or if there is an error configuring service.

Perhaps there is a bug in servicemain in sys_amd64.s. Mind you I am not sure how to debug it. Maybe call some print function from servicemain ?

Perhaps @zx2c4 has some ideas on how to debug this.

Alex

@macklin-10x
Copy link
Author

Any further input on how to debug this? All of our Windows systems reproduce this bug reliably.

@zx2c4
Copy link
Contributor

zx2c4 commented Jul 28, 2020

Any further input on how to debug this? All of our Windows systems reproduce this bug reliably.

I'm waiting to see more apt panics, by following what Alex suggested.

Do the usual recover block, but write to a file or event log, so that you can retrieve a log and a stacktrace.

If that fails, attach a debugger so that crashes at least result in a break and you can inspect the crashing thread.

@macklin-10x
Copy link
Author

I already added blocks that recover a panic, write it to the event log, and repanic at the top of every goroutine invocation, and haven't observed output from any of them.

I don't understand how to attach a debugger to the process started by the service manager.

@alexbrainman
Copy link
Member

I already added blocks that recover a panic, write it to the event log, and repanic at the top of every goroutine invocation, and haven't observed output from any of them.

It just shows that none of your goroutine panics. But you might have code blocked in syscall, your code might crash without panic. I would suggest add logging statement all over you code, starting from the start of your program and continue until you find where your program crashes or blocks or whatever. We normally do this with println or fmt.Prtintf. But, because it is a service, you cannot output to standard output. So make your logging function write to a disk file or to Windows Event log. Perhaps even close file after every write to make sure there are no unsaved changes to be lost, if process crashes or something.

I don't understand how to attach a debugger to the process started by the service manager.

Unfortunately I don't know how to use debuggers. Especially debugger with Windows service program. You can see some instructions of me using WinDbg #36492. But WinDbg will not print Go symbols. For that you need to sue Delve debugger. Delve provides command line flag to pass PID to attach to a running process.

Perhaps @zx2c4 might give some debugging suggestions. But I would try to log things first. If you can reproduce the problem easily, you should be able to point exact place in the program where things go wrong. Then we might be able to guess where the bug is.

Thank you.

Alex

@zx2c4
Copy link
Contributor

zx2c4 commented Jul 29, 2020

Just attach to the service PID with windbg, and then immediately press "continue". If you find the service is crashing too soon, insert 10 second sleep between the process startup and the call to the start the service control. During those 10 seconds, attach windbg to the PID.

@adam-azarchs
Copy link
Contributor

I've been helping @macklin-10x with debugging this on the test machine, and my current best guess about what is going on is that this is a bad interaction between the go runtime and something being injected into the process by some malware monitoring software. I probably won't have a chance to work on this more today but will update this further if I can get more detail on what's going on there. Obviously, from a certain perspective, if AV is breaking things by forcibly injecting its own threads into the process that isn't exactly Go's fault, but on the other hand there may be something simple that can be done to mitigate the issue, given that AV of some kind is standard practice on Windows.

@gopherbot
Copy link
Contributor

Change https://golang.org/cl/246317 mentions this issue: windows/svc: align stack in servicemain before calling syscall

@alexbrainman
Copy link
Member

@macklin-10x and @adam-azarchs can you, please, try this change

https://golang.org/cl/246317

Does it help with your problem? Thank you.

Alex

@macklin-10x
Copy link
Author

macklin-10x commented Aug 1, 2020

With this patch the example service runs correctly without bypassing our antivirus software!

@alexbrainman
Copy link
Member

With this patch the example service runs correctly without bypassing our antivirus software!

Thanks for checking. I just submitted the change.

Alex

@golang golang locked and limited conversation to collaborators Aug 2, 2021
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
FrozenDueToAge NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one.
Projects
None yet
Development

No branches or pull requests

6 participants