Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

QueryContext blocks when the network is down #160

Open
tiwo opened this issue Oct 14, 2023 · 5 comments
Open

QueryContext blocks when the network is down #160

tiwo opened this issue Oct 14, 2023 · 5 comments
Labels
Area - connections Issues affecting connection security or strings

Comments

@tiwo
Copy link

tiwo commented Oct 14, 2023

QueryContext blocks when the network is down

I want to deal with sometimes slow or unreliable network connections, and use QueryContext to limit query time.

After a connection is established (by Ping() or previous queries), and then fails (in my experiments I disconnect the VPN connection to the server), the next QueryContext hangs indefinitely even after the context is cancelled.

To reproduce, see https://github.com/tiwo/2023-gomssqldb-issue-with-timeout

% CONNECTIONSTRING='odbc:DRIVER={ODBC Driver 17 for SQL Server};SERVER={10.0.11.13};PORT={11433};DATABASE={ABC};USER ID={johndoe};PASSWORD={secret password};ApplicationIntent=ReadOnly;' \
go run main.go
03:10:39 main.go:68: main()
03:10:39 main.go:31: query_with_timeout("SELECT 'first';")
03:10:41 main.go:62: query_with_timeout returning "first"
03:10:41 main.go:91: to reproduce, disconnect from your network now
03:10:46 main.go:31: query_with_timeout("SELECT 'second';")
03:10:58 main.go:38: Cancelling query... state of ctx is: context.deadlineExceededError{})
03:11:29 main.go:84: Closing database
^C03:17:57 main.go:75: Received SIGINT; exiting.

Tested with Go 1.21.3, go-mssqldb 1.6.0.
For what it's worth, I am setting db.SetConnMaxLifetime which should be unrelated, and I am making sure the context really is cancelled.

Is the context even used in Stmt.sendQuery, except for logging?

@dlevy-msft dlevy-msft added the Area - connections Issues affecting connection security or strings label Oct 19, 2023
@shueybubbles
Copy link
Collaborator

Is this the same problem as #91?
PR #109 attempts to fix that but is failing all the tests.

@tiwo
Copy link
Author

tiwo commented Oct 23, 2023

no, not quite the same. I think #91 actually describes expected behaviour, not a bug. After all, Ping() and Query() must block as long as the "pong" is yet to be seen and the TCP connection is still alive (which is usually well after the network cable is disconnected). The solution to issue #91 is "replace Ping by PingContext to avoid the halting problem", I think.

But for PingContext and QueryContext it is different, I would expect for them to abort once the context is done, "no matter what". They shouldn't wait for a slow network. It is the same loop that blocks though. So this present issue (and the identical denisenkom#527) needs some unblocking.

I'm a bit out of my depth. My best lead is this: readCancelConfirmation would need to (receive as a parameter and) consider the context, and stop blocking there. Maybe the connection could just be discarded and closed. Or the incoming tokens up until the cancellation confirmation would have to be consumed in the background.

@tiwo
Copy link
Author

tiwo commented Oct 23, 2023

Completely untested code below. This acts on

  • the context being cancelled,
  • new tokens coming in (anything up until a doneStructis discarded),
  • or the token channel being closed (after a network error)
  • and blocks while neither of the three occurs.
// consume tokens until we get cancellation confirmation, or until context is done
func readCancelConfirmation(ctx context.Context, tokChan chan tokenStruct) bool {
	for {
		select {

		case <-ctx.Done():
			// this should in effect discard the connection. Debatable.
			return false

		case tok, alive := <-tokChan:

			if !alive {
				// tokChan is closed. 
				return false
			}

			confirmToken, isConfirmation := tok.(*doneStruct)
			if isConfirmation && (confirmToken.Status&doneAttn != 0) {
				// got cancellation confirmation, the connection is usable again
				return true
			}
		}
	}
}

Actually it is debatable if QueryContext may discard the connection just because the context is cancelled. It seems to depend on the semantics of "cancelling". For example, in an interactive environment like sqlcmd, the user could say "I see, the query takes too long. Abort this operation, but let me resume with the same connection." They'd have to wait then.

@navigator-dev
Copy link

Is there any update on this issue? Is there a current workaround, like moving all queries into goroutines? That's a bit tedious and I'm unsure if doing so would mean we get memory leaks.

@vasily-kirichenko
Copy link

@tiwo

I'm afraid we should cast to doneStruct, not to the pointer to it:

confirmToken, isConfirmation := tok.(doneStruct)

This is because the original function is:

func readCancelConfirmation(tokChan chan tokenStruct) bool {
	for tok := range tokChan {
		switch tok := tok.(type) {
		default:
		// just skip token
		case doneStruct: // the struct type, not the pointer to it
			if tok.Status&doneAttn != 0 {
				// got cancellation confirmation, exit
				return true
			}
		}
	}
	return false
}

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Area - connections Issues affecting connection security or strings
Projects
None yet
Development

No branches or pull requests

5 participants