Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Bug Fix: Prepared Statements Bind Variables Nil Error #5441

Merged
Merged
Show file tree
Hide file tree
Changes from 4 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
7 changes: 2 additions & 5 deletions go/mysql/conn.go
Original file line number Diff line number Diff line change
Expand Up @@ -897,12 +897,9 @@ func (c *Conn) handleNextCommand(handler Handler) error {

if stmtID != uint32(0) {
defer func() {
// Allocate a new bindvar map every time since VTGate.Execute() mutates it.
prepare := c.PrepareData[stmtID]
if prepare.BindVars != nil {
for k := range prepare.BindVars {
prepare.BindVars[k] = nil
}
}
Comment on lines -901 to -905
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Though I agree that this is a cleaner and safer way of resetting PrepareData, I'm concerned that the fix might mask a deeper problem. Is it possible that there is a mismatch between ParamsCount and BindVars? Maybe we are being told to expect a certain number of bind variables (ParamsCount) but receiving a fewer number.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok, so should we mask and report errors or something? or would rather have things immediately fail?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@deepthi That's a good point. Maybe the problem is that we are reusing stmtID values for different statements?

@PrismaPhonic Were you able to reproduce the problem with the reduced test case?

Copy link
Member

@enisoc enisoc Nov 18, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, although this is cleaner and safer than setting everything to nil, there is a performance cost to that. It was probably written this way to avoid allocating a new map on every query. This "fix" will put more load on the GC, which is something we shouldn't do in the hot path unless there's a very good reason.

The assumption written into this optimization seems to be that for a given stmtID, the set of bind var names will never change. Does anyone know of a valid situation in which that assumption is violated? If there is no such situation, then @deepthi is right that there is another bug here that we shouldn't cover up.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@deepthi will as far as I know we set the number of parameters when we check the :v prefix and send it back to the client. Therefore, if there was a mismatch I think the client might catch that.

@enisoc I dont know of a situation that will change the query and number of parameters for a given sstmtID.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK this finally makes sense. The clue about vtg1 in #5441 (comment) was the key to finding the root cause.

The vtg1 entry in the map is getting added by the normalizer after we pass the map to VTGate.Execute(). The way to reproduce this is to prepare a query that also has some raw values, such as the 'blah' in select foo from bar where name in (?,'blah'). The normalizer will replace the 'blah' with :vtg1 and add vtg1 to the bindvars map with value 'blah'.

After executing, we set all bindvar map values to nil without deleting them, which would have been fine if the normalizer had not mutated our map in-place to add vtg1. Instead, the next time we call VTGate.Execute(), we pass a bindvars map with vtg1 already present (and set to nil). The normalizer sees this and thinks you have a pre-existing bindvar called vtg1 so when it needs to generate a bindvar name for 'blah', it increments to vtg2 to avoid a collision. Now vtg1 remains set in the map with value nil.

The fix for that is what this PR is already doing: make a new map every time. The only suggestion I have then is to add a comment explaining why this is necessary:

// Allocate a new bindvar map every time since VTGate.Execute() mutates it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added the comment with 4411cda.

prepare.BindVars = make(map[string]*querypb.BindVariable, prepare.ParamsCount)
}()
}

Expand Down
11 changes: 11 additions & 0 deletions go/vt/vtgate/logstats.go
Original file line number Diff line number Diff line change
Expand Up @@ -27,8 +27,10 @@ import (

"vitess.io/vitess/go/sqltypes"
"vitess.io/vitess/go/streamlog"
"vitess.io/vitess/go/tb"
"vitess.io/vitess/go/vt/callerid"
"vitess.io/vitess/go/vt/callinfo"
"vitess.io/vitess/go/vt/log"

querypb "vitess.io/vitess/go/vt/proto/query"
)
Expand Down Expand Up @@ -125,6 +127,15 @@ func (stats *LogStats) Logf(w io.Writer, params url.Values) error {
return nil
}

// FormatBindVariables call might panic so we're going to catch it here and print out the stack trace to help us
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Doesn't an uncaught panic already print a stack trace? Or is the problem that someone else has been recovering the panic somewhere else in the call stack?

Copy link
Contributor Author

@PrismaPhonic PrismaPhonic Nov 19, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're right that an uncaught panic already prints a stack trace. @sougou had asked me offline to use recover() to handle the place where the panic was occurring as an extra safety net. I was hoping he would comment here on the PR to clarify the full extent of what he wanted to achieve with the recover(). I've yet to have an opportunity to sync up with him about this part. Perhaps he meant without the explicit panic. @sougou?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I heard that @sougou can be found at the PlanetScale booth at KubeCon. You should try to find him there. :)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Chatted with him. He meant without an explicit panic and deepthi clarified that vtgate by design should never panic. Correcting now

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Adjusted c4270d2. PTAL

// debug it better.
defer func() {
if x := recover(); x != nil {
log.Errorf("Uncaught panic:\n%v\n%s", x, tb.Stack(4))
log.Fatal()
}
}()

formattedBindVars := "\"[REDACTED]\""
if !*streamlog.RedactDebugUIQueries {
_, fullBindParams := params["full"]
Expand Down