sql: prepared statements cache parsed query #15639

jordanlewis · 2017-05-03T14:17:39Z

Previously, prepared statements contained only the text of their query, causing them to be re-parsed at execution time.

Now, they contain a parsed query.

This improves throughput of kv -read-percent=100 -concurrency=1 by about 15% on a local single-node cluster.

Updates #14927.

cockroach-teamcity · 2017-05-03T14:17:47Z

This change is

petermattis · 2017-05-03T14:20:39Z

Nice! Tagging with 1.1 as it is too late for this to go into 1.0. I didn't look at the change closely, but wanted to point out #14932 which also tackles this (though unsuccessfully).

tamird · 2017-05-03T14:22:02Z

Wow, that's an impressive improvement!

Can you help me understand why the first commit is necessary? Wouldn't it be valid to cache the parsed result with the placeholders substituted out?

Reviewed 3 of 3 files at r1.
Review status: 3 of 6 files reviewed at latest revision, all discussions resolved.

Comments from Reviewable

knz · 2017-05-03T14:28:46Z

Nice work indeed.

The first commit is necessary in that Eval should not panic on placeholders. But the code can be simplified, see my comments below.

Reviewed 3 of 3 files at r1, 3 of 3 files at r2.
Review status: all files reviewed at latest revision, 7 unresolved discussions.

pkg/sql/executor.go, line 507 at r2 (raw file):

// ExecuteStatementsParsed executes the given statement and returns a response.
func (e *Executor) ExecuteStatementParsed(

ExecutePreparedStatement is closer to the truth
since this is only executing prepared statements, what about giving the prepared statement key as argument and move the prepared statement lookup to here.

pkg/sql/executor.go, line 513 at r2 (raw file):

	session.phaseTimes[sessionStartBatch] = timeutil.Now()

	defer func() {

Make this deferred function a common helper. Possibly take inspiration from #15358 (still planning to merge that one eventually)

pkg/sql/parser/eval.go, line 2818 at r1 (raw file):

// Eval implements the TypedExpr interface.
func (node *Placeholder) Eval(ctx *EvalContext) (Datum, error) {
	return node.Value.Eval(ctx)

return ctx.Placeholders.Value(node.Name)

pkg/sql/parser/expr.go, line 616 at r1 (raw file):

type Placeholder struct {
	Name  string
	Value TypedExpr

remove this once Eval uses ctx to retrieve the value.

pkg/sql/parser/expr.go, line 633 at r1 (raw file):

	buf.WriteByte('$')
	buf.WriteString(node.Name)
	if node.Value != nil {

Not a fan of this. I don't see any benefits.

pkg/sql/parser/type_check.go, line 738 at r1 (raw file):

	// If there is a value known already, immediately substitute with it.
	if v, ok := ctx.Placeholders.Value(expr.Name); ok {
		expr.Value = v

See above.

pkg/sql/pgwire/v3.go, line 596 at r2 (raw file):

// type should return NoData.
func canSendNoData(stmt parser.Statement) bool {
	return stmt == nil || stmt.StatementType() != parser.Rows

Please revise this and above -- unfortunately Go also has empty but non-nil arrays. So you could have a parser.Statement reference that is non-nill but still should return canSendNoData true.

I think we should really have a StatementList reference in PreparedStatement, not Statement, and use that here too with len(stmt) == 0 instead of stmt == nil.

Comments from Reviewable

knz · 2017-05-03T14:29:39Z

@tamird in SQL you can execute a prepared statement multiple times with different placeholder values.

Review status: all files reviewed at latest revision, 7 unresolved discussions.

Comments from Reviewable

jordanlewis · 2017-05-03T14:29:49Z

Whoops, thanks for the pointer @petermattis as I totally missed your attempt.

@tamird without that patch, the following sequence of commands would return 3 both times, since executing a statement with placeholder values available actually modifies the parsed query:

PREPARE q AS SELECT $1;
EXECUTE q(3);
EXECUTE q(4);

Review status: all files reviewed at latest revision, 7 unresolved discussions.

Comments from Reviewable

tamird · 2017-05-03T14:30:55Z

Ah, of course. Thanks @knz @jordanlewis!

Reviewed 1 of 3 files at r2.
Review status: all files reviewed at latest revision, 7 unresolved discussions.

Comments from Reviewable

jordanlewis · 2017-05-03T19:58:17Z

TFTR @knz!

Review status: all files reviewed at latest revision, 7 unresolved discussions.

pkg/sql/executor.go, line 507 at r2 (raw file):

Previously, knz (kena) wrote…

ExecutePreparedStatement is closer to the truth

since this is only executing prepared statements, what about giving the prepared statement key as argument and move the prepared statement lookup to here.

Renamed. I think giving the key isn't quite right, because each of the calling methods do different things with the prepared statement.

pkg/sql/executor.go, line 513 at r2 (raw file):

Previously, knz (kena) wrote…

Make this deferred function a common helper. Possibly take inspiration from #15358 (still planning to merge that one eventually)

Done.

pkg/sql/parser/eval.go, line 2818 at r1 (raw file):

Previously, knz (kena) wrote…

return ctx.Placeholders.Value(node.Name)

I think the Eval is necessary because placeholders can contain Expressions now and not only Datums.

Unfortunately, this is also the wrong kind of ctx... the SemaContext has the placeholders, not the EvalContext. I don't think there's a way to get from one to another right now... thoughts?

I agree that your solution is cleaner - there's no point in mutating the expression tree to include the placeholder value.

pkg/sql/parser/expr.go, line 633 at r1 (raw file):

Previously, knz (kena) wrote…

Not a fan of this. I don't see any benefits.

Removed.

pkg/sql/pgwire/v3.go, line 596 at r2 (raw file):

Previously, knz (kena) wrote…

Please revise this and above -- unfortunately Go also has empty but non-nil arrays. So you could have a parser.Statement reference that is non-nill but still should return canSendNoData true.

I think we should really have a StatementList reference in PreparedStatement, not Statement, and use that here too with len(stmt) == 0 instead of stmt == nil.

I think it's misleading to have a StatementList within PreparedStatement, though, because it should only have size 1 maximum. And I can't find a case where stmt here is non-nil but also should return no data - when we create a PreparedStatement in the first place, if there's no statements to process, we return a PreparedStatement with a nil Statement inside... please show me what I'm missing?

Comments from Reviewable

knz · 2017-05-03T21:06:22Z

Reviewed 1 of 4 files at r3, 3 of 3 files at r4.
Review status: all files reviewed at latest revision, 4 unresolved discussions, some commit checks failed.

pkg/sql/parser/eval.go, line 2818 at r1 (raw file):

Previously, jordanlewis (Jordan Lewis) wrote…

I think the Eval is necessary because placeholders can contain Expressions now and not only Datums.

Unfortunately, this is also the wrong kind of ctx... the SemaContext has the placeholders, not the EvalContext. I don't think there's a way to get from one to another right now... thoughts?

I agree that your solution is cleaner - there's no point in mutating the expression tree to include the placeholder value.

so indeed return ctx.xxx.Get(node.Name).Eval(ctx)
The placeholder values need to be plumbed through the EvalContext, indeed.

Meanwhile, whether you make this change or not, you will also need to check that placeholder values are properly propagated to distsql processors. I suspect they are not, until you extend the distsql protobufs.

pkg/sql/pgwire/v3.go, line 596 at r2 (raw file):

Previously, jordanlewis (Jordan Lewis) wrote…

I think it's misleading to have a StatementList within PreparedStatement, though, because it should only have size 1 maximum. And I can't find a case where stmt here is non-nil but also should return no data - when we create a PreparedStatement in the first place, if there's no statements to process, we return a PreparedStatement with a nil Statement inside... please show me what I'm missing?

Ok, I can see how the current code limits this. I'll let it sleep until we start supporting stored procedures.

Comments from Reviewable

jordanlewis · 2017-05-05T21:48:11Z

Tests should pass now, and DistSQL should work correctly.

I discussed the best approach for marshalling placeholders over DistSQL with @RaduBerinde offline, and we came to the conclusion that it's easiest to replace placeholder values at serialization time rather than trying to send the placeholders to all relevant processors. That's what I've done here.

Review status: 0 of 9 files reviewed at latest revision, 3 unresolved discussions, some commit checks pending.

pkg/sql/parser/eval.go, line 2818 at r1 (raw file):

Previously, knz (kena) wrote…

so indeed return ctx.xxx.Get(node.Name).Eval(ctx)
The placeholder values need to be plumbed through the EvalContext, indeed.

Meanwhile, whether you make this change or not, you will also need to check that placeholder values are properly propagated to distsql processors. I suspect they are not, until you extend the distsql protobufs.

I'm afraid that doing this correctly will be quite onerous, as many more values will need to be plumbed deep within DistSQL in all sorts of unexpected places (every time MakeExpression is called), so for the time being I've left the value expression within the Placeholder node.

I agree that it's a little messy to store the value on the Placeholder, not to mention dangerous since it is not cleared after execution, which exposes us to seeing old placeholder values if there are bugs in our code.

Comments from Reviewable

jordanlewis · 2017-05-05T21:50:58Z

Whoops, I spoke too soon about the tests.

andreimatei · 2017-05-05T22:00:09Z

Review status: 0 of 9 files reviewed at latest revision, 3 unresolved discussions, some commit checks failed.

pkg/sql/parser/eval.go, line 2818 at r1 (raw file):

Previously, jordanlewis (Jordan Lewis) wrote…

I'm afraid that doing this correctly will be quite onerous, as many more values will need to be plumbed deep within DistSQL in all sorts of unexpected places (every time MakeExpression is called), so for the time being I've left the value expression within the Placeholder node.

I agree that it's a little messy to store the value on the Placeholder, not to mention dangerous since it is not cleared after execution, which exposes us to seeing old placeholder values if there are bugs in our code.

given what you said in the top-level discussion about how DistSQL will not need to deal with placeholders, can we now get rid of Placeholder.Value, and do things along the lines Rafa's suggesting?

Better yet, I think it'd be nice if we had a pass (a Visitor) that does the substitution like the TypeChecking used to. Since supposedly we already have something like that for the distsql rendering? Then we wouldn't need to complicate any EvalContexts.
Does that make sense?

Comments from Reviewable

jordanlewis · 2017-05-06T15:14:05Z

Review status: 0 of 9 files reviewed at latest revision, 3 unresolved discussions, some commit checks failed.

pkg/sql/parser/eval.go, line 2818 at r1 (raw file):

Previously, andreimatei (Andrei Matei) wrote…

given what you said in the top-level discussion about how DistSQL will not need to deal with placeholders, can we now get rid of Placeholder.Value, and do things along the lines Rafa's suggesting?

Better yet, I think it'd be nice if we had a pass (a Visitor) that does the substitution like the TypeChecking used to. Since supposedly we already have something like that for the distsql rendering? Then we wouldn't need to complicate any EvalContexts.
Does that make sense?

We can't naively get rid of Placeholder.Value without plumbing the placeholders map into everywhere MakeExpression is called, which is what I mentioned above

Your suggestion to use a Visitor is another possible solution but does introduce some questions. If we were to use a Visitor, which is basically what the old code did to replace Placeholders with their values, we'd again break the assumption that the parse tree stays unmodified after its initial creation. Alternatively, we could make a deep copy of the parse tree right before executing a prepared statement. Perhaps that is the correct solution here?

Comments from Reviewable

petermattis · 2017-05-06T15:38:16Z

Review status: 0 of 9 files reviewed at latest revision, 3 unresolved discussions, some commit checks failed.

pkg/sql/parser/eval.go, line 2818 at r1 (raw file):

Previously, jordanlewis (Jordan Lewis) wrote…

We can't naively get rid of Placeholder.Value without plumbing the placeholders map into everywhere MakeExpression is called, which is what I mentioned above

Your suggestion to use a Visitor is another possible solution but does introduce some questions. If we were to use a Visitor, which is basically what the old code did to replace Placeholders with their values, we'd again break the assumption that the parse tree stays unmodified after its initial creation. Alternatively, we could make a deep copy of the parse tree right before executing a prepared statement. Perhaps that is the correct solution here?

FYI, I experimented with making a deep copy of the parse tree (using reflection to make the copy). That was a non-starter from a performance perspective. Might want to verify that result, and not using reflection could change the numbers.

Comments from Reviewable

RaduBerinde · 2017-05-06T17:09:21Z

Review status: 0 of 9 files reviewed at latest revision, 3 unresolved discussions, some commit checks failed.

pkg/sql/parser/eval.go, line 2818 at r1 (raw file):

Previously, petermattis (Peter Mattis) wrote…

FYI, I experimented with making a deep copy of the parse tree (using reflection to make the copy). That was a non-starter from a performance perspective. Might want to verify that result, and not using reflection could change the numbers.

@jordanlewis Visitors don't normally change the expression in-place. When you return a new expression fromVisitPre/VisitPost, the parent nodes are copied as necessary so that the old tree remains valid. See WalkExpr.

I implemented this "immutable nodes model" as an alternative to implementing deep copy (which was even more tedious to do). There are various things that do mutate nodes in-place so we're not strictly adhering to the model, but those changes are (so far) harmless.

Comments from Reviewable

jordanlewis · 2017-05-06T21:28:57Z

Ah I see. Thanks. I was going to do a comparison between the mutable method I'm using currently and a new WalkExpr implementation, when I noticed that this patch now tanks the performance vs master.

I think the main change between now and when I first checked the performance is that DistSQL is always on, correct? I'll need to figure out why this is making things worse now.

RaduBerinde · 2017-05-06T21:39:27Z

The simple query done by kv shouldn't be running through DistSQL in auto mode. We even have a test for that (explain_distsql:25).

jordanlewis · 2017-05-07T02:55:31Z

I see. The trouble was that index selection expects placeholders to have been completely replaced by Datum values. That test is unfortunately insufficient because it doesn't try to use prepared queries, and we can't write an EXPLAIN test against prepared queries yet. I filed #15743 to add support for that.

I've redone this patch to use a visitor to replace the placeholders right before type checking. This is considerably simpler and doesn't require messing with the expression serialization code. PTAL.

Here are the new performance numbers. I don't see the 15% speedup anymore, now it's more like 5%. This is at least partially caused by the extra copies introduced by the visitor, but there's also considerable variance when testing this stuff locally.

Before:

[21:56]% ./kv -read-percent=100 -concurrency=1 -duration=10s
_elapsed___errors__ops/sec(inst)___ops/sec(cum)__p95(ms)__p99(ms)_pMax(ms)
      1s        0         2790.2         2790.1      0.5      1.4      3.3
      2s        0         3121.9         2955.6      0.4      0.7      4.5
      3s        0         3105.3         3005.5      0.4      0.7      4.1
      4s        0         3030.1         3011.6      0.5      0.7      2.8
      5s        0         3117.8         3032.9      0.4      0.7      3.1
      6s        0         3122.4         3047.8      0.4      0.7      3.5
      7s        0         3124.7         3058.8      0.4      0.7      4.2
      8s        0         3035.6         3055.9      0.4      0.7      3.7
      9s        0         3049.5         3055.2      0.4      0.6      5.8
     10s        0         3101.3         3059.8      0.4      0.6      5.0

_elapsed___errors____________ops___ops/sec(cum)__p95(ms)__p99(ms)_pMax(ms)_____seq(begin/end)
   10.0s        0          30608         3059.8      0.4      0.7      5.8              0 / 0

BenchmarkBlocks	   30608	    326824.5 ns/op

After:

[21:43:130]% ./kv -read-percent=100 -concurrency=1 -duration=10s
_elapsed___errors__ops/sec(inst)___ops/sec(cum)__p95(ms)__p99(ms)_pMax(ms)
      1s        0         3016.9         3016.8      0.5      0.7      3.4
      2s        0         3333.4         3174.8      0.4      0.6      2.4
      3s        0         3298.2         3215.9      0.4      0.6      3.1
      4s        0         3278.9         3231.7      0.4      0.6      2.6
      5s        0         3276.6         3240.6      0.4      0.6      3.5
      6s        0         3259.5         3243.8      0.4      0.7      3.3
      7s        0         3297.6         3251.5      0.4      0.7      2.9
      8s        0         3230.3         3248.8      0.4      0.6      3.9
      9s        0         3113.6         3233.8      0.5      0.7      5.5
     10s        0         3300.2         3240.4      0.4      0.6      3.8

_elapsed___errors____________ops___ops/sec(cum)__p95(ms)__p99(ms)_pMax(ms)_____seq(begin/end)
   10.0s        0          32419         3240.5      0.4      0.7      5.5              0 / 0

BenchmarkBlocks	   32419	    308592.1 ns/op

knz · 2017-05-07T20:03:08Z

Your solution albeit functionally correct is ... potentially multiplying the number of allocations by the size of the ASTs.

I still think you can do better. Like this:

do like you did in one of your earlier attempts: have TypeCheck keep the placeholders, and have Eval look the value up from the context
use a callback in FmtParsable, used by distsql to "serialize" expressions, to replace the placeholders by their value datums during the AST to string conversion. This way the distsql processors don't see any placeholder any more.

This saves you all the allocations and many tree traversals.

Reviewed 6 of 10 files at r5, 5 of 5 files at r6, 1 of 1 files at r7, 3 of 3 files at r8.
Review status: all files reviewed at latest revision, 4 unresolved discussions, all commit checks successful.

pkg/sql/prepare.go, line 101 at r8 (raw file):

	}

	sz := int64(uintptr(len(name)) + unsafe.Sizeof(*stmt))

This change is incorrect and removing the comment is misleading.
The correct code here should measure the memory usage of the AST that's being retained. I understand you can't or don't want to do this here, but an outrageously large query should show up in the memory monitoring still. At least you could keep the length of the query string as an estimator of the memory usage -- the reasoning being that the ASCII encoding in SQL is not a very large constant factor away from the AST structural representation.

Comments from Reviewable

Previously, placeholder nodes were completely removed during type-checking when a value was available for them and replaced by their value. This mutated the expression tree and made it unsuitable for re-use. Now, the expression tree is "modified" using a visitor, which ensures that the original copy of the expression tree is left unchanged. This new behavior preserves the original parsed expression tree even after evaluation, which sets the stage for caching parse results.

jordanlewis · 2017-05-09T00:13:07Z

As discussed offline, the approach you suggest above is what I tried in the previous iteration of this PR (mod keeping the placeholder values somewhere besides the Placeholder nodes), and I abandoned it due to the additional complexity of properly teaching the index selection and analyze code about the possible existence of Placeholders.

I am planning to do that work in a separate PR, since it's a bit beyond the scope of this one. I've opened #15792 to track the remaining work.

Review status: all files reviewed at latest revision, 4 unresolved discussions, some commit checks failed.

pkg/sql/prepare.go, line 101 at r8 (raw file):

Previously, knz (kena) wrote…

This change is incorrect and removing the comment is misleading.
The correct code here should measure the memory usage of the AST that's being retained. I understand you can't or don't want to do this here, but an outrageously large query should show up in the memory monitoring still. At least you could keep the length of the query string as an estimator of the memory usage -- the reasoning being that the ASCII encoding in SQL is not a very large constant factor away from the AST structural representation.

Good point. I've restored the old behavior.

Comments from Reviewable

Previously, internal users that wanted to use a parser to parse a sql string containing more than one statement needed to declare a `parser.Parser` and call its `Parse` method. Now, `parser.Parse` is exported which does the exact same thing.

Previously, prepared statements contained only the text of their query, causing them to be re-parsed at execution time. Now, they contain a parsed query. This improves throughput of kv -read-percent=100 -concurrency=1 by about 15% on a local single-node cluster.

knz · 2017-05-09T11:52:27Z

Reviewed 4 of 8 files at r9, 1 of 1 files at r10, 3 of 4 files at r11, 3 of 3 files at r12.
Review status: all files reviewed at latest revision, 3 unresolved discussions, all commit checks successful.

Comments from Reviewable

Some merge skew with cockroachdb#15639 caused the original fix to not really work. Array types are now permitted in `CastTargetToDatumType`, so we need to validate the result of its output in `CREATE VIEW` before passing it onward. This validation code was already there, but unfortunately occurred slightly too late to prevent issues. The validation now occurs soon enough to prevent trouble.

andreimatei · 2017-05-09T19:48:56Z

Some post-merge comments. Sorry for the late review. Nothing major though.

One high-level thing I'd like to understand: does this change make us keep more state in a session than we were before for statements that are not necessarily going to be repeated? I know some drivers prepare all the statements before executing them. Are we going to keep state in that case (and if yes, are we now gonna keep more state)? Or do we somehow get an indication (maybe through the name) about what's really going to be reused and what's prepared for other reasons?

Review status: all files reviewed at latest revision, 19 unresolved discussions, all commit checks successful.

pkg/sql/executor.go, line 404 at r12 (raw file):

// nil if there are no results).
func (e *Executor) Prepare(
	stmts parser.StatementList, session *Session, pinfo parser.PlaceholderTypes,

I spent a while scratching my head about whether this is supposed to be a single statement or it can also be multiple ones until I saw the switch len(stmts) below.
Mind documenting that it needs to be a single statement?
Better yet, I think it'd be better if this took parser.Statement and we'd move error checking to the callers.

pkg/sql/executor.go, line 482 at r12 (raw file):

}

func logIfPanicking(ctx context.Context, sql string) {

nit: comment that this function needs to be directly deferred. Like, it can't be called from an anonymous deferred function.

pkg/sql/executor.go, line 508 at r12 (raw file):

	session *Session, stmt *PreparedStatement, pinfo *parser.PlaceholderInfo,
) StatementResults {
	session.resetForBatch(e)

are you sure this resetForBatch() here is correct? I don't know exactly what the consequences of it are, but the name certainly indicates that it should be done per batch of statements.
It's also called from Executor.Prepare, which I'm guessing is also wrong now that we support PREPARE as a sql statement, and not just as a wire command that always runs outside of batches.

pkg/sql/executor.go, line 509 at r12 (raw file):

) StatementResults {
	session.resetForBatch(e)
	session.phaseTimes[sessionStartBatch] = timeutil.Now()

same for this phaseTimes - is the "startBatch" correct?

pkg/sql/prepare.go, line 31 at r12 (raw file):

// of arguments and results have been determined.
type PreparedStatement struct {
	// Statement is the parsed, prepared SQL statement. It may be nil if the

Mind explaining (or just telling me) what an "empty" prepared statement represents? What does it return?

pkg/sql/prepare.go, line 73 at r12 (raw file):

// in inferring placeholder types.
//
// ps.session.Ctx() is used as the logging context for the prepare operation.

I don't think "logging context" is a phrase we should be using. Contexts help with more than just logging.

Feel free to ignore this, but I don't think this comment is necessary. I suspect there's no real need for the PreparedStatements type to hold a reference to the session; it could probably be passed in everywhere. And then you probably wouldn't have felt the need to write it, as it's more common knowledge that sessions have a context (or at least something more magic that provides a context) for all their operations. Or if we keep the session ptr, I think a comment on the PreparedStatements type saying that an instance is a part of a Session would be better,

pkg/sql/prepare.go, line 74 at r12 (raw file):

//
// ps.session.Ctx() is used as the logging context for the prepare operation.
func (ps PreparedStatements) NewFromString(

Is this new method (which also comes with a duplicate comment) really necessary? Can't the caller do the parsing and call the other one?

pkg/sql/prepare.go, line 95 at r12 (raw file):

	name string,
	stmts parser.StatementList,
	queryLen int,

the queryLen argument seems very odd to me. We moved to a nice structured type for the statement (really, StatementList, but as I was saying above I think that's a mistake), but we pass this length separately. And also the query string length is pretty silly IMHO for the memory mgmt purposes for which it's used. I suggest adding a Size() method to a parser.Statement which takes into account more than just the query length, and getting rid of this argument.

pkg/sql/parser/placeholders.go, line 168 at r9 (raw file):

	switch t := expr.(type) {
	case *Placeholder:
		if e, ok := v.placeholders.Value(t.Name); ok {

what if a value is not found? Should that be an error?
This probably deserves to be documented on the placeholderVisitors type definition.

pkg/sql/parser/placeholders.go, line 169 at r9 (raw file):

	case *Placeholder:
		if e, ok := v.placeholders.Value(t.Name); ok {
			return true, e

do you need to recurse (true) here?

pkg/sql/parser/placeholders.go, line 178 at r9 (raw file):

// replacePlaceholders replaces all placeholders in the input expression with
// their supplied values in the planner's Placeholders map.

The comment says something about a planner but I don't see one here.

pkg/sql/parser/placeholders.go, line 180 at r9 (raw file):

// their supplied values in the planner's Placeholders map.
func replacePlaceholders(expr Expr, ctx *SemaContext) Expr {
	if ctx == nil {

if we really need to accept nil ctx, please document what that means.

pkg/sql/parser/placeholders.go, line 183 at r9 (raw file):

		return expr
	}
	v := &placeholdersVisitor{}

why do you need two lines to initialize a placeholderVisitor?

pkg/sql/parser/type_check.go, line 736 at r9 (raw file):

// TypeCheck implements the Expr interface.
func (expr *Placeholder) TypeCheck(ctx *SemaContext, desired Type) (TypedExpr, error) {
	// Perform placeholder typing. This only happens during Prepare, when there

this comment is unclear to me. What only happens during prepare? Is this function only called during prepare?

Perhaps say that there shouldn't be any Placeholders after prepare. Could we also assert that through the SemaContext somehow?

pkg/sql/pgwire/v3.go, line 826 at r12 (raw file):

}

func (c *v3Conn) executeStatementParsed(

the name says "parsed" but this seems to be about "prepared" statements

pkg/sql/pgwire/v3.go, line 834 at r12 (raw file):

) error {
	tracing.AnnotateTrace()
	// Note: sql.Executor gets its Context from c.session.context, which

sql.Executor is a singleton; it doesn't "get" a context.

I'd remove this comment. Knowledge about how a session's ctx is used should live elsewhere I think.

Comments from Reviewable

jordanlewis · 2017-05-09T21:36:16Z

Thanks for the comments! I'll follow up on these.

At a high level, no, we don't keep more state in a session than before unless someone is using named prepared statements. The pgwire "extended protocol" does indeed prepare every statement before executing it, but it's prepared as the "default statement", which is the (single) unnamed prepared statement that is present in every session.

I think that the memory overhead here is pretty much equivalent to what it was before. One exception I can think of is if you found a way to create a statement that is much larger in its parsed form than in its text form. Then, if you were to prepare and execute this statement via the unnamed prepared statement and then leave your session open but idle, the server would keep this structure in memory pointlessly.

Review status: all files reviewed at latest revision, 19 unresolved discussions, all commit checks successful.

pkg/sql/executor.go, line 404 at r12 (raw file):

Previously, andreimatei (Andrei Matei) wrote…

I spent a while scratching my head about whether this is supposed to be a single statement or it can also be multiple ones until I saw the switch len(stmts) below.
Mind documenting that it needs to be a single statement?
Better yet, I think it'd be better if this took parser.Statement and we'd move error checking to the callers.

Added a comment.

pkg/sql/executor.go, line 482 at r12 (raw file):

Previously, andreimatei (Andrei Matei) wrote…

nit: comment that this function needs to be directly deferred. Like, it can't be called from an anonymous deferred function.

Done.

pkg/sql/executor.go, line 508 at r12 (raw file):

Previously, andreimatei (Andrei Matei) wrote…

are you sure this resetForBatch() here is correct? I don't know exactly what the consequences of it are, but the name certainly indicates that it should be done per batch of statements.
It's also called from Executor.Prepare, which I'm guessing is also wrong now that we support PREPARE as a sql statement, and not just as a wire command that always runs outside of batches.

Yeah this is incorrect. I think it's harmless during PREPARE though.

pkg/sql/executor.go, line 509 at r12 (raw file):

Previously, andreimatei (Andrei Matei) wrote…

same for this phaseTimes - is the "startBatch" correct?

Done.

pkg/sql/prepare.go, line 31 at r12 (raw file):

Previously, andreimatei (Andrei Matei) wrote…

Mind explaining (or just telling me) what an "empty" prepared statement represents? What does it return?

It returns no data. This is valid behavior and happens in the wild, and we in fact have tests for it in pgwire_test.go.

pkg/sql/prepare.go, line 73 at r12 (raw file):

Previously, andreimatei (Andrei Matei) wrote…

I don't think "logging context" is a phrase we should be using. Contexts help with more than just logging.

Feel free to ignore this, but I don't think this comment is necessary. I suspect there's no real need for the PreparedStatements type to hold a reference to the session; it could probably be passed in everywhere. And then you probably wouldn't have felt the need to write it, as it's more common knowledge that sessions have a context (or at least something more magic that provides a context) for all their operations. Or if we keep the session ptr, I think a comment on the PreparedStatements type saying that an instance is a part of a Session would be better,

This was just copy pasta from PreparedStatements.New.

pkg/sql/prepare.go, line 74 at r12 (raw file):

Previously, andreimatei (Andrei Matei) wrote…

Is this new method (which also comes with a duplicate comment) really necessary? Can't the caller do the parsing and call the other one?

I think this is cleaner.

pkg/sql/prepare.go, line 95 at r12 (raw file):

Previously, andreimatei (Andrei Matei) wrote…

the queryLen argument seems very odd to me. We moved to a nice structured type for the statement (really, StatementList, but as I was saying above I think that's a mistake), but we pass this length separately. And also the query string length is pretty silly IMHO for the memory mgmt purposes for which it's used. I suggest adding a Size() method to a parser.Statement which takes into account more than just the query length, and getting rid of this argument.

I agree that it's odd. I agree that Statement should have a Size - in the mean time @knz suggested using query string length as a proxy.

pkg/sql/parser/placeholders.go, line 168 at r9 (raw file):

Previously, andreimatei (Andrei Matei) wrote…

what if a value is not found? Should that be an error?
This probably deserves to be documented on the placeholderVisitors type definition.

No - if we're in the prepare phase, we expect to see placeholders without values.

pkg/sql/parser/placeholders.go, line 169 at r9 (raw file):

Previously, andreimatei (Andrei Matei) wrote…

do you need to recurse (true) here?

Placeholders can be expressions, but they can't contain other placeholders. You're right.

pkg/sql/parser/placeholders.go, line 178 at r9 (raw file):

Previously, andreimatei (Andrei Matei) wrote…

The comment says something about a planner but I don't see one here.

Yep, changed to say SemaContext.

pkg/sql/parser/placeholders.go, line 180 at r9 (raw file):

Previously, andreimatei (Andrei Matei) wrote…

if we really need to accept nil ctx, please document what that means.

People pass nil SemaContexts to TypeCheck in tests a lot.

pkg/sql/parser/placeholders.go, line 183 at r9 (raw file):

Previously, andreimatei (Andrei Matei) wrote…

why do you need two lines to initialize a placeholderVisitor?

I was copying how it's done in select_name_resolution.go without thinking about it:

	v := &p.nameResolutionVisitor
	*v = nameResolutionVisitor{
		err:                nil,
		sources:            sources,
		colOffsets:         make([]int, len(sources)),
		iVarHelper:         ivarHelper,
		searchPath:         p.session.SearchPath,
		foundDependentVars: false,
	}

Is there a reason it's done that way there?

pkg/sql/parser/type_check.go, line 736 at r9 (raw file):

Previously, andreimatei (Andrei Matei) wrote…

this comment is unclear to me. What only happens during prepare? Is this function only called during prepare?

Perhaps say that there shouldn't be any Placeholders after prepare. Could we also assert that through the SemaContext somehow?

Done.

pkg/sql/pgwire/v3.go, line 826 at r12 (raw file):

Previously, andreimatei (Andrei Matei) wrote…

the name says "parsed" but this seems to be about "prepared" statements

I inlined this function as it's just called in one place.

pkg/sql/pgwire/v3.go, line 834 at r12 (raw file):

Previously, andreimatei (Andrei Matei) wrote…

sql.Executor is a singleton; it doesn't "get" a context.

I'd remove this comment. Knowledge about how a session's ctx is used should live elsewhere I think.

Done.

Comments from Reviewable

andreimatei · 2017-05-09T22:36:38Z

I have trouble understanding from the code where the named vs unnamed distinction comes into play. Does unnamed mean name == ""? If so, maybe you can find a place to document what this means and how these unnamed statements are used by the protocol. I know this is not new code, but you seem to understand it :)

If that's true and a session has one unnamed prepared statement, consider storing the unnamed one separately from the named ones to make the code path that uses it more distinct from the paths that use cached ones, and prevent questions from the likes of me. The caching behaviour could also change I guess (you don't need to pre-parse the unnamed one if it's only executed once), but that I don't care about. I don't know if this makes sense.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

sql: prepared statements cache parsed query #15639

sql: prepared statements cache parsed query #15639

jordanlewis commented May 3, 2017

cockroach-teamcity commented May 3, 2017

petermattis commented May 3, 2017

tamird commented May 3, 2017

knz commented May 3, 2017

knz commented May 3, 2017

jordanlewis commented May 3, 2017

tamird commented May 3, 2017

jordanlewis commented May 3, 2017

knz commented May 3, 2017

jordanlewis commented May 5, 2017

jordanlewis commented May 5, 2017

andreimatei commented May 5, 2017

jordanlewis commented May 6, 2017

petermattis commented May 6, 2017

RaduBerinde commented May 6, 2017

jordanlewis commented May 6, 2017

RaduBerinde commented May 6, 2017

jordanlewis commented May 7, 2017

knz commented May 7, 2017

jordanlewis commented May 9, 2017

knz commented May 9, 2017

andreimatei commented May 9, 2017

jordanlewis commented May 9, 2017

andreimatei commented May 9, 2017

jordanlewis commented May 10, 2017

sql: prepared statements cache parsed query #15639

sql: prepared statements cache parsed query #15639

Conversation

jordanlewis commented May 3, 2017

cockroach-teamcity commented May 3, 2017

petermattis commented May 3, 2017

tamird commented May 3, 2017

knz commented May 3, 2017

knz commented May 3, 2017

jordanlewis commented May 3, 2017

tamird commented May 3, 2017

jordanlewis commented May 3, 2017

knz commented May 3, 2017

jordanlewis commented May 5, 2017

jordanlewis commented May 5, 2017

andreimatei commented May 5, 2017

jordanlewis commented May 6, 2017

petermattis commented May 6, 2017

RaduBerinde commented May 6, 2017

jordanlewis commented May 6, 2017

RaduBerinde commented May 6, 2017

jordanlewis commented May 7, 2017

knz commented May 7, 2017

jordanlewis commented May 9, 2017

knz commented May 9, 2017

andreimatei commented May 9, 2017

jordanlewis commented May 9, 2017

andreimatei commented May 9, 2017

jordanlewis commented May 10, 2017