Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

sql: panic: indexed var linked to different container #16388

Closed
tbg opened this issue Jun 7, 2017 · 25 comments
Closed

sql: panic: indexed var linked to different container #16388

tbg opened this issue Jun 7, 2017 · 25 comments
Assignees
Labels
C-bug Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior. S-1-stability Severe stability issues that can be fixed by upgrading, but usually don’t resolve by restarting
Milestone

Comments

@tbg
Copy link
Member

tbg commented Jun 7, 2017

report by @HeikoOnnebrink on gitter.

CockroachDB node starting at 2017-06-06 12:35:42.02954998 +0000 UTC
build:      CCL 2cc11e7 @ 2017/05/31 03:11:20 (go1.8.1)
admin:      http://api-4.openstacklocal:8080
sql:        postgresql://root@10.97.176.28:26257?sslmode=disable
logs:       /cockroach/cockroach-data/logs
store[0]:   path=/cockroach/cockroach-data
status:     restarted pre-existing node
clusterID:  68f17c2d-dcca-49f3-b0dd-4ec026196a12
nodeID:     4

today I was prototyping some SQL as part of some REST API.
The idea is to have some table that stores requests that are inserted from different sources where RequestKey is the Primary key and is of type SERIAL to ensure global sorting.
CREATE TABLE IF NOT EXISTS DbRequests (
RequestKey serial PRIMARY KEY,
RequestType string NOT NULL,
RequestLock bool NOT NULL default false,
ID string NOT NULL,
etc..
There is a set of consumers (worker processes) that grab requests and process them in some sorted fashion by quering the dbRequestTable
(hope in the future when NOTIFY made its way to Roach I can use smarter way than to query the table in some interval..
In order to feed the workers I use some SQL that, on each execution, returns the oldest request that is not yet processed.
The worker will try to lock the request (Update DbRequests set RequestLock=TRUE where RequestKey = and RequestLock=false)
As there can be several requests for each resource (I use here some ID column that saves the UUID of each resource) the query will supress RequestKeys where at least one record per ID has RequestLock = true
In this way workers process requests fully parallel for different IDs but sequentially for requests belonging to same ID.
The related SQL that does the job is:
SELECT min(requestkey) AS requestkey FROM dbrequests WHERE dcname = $1 GROUP BY id HAVING count (case when requestlock = true then 1 end) = 0 ORDER BY 1 LIMIT 1;
My environment is a 9 node cluster spanning 3 DCs with 3 nodes per DC and all replication settings left default.
I run some tests by in injesting some hundred requests at DC 1 and let workers poll the DB in 2 second interval from DC A, B & C
Basically all works fine as designed .. but after some time remote nodes in DC B and C fail with PANIC in log.
I uploaded log cockroachnodeFailSql.log to https://bitbucket.org/honnebrink/public-file-share/src

HeikoOnnebrink @HeikoOnnebrink 07:14
I repeated the tests and could each time reproduce the problem.. one test run got 2 FATALs in DC B and 1 FATAL in DC C.. second test run failed 3 times in DC B only.. Test was executed with 4444 records..

E170607 06:02:39.961680 3008378 util/log/crash_reporting.go:81  [n4] a panic has occurred!
E170607 06:02:40.632301 3008378 util/log/crash_reporting.go:166  [n4] Reported as error 95496809bee840858d84d16595d14950
panic: indexed var linked to different container (*sql.renderNode) &{planner:0xc421a4ac80 source:{info:0xc42416bc80 plan:0xc42305c000} sourceInfo:[0xc42416bc80] ivarHelper:{vars:[{Idx:0 container:0xc4236787e0} {Idx:0 container:<nil>} {Idx:2 container:0xc4236787e0} {Idx:3 container:0xc4236787e0} {Idx:0 container:<nil>} {Idx:0 container:<nil>} {Idx:0 container:<nil>} {Idx:0 container:<nil>} {Idx:0 container:<nil>} {Idx:0 container:<nil>} {Idx:0 container:<nil>} {Idx:0 container:<nil>} {Idx:0 container:<nil>} {Idx:0 container:<nil>} {Idx:0 container:<nil>} {Idx:0 container:<nil>} {Idx:0 container:<nil>} {Idx:0 container:<nil>} {Idx:0 container:<nil>} {Idx:0 container:<nil>} {Idx:0 container:<nil>} {Idx:0 container:<nil>} {Idx:0 container:<nil>} {Idx:0 container:<nil>} {Idx:0 container:<nil>} {Idx:0 container:<nil>}] container:0xc4236787e0} render:[0xc422377b80 0xc42259f4f0 0xc422377bc8] columns:[{Name:requestkey Typ:{} Hidden:false Omitted:false} {Name:CASE WHEN requestlock = true THEN 1 END Typ:{} Hidden:false Omitted:false} {Name:id Typ:{} Hidden:false Omitted:false}] isStar:false numOriginalCols:1 ordering:{exactMatchCols:map[] ordering:[] unique:false} explain:0 curSourceRow:[] row:[] noCopy:{}}, expected (*sql.renderNode) &{planner:0xc4206e5300 source:{info:0xc4270cd140 plan:0xc421a1da70} sourceInfo:[0xc4270cd140] ivarHelper:{vars:[{Idx:0 container:0xc423c8a900} {Idx:0 container:<nil>} {Idx:0 container:<nil>} {Idx:3 container:0xc423c8a900} {Idx:0 container:<nil>} {Idx:0 container:<nil>} {Idx:0 container:<nil>} {Idx:0 container:<nil>} {Idx:0 container:<nil>} {Idx:0 container:<nil>} {Idx:0 container:<nil>} {Idx:0 container:<nil>} {Idx:0 container:<nil>} {Idx:0 container:<nil>} {Idx:0 container:<nil>} {Idx:0 container:<nil>} {Idx:0 container:<nil>} {Idx:0 container:<nil>} {Idx:0 container:<nil>} {Idx:0 container:<nil>} {Idx:0 container:<nil>} {Idx:0 container:<nil>} {Idx:0 container:<nil>} {Idx:0 container:<nil>} {Idx:0 container:<nil>} {Idx:0 container:<nil>}] container:0xc423c8a900} render:[0xc421d28480] columns:[{Name:requestkey Typ:{} Hidden:false Omitted:false}] isStar:false numOriginalCols:1 ordering:{exactMatchCols:map[] ordering:[] unique:false} explain:0 curSourceRow:[] row:[] noCopy:{}} [recovered]
	panic: SELECT min(requestkey) AS requestkey FROM dbrequests WHERE dcname = $1 GROUP BY id HAVING count (case when requestlock = true then 1 end) = 0 ORDER BY 1 LIMIT 1: indexed var linked to different container (*sql.renderNode) &{planner:0xc421a4ac80 source:{info:0xc42416bc80 plan:0xc42305c000} sourceInfo:[0xc42416bc80] ivarHelper:{vars:[{Idx:0 container:0xc4236787e0} {Idx:0 container:<nil>} {Idx:2 container:0xc4236787e0} {Idx:3 container:0xc4236787e0} {Idx:0 container:<nil>} {Idx:0 container:<nil>} {Idx:0 container:<nil>} {Idx:0 container:<nil>} {Idx:0 container:<nil>} {Idx:0 container:<nil>} {Idx:0 container:<nil>} {Idx:0 container:<nil>} {Idx:0 container:<nil>} {Idx:0 container:<nil>} {Idx:0 container:<nil>} {Idx:0 container:<nil>} {Idx:0 container:<nil>} {Idx:0 container:<nil>} {Idx:0 container:<nil>} {Idx:0 container:<nil>} {Idx:0 container:<nil>} {Idx:0 container:<nil>} {Idx:0 container:<nil>} {Idx:0 container:<nil>} {Idx:0 container:<nil>} {Idx:0 container:<nil>}] container:0xc4236787e0} render:[0xc422377b80 0xc42259f4f0 0xc422377bc8] columns:[{Name:requestkey Typ:{} Hidden:false Omitted:false} {Name:CASE WHEN requestlock = true THEN 1 END Typ:{} Hidden:false Omitted:false} {Name:id Typ:{} Hidden:false Omitted:false}] isStar:false numOriginalCols:1 ordering:{exactMatchCols:map[] ordering:[] unique:false} explain:0 curSourceRow:[] row:[] noCopy:{}}, expected (*sql.renderNode) &{planner:0xc4206e5300 source:{info:0xc4270cd140 plan:0xc421a1da70} sourceInfo:[0xc4270cd140] ivarHelper:{vars:[{Idx:0 container:0xc423c8a900} {Idx:0 container:<nil>} {Idx:0 container:<nil>} {Idx:3 container:0xc423c8a900} {Idx:0 container:<nil>} {Idx:0 container:<nil>} {Idx:0 container:<nil>} {Idx:0 container:<nil>} {Idx:0 container:<nil>} {Idx:0 container:<nil>} {Idx:0 container:<nil>} {Idx:0 container:<nil>} {Idx:0 container:<nil>} {Idx:0 container:<nil>} {Idx:0 container:<nil>} {Idx:0 container:<nil>} {Idx:0 container:<nil>} {Idx:0 container:<nil>} {Idx:0 container:<nil>} {Idx:0 container:<nil>} {Idx:0 container:<nil>} {Idx:0 container:<nil>} {Idx:0 container:<nil>} {Idx:0 container:<nil>} {Idx:0 container:<nil>} {Idx:0 container:<nil>}] container:0xc423c8a900} render:[0xc421d28480] columns:[{Name:requestkey Typ:{} Hidden:false Omitted:false}] isStar:false numOriginalCols:1 ordering:{exactMatchCols:map[] ordering:[] unique:false} explain:0 curSourceRow:[] row:[] noCopy:{}} [recovered]
	panic: SELECT min(requestkey) AS requestkey FROM dbrequests WHERE dcname = $1 GROUP BY id HAVING count (case when requestlock = true then 1 end) = 0 ORDER BY 1 LIMIT 1: indexed var linked to different container (*sql.renderNode) &{planner:0xc421a4ac80 source:{info:0xc42416bc80 plan:0xc42305c000} sourceInfo:[0xc42416bc80] ivarHelper:{vars:[{Idx:0 container:0xc4236787e0} {Idx:0 container:<nil>} {Idx:2 container:0xc4236787e0} {Idx:3 container:0xc4236787e0} {Idx:0 container:<nil>} {Idx:0 container:<nil>} {Idx:0 container:<nil>} {Idx:0 container:<nil>} {Idx:0 container:<nil>} {Idx:0 container:<nil>} {Idx:0 container:<nil>} {Idx:0 container:<nil>} {Idx:0 container:<nil>} {Idx:0 container:<nil>} {Idx:0 container:<nil>} {Idx:0 container:<nil>} {Idx:0 container:<nil>} {Idx:0 container:<nil>} {Idx:0 container:<nil>} {Idx:0 container:<nil>} {Idx:0 container:<nil>} {Idx:0 container:<nil>} {Idx:0 container:<nil>} {Idx:0 container:<nil>} {Idx:0 container:<nil>} {Idx:0 container:<nil>}] container:0xc4236787e0} render:[0xc422377b80 0xc42259f4f0 0xc422377bc8] columns:[{Name:requestkey Typ:{} Hidden:false Omitted:false} {Name:CASE WHEN requestlock = true THEN 1 END Typ:{} Hidden:false Omitted:false} {Name:id Typ:{} Hidden:false Omitted:false}] isStar:false numOriginalCols:1 ordering:{exactMatchCols:map[] ordering:[] unique:false} explain:0 curSourceRow:[] row:[] noCopy:{}}, expected (*sql.renderNode) &{planner:0xc4206e5300 source:{info:0xc4270cd140 plan:0xc421a1da70} sourceInfo:[0xc4270cd140] ivarHelper:{vars:[{Idx:0 container:0xc423c8a900} {Idx:0 container:<nil>} {Idx:0 container:<nil>} {Idx:3 container:0xc423c8a900} {Idx:0 container:<nil>} {Idx:0 container:<nil>} {Idx:0 container:<nil>} {Idx:0 container:<nil>} {Idx:0 container:<nil>} {Idx:0 container:<nil>} {Idx:0 container:<nil>} {Idx:0 container:<nil>} {Idx:0 container:<nil>} {Idx:0 container:<nil>} {Idx:0 container:<nil>} {Idx:0 container:<nil>} {Idx:0 container:<nil>} {Idx:0 container:<nil>} {Idx:0 container:<nil>} {Idx:0 container:<nil>} {Idx:0 container:<nil>} {Idx:0 container:<nil>} {Idx:0 container:<nil>} {Idx:0 container:<nil>} {Idx:0 container:<nil>} {Idx:0 container:<nil>}] container:0xc423c8a900} render:[0xc421d28480] columns:[{Name:requestkey Typ:{} Hidden:false Omitted:false}] isStar:false numOriginalCols:1 ordering:{exactMatchCols:map[] ordering:[] unique:false} explain:0 curSourceRow:[] row:[] noCopy:{}} [recovered]
	panic: SELECT min(requestkey) AS requestkey FROM dbrequests WHERE dcname = $1 GROUP BY id HAVING count (case when requestlock = true then 1 end) = 0 ORDER BY 1 LIMIT 1: indexed var linked to different container (*sql.renderNode) &{planner:0xc421a4ac80 source:{info:0xc42416bc80 plan:0xc42305c000} sourceInfo:[0xc42416bc80] ivarHelper:{vars:[{Idx:0 container:0xc4236787e0} {Idx:0 container:<nil>} {Idx:2 container:0xc4236787e0} {Idx:3 container:0xc4236787e0} {Idx:0 container:<nil>} {Idx:0 container:<nil>} {Idx:0 container:<nil>} {Idx:0 container:<nil>} {Idx:0 container:<nil>} {Idx:0 container:<nil>} {Idx:0 container:<nil>} {Idx:0 container:<nil>} {Idx:0 container:<nil>} {Idx:0 container:<nil>} {Idx:0 container:<nil>} {Idx:0 container:<nil>} {Idx:0 container:<nil>} {Idx:0 container:<nil>} {Idx:0 container:<nil>} {Idx:0 container:<nil>} {Idx:0 container:<nil>} {Idx:0 container:<nil>} {Idx:0 container:<nil>} {Idx:0 container:<nil>} {Idx:0 container:<nil>} {Idx:0 container:<nil>}] container:0xc4236787e0} render:[0xc422377b80 0xc42259f4f0 0xc422377bc8] columns:[{Name:requestkey Typ:{} Hidden:false Omitted:false} {Name:CASE WHEN requestlock = true THEN 1 END Typ:{} Hidden:false Omitted:false} {Name:id Typ:{} Hidden:false Omitted:false}] isStar:false numOriginalCols:1 ordering:{exactMatchCols:map[] ordering:[] unique:false} explain:0 curSourceRow:[] row:[] noCopy:{}}, expected (*sql.renderNode) &{planner:0xc4206e5300 source:{info:0xc4270cd140 plan:0xc421a1da70} sourceInfo:[0xc4270cd140] ivarHelper:{vars:[{Idx:0 container:0xc423c8a900} {Idx:0 container:<nil>} {Idx:0 container:<nil>} {Idx:3 container:0xc423c8a900} {Idx:0 container:<nil>} {Idx:0 container:<nil>} {Idx:0 container:<nil>} {Idx:0 container:<nil>} {Idx:0 container:<nil>} {Idx:0 container:<nil>} {Idx:0 container:<nil>} {Idx:0 container:<nil>} {Idx:0 container:<nil>} {Idx:0 container:<nil>} {Idx:0 container:<nil>} {Idx:0 container:<nil>} {Idx:0 container:<nil>} {Idx:0 container:<nil>} {Idx:0 container:<nil>} {Idx:0 container:<nil>} {Idx:0 container:<nil>} {Idx:0 container:<nil>} {Idx:0 container:<nil>} {Idx:0 container:<nil>} {Idx:0 container:<nil>} {Idx:0 container:<nil>}] container:0xc423c8a900} render:[0xc421d28480] columns:[{Name:requestkey Typ:{} Hidden:false Omitted:false}] isStar:false numOriginalCols:1 ordering:{exactMatchCols:map[] ordering:[] unique:false} explain:0 curSourceRow:[] row:[] noCopy:{}}

goroutine 3008378 [running]:
github.com/cockroachdb/cockroach/pkg/util/stop.(*Stopper).Recover(0xc420637680, 0x7fd0fac4ae68, 0xc42045ac00)
	/go/src/github.com/cockroachdb/cockroach/pkg/util/stop/stopper.go:200 +0xb1
panic(0x18fea00, 0xc423695310)
	/usr/local/go/src/runtime/panic.go:489 +0x2cf
github.com/cockroachdb/cockroach/pkg/sql/pgwire.(*v3Conn).serve.func1(0xc420d86000, 0x7fd0fac4ae68, 0xc422822f30)
	/go/src/github.com/cockroachdb/cockroach/pkg/sql/pgwire/v3.go:353 +0x74
panic(0x18fea00, 0xc423695310)
	/usr/local/go/src/runtime/panic.go:489 +0x2cf
github.com/cockroachdb/cockroach/pkg/sql.(*Executor).ExecuteStatements.func1(0xc421b9a29e, 0xa0)
	/go/src/github.com/cockroachdb/cockroach/pkg/sql/executor.go:499 +0x131
panic(0x18a86c0, 0xc4236952f0)
	/usr/local/go/src/runtime/panic.go:489 +0x2cf
github.com/cockroachdb/cockroach/pkg/sql/parser.(*IndexedVarHelper).AssertSameContainer(0xc4206e6ab8, 0xc422377430)
	/go/src/github.com/cockroachdb/cockroach/pkg/sql/parser/indexed_vars.go:139 +0x18c
github.com/cockroachdb/cockroach/pkg/sql.(*nameResolutionVisitor).VisitPre(0xc4206e6a78, 0x2786120, 0xc422377430, 0x2769a8c, 0x16, 0x2769a8c)
	/go/src/github.com/cockroachdb/cockroach/pkg/sql/select_name_resolution.go:74 +0x3c0
github.com/cockroachdb/cockroach/pkg/sql/parser.WalkExpr(0x27764e0, 0xc4206e6a78, 0x2786120, 0xc422377430, 0x2785b60, 0x2769a8c, 0xc4276f5200)
	/go/src/github.com/cockroachdb/cockroach/pkg/sql/parser/walk.go:552 +0x4c
github.com/cockroachdb/cockroach/pkg/sql/parser.(*ComparisonExpr).Walk(0xc423653580, 0x27764e0, 0xc4206e6a78, 0xc42305a801, 0x2785ae0)
	/go/src/github.com/cockroachdb/cockroach/pkg/sql/parser/walk.go:177 +0x56
github.com/cockroachdb/cockroach/pkg/sql/parser.WalkExpr(0x27764e0, 0xc4206e6a78, 0x2785ae0, 0xc423653580, 0x2785ce0, 0xc42305a8e0, 0x2786100)
	/go/src/github.com/cockroachdb/cockroach/pkg/sql/parser/walk.go:555 +0x15a
github.com/cockroachdb/cockroach/pkg/sql/parser.(*CaseExpr).Walk(0xc42259f400, 0x27764e0, 0xc4206e6a78, 0xc42259f401, 0x2785960)
	/go/src/github.com/cockroachdb/cockroach/pkg/sql/parser/walk.go:109 +0xab
github.com/cockroachdb/cockroach/pkg/sql/parser.WalkExpr(0x27764e0, 0xc4206e6a78, 0x2785960, 0xc42259f400, 0x0, 0x0, 0x2785a00)
	/go/src/github.com/cockroachdb/cockroach/pkg/sql/parser/walk.go:555 +0x15a
github.com/cockroachdb/cockroach/pkg/sql/parser.(*FuncExpr).Walk(0xc426da4840, 0x27764e0, 0xc4206e6a78, 0xc42305a901, 0x27860a0)
	/go/src/github.com/cockroachdb/cockroach/pkg/sql/parser/walk.go:224 +0xa3
github.com/cockroachdb/cockroach/pkg/sql/parser.WalkExpr(0x27764e0, 0xc4206e6a78, 0x27860a0, 0xc426da4840, 0x2785ce0, 0x8, 0x1a87500)
	/go/src/github.com/cockroachdb/cockroach/pkg/sql/parser/walk.go:555 +0x15a
github.com/cockroachdb/cockroach/pkg/sql/parser.(*ComparisonExpr).Walk(0xc423653600, 0x27764e0, 0xc4206e6a78, 0x1, 0x2785ae0)
	/go/src/github.com/cockroachdb/cockroach/pkg/sql/parser/walk.go:177 +0x56
github.com/cockroachdb/cockroach/pkg/sql/parser.WalkExpr(0x27764e0, 0xc4206e6a78, 0x2785ae0, 0xc423653600, 0x1, 0x1, 0x2786000)
	/go/src/github.com/cockroachdb/cockroach/pkg/sql/parser/walk.go:555 +0x15a
github.com/cockroachdb/cockroach/pkg/sql.(*planner).resolveNames(0xc4206e5300, 0x2785ae0, 0xc423653600, 0xc4203a4380, 0x1, 0x1, 0xc4258c5400, 0x1a, 0x1a, 0x277f1e0, ...)
	/go/src/github.com/cockroachdb/cockroach/pkg/sql/select_name_resolution.go:196 +0x205
github.com/cockroachdb/cockroach/pkg/sql.(*planner).analyzeExpr(0xc4206e5300, 0x7fd0fac4ae68, 0xc42416d2c0, 0x2785ae0, 0xc423653600, 0xc4203a4380, 0x1, 0x1, 0xc4258c5400, 0x1a, ...)
	/go/src/github.com/cockroachdb/cockroach/pkg/sql/analyze.go:1653 +0x28d
github.com/cockroachdb/cockroach/pkg/sql.(*planner).groupBy(0xc4206e5300, 0x7fd0fac4ae68, 0xc42416d2c0, 0xc423653180, 0xc423c8a900, 0x0, 0x0, 0x0)
	/go/src/github.com/cockroachdb/cockroach/pkg/sql/group.go:107 +0x1c72
github.com/cockroachdb/cockroach/pkg/sql.(*planner).SelectClause(0xc4206e5300, 0x7fd0fac4ae68, 0xc42416d2c0, 0xc423653180, 0xc42000f020, 0x1, 0x1, 0xc423058aa0, 0x0, 0x0, ...)
	/go/src/github.com/cockroachdb/cockroach/pkg/sql/render.go:281 +0x424
github.com/cockroachdb/cockroach/pkg/sql.(*planner).Select(0xc4206e5300, 0x7fd0fac4ae68, 0xc42416d2c0, 0xc42416d290, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, ...)
@tbg tbg added the C-bug Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior. label Jun 7, 2017
@knz knz added the S-1-stability Severe stability issues that can be fixed by upgrading, but usually don’t resolve by restarting label Jun 7, 2017
@knz knz added this to the 1.1 milestone Jun 7, 2017
@knz
Copy link
Contributor

knz commented Jun 8, 2017

Dumping my thoughts after I looked at this with jordan:

  • the stack trace is very clear that the crash happens during name resolution of the HAVING expression, while building the query plan; this name resolution happens immediately after parsing, that is before anything like expression normalization/simplication/type checking or any further processing of the HAVING expression by groupNode
  • at that point thus, the only nodes that the name resolution recursion can encounter are those populated by the parser
  • furthermore, the stack trace indicates the error occurs during name resolution of the left branch of the comparion inside CASE inside a function call, that is the expression formed by the name "requestlock"
  • the parser cannot instantiate an IndexedVar for the query at hand when it encounters the token requestlock - for this token an UnresolvedName would be instantiated instead

So what we have at hand is an impossible stack trace!

Which in turn indicates something went very wrong in the Go runtime.

@knz
Copy link
Contributor

knz commented Jun 8, 2017

Also another thing that is very suspicious is that the error message tells us (AssertSameContainer) the IndexedVar was migrated from one renderNode to another (the error would only occur if the addresses of the two renderNodes are different in memory, thus different objects). Although we have logic that migrates expressions from one node to another,

  1. we never do that for HAVING
  2. there are no two separate renderNodes generated for that query in the first place -- it is one thing to have vars incorrectly initialized due to a bug, but the incorrect state necessarily has to come from somewhere nearby, this is here simply impossible

This suggests to me the rogue IndexedVar has "happened" into this AST by means of reusing a stale object from somewhere else in the program state.

@RaduBerinde
Copy link
Member

There are two renderNodes, the plan should look like this:

limitNode
   |
sortNode
   |
renderNode   (this projects away the `count(..)`)
   |
filterNode  (this filters out things based on `count(..)`)
   |
groupNode  (this emits the `min(..)` and the `count(..)`)
   |
renderNode
   |
scanNode

@RaduBerinde
Copy link
Member

I created the table and ran the query (without the WHERE) locally and I don't see the crash.

+-------+--------+-------------+----------------------------------------------------------------+-----------------------------------------------------------------------+-------------+
| Level |  Type  |    Field    |                          Description                           |                                Columns                                |  Ordering   |
+-------+--------+-------------+----------------------------------------------------------------+-----------------------------------------------------------------------+-------------+
|     0 | limit  |             |                                                                | (requestkey)                                                          | +requestkey |
|     0 |        | count       |                                                              1 |                                                                       |             |
|     1 | sort   |             |                                                                | (requestkey)                                                          | +requestkey |
|     1 |        | order       | +requestkey                                                    |                                                                       |             |
|     1 |        | strategy    | top 1                                                          |                                                                       |             |
|     2 | render |             |                                                                | (requestkey)                                                          |             |
|     2 |        | render 0    | min(requestkey)                                                |                                                                       |             |
|     3 | filter |             |                                                                | ("count(CASE WHEN requestlock = true THEN 1 END)", "min(requestkey)") |             |
|     3 |        | filter      | count(CASE WHEN requestlock = true THEN 1 END) = 0             |                                                                       |             |
|     4 | group  |             |                                                                | ("count(CASE WHEN requestlock = true THEN 1 END)", "min(requestkey)") |             |
|     4 |        | aggregate 0 | count(CASE WHEN test.dbrequests.requestlock = true THEN 1 END) |                                                                       |             |
|     4 |        | aggregate 1 | min(test.dbrequests.requestkey)                                |                                                                       |             |
|     5 | render |             |                                                                | (id, "CASE WHEN requestlock = true THEN 1 END", requestkey)           |             |
|     5 |        | render 0    | test.dbrequests.id                                             |                                                                       |             |
|     5 |        | render 1    | CASE WHEN test.dbrequests.requestlock = true THEN 1 END        |                                                                       |             |
|     5 |        | render 2    | test.dbrequests.requestkey                                     |                                                                       |             |
|     6 | scan   |             |                                                                | (requestkey, requesttype[omitted], requestlock, id)                   |             |
|     6 |        | table       | dbrequests@primary                                             |                                                                       |             |
|     6 |        | spans       | ALL                                                            |                                                                       |             |
+-------+--------+-------------+----------------------------------------------------------------+-----------------------------------------------------------------------+-------------+

@RaduBerinde
Copy link
Member

Ah, my bad, this is an 1.0.x version, this doesn't have the recent rewrite.

@RaduBerinde
Copy link
Member

RaduBerinde commented Jun 8, 2017

Perhaps this has to do with the fact that it's a prepared statement (notice it uses a placeholder) - maybe one of the renderNodes is the one that was created during prepare?
(note that the two renderNodes have different planners!)

@jordanlewis
Copy link
Member

Indeed!

root@:26257/test> CREATE TABLE t (a int);
CREATE TABLE
root@:26257/test> PREPARE x AS SELECT min(1) FROM t WHERE a = $1 GROUP BY a HAVING count(a) = 0;
OK
root@:26257/test> EXECUTE x (1);
driver: bad connection
connection lost; opening new connection and resetting session parameters...
root@:26257 ?>

Logs:

panic: indexed var linked to different container (*sql.renderNode) &{planner:0xc420761980 source:{info:0xc420349cc0 plan:0xc4206d77a0} sourceInfo:[0xc420349cc0] ivarHelper:{vars:[{Idx:0 container:0xc420275320} {Idx:0 container:<nil>}] container:0xc420275320} render:[0xc420b28240 0xc4204146b0] columns:[{Name:a Typ:{} Hidden:false Omitted:false} {Name:1 Typ:{} Hidden:false Omitted:false}] isStar:false numOriginalCols:1 ordering:{exactMatchCols:map[] ordering:[] unique:false} explain:0 curSourceRow:[] row:[] noCopy:{}}, expected (*sql.renderNode) &{planner:0xc420f5e1c8 source:{info:0xc4208b6840 plan:0xc420736630} sourceInfo:[0xc4208b6840] ivarHelper:{vars:[{Idx:0 container:0xc420ec4240} {Idx:0 container:<nil>}] container:0xc420ec4240} render:[0xc4206bf800] columns:[{Name:min(1) Typ:{} Hidden:false Omitted:false}] isStar:false numOriginalCols:1 ordering:{exactMatchCols:map[] ordering:[] unique:false} explain:0 curSourceRow:[] row:[] noCopy:{}} [recovered]
        panic: EXECUTE x (1);: indexed var linked to different container (*sql.renderNode) &{planner:0xc420761980 source:{info:0xc420349cc0 plan:0xc4206d77a0} sourceInfo:[0xc420349cc0] ivarHelper:{vars:[{Idx:0 container:0xc420275320} {Idx:0 container:<nil>}] container:0xc420275320} render:[0xc420b28240 0xc4204146b0] columns:[{Name:a Typ:{} Hidden:false Omitted:false} {Name:1 Typ:{} Hidden:false Omitted:false}] isStar:false numOriginalCols:1 ordering:{exactMatchCols:map[] ordering:[] unique:false} explain:0 curSourceRow:[] row:[] noCopy:{}}, expected (*sql.renderNode) &{planner:0xc420f5e1c8 source:{info:0xc4208b6840 plan:0xc420736630} sourceInfo:[0xc4208b6840] ivarHelper:{vars:[{Idx:0 container:0xc420ec4240} {Idx:0 container:<nil>}] container:0xc420ec4240} render:[0xc4206bf800] columns:[{Name:min(1) Typ:{} Hidden:false Omitted:false}] isStar:false numOriginalCols:1 ordering:{exactMatchCols:map[] ordering:[] unique:false} explain:0 curSourceRow:[] row:[] noCopy:{}} [recovered]
        panic: EXECUTE x (1);: indexed var linked to different container (*sql.renderNode) &{planner:0xc420761980 source:{info:0xc420349cc0 plan:0xc4206d77a0} sourceInfo:[0xc420349cc0] ivarHelper:{vars:[{Idx:0 container:0xc420275320} {Idx:0 container:<nil>}] container:0xc420275320} render:[0xc420b28240 0xc4204146b0] columns:[{Name:a Typ:{} Hidden:false Omitted:false} {Name:1 Typ:{} Hidden:false Omitted:false}] isStar:false numOriginalCols:1 ordering:{exactMatchCols:map[] ordering:[] unique:false} explain:0 curSourceRow:[] row:[] noCopy:{}}, expected (*sql.renderNode) &{planner:0xc420f5e1c8 source:{info:0xc4208b6840 plan:0xc420736630} sourceInfo:[0xc4208b6840] ivarHelper:{vars:[{Idx:0 container:0xc420ec4240} {Idx:0 container:<nil>}] container:0xc420ec4240} render:[0xc4206bf800] columns:[{Name:min(1) Typ:{} Hidden:false Omitted:false}] isStar:false numOriginalCols:1 ordering:{exactMatchCols:map[] ordering:[] unique:false} explain:0 curSourceRow:[] row:[] noCopy:{}} [recovered]
        panic: EXECUTE x (1);: indexed var linked to different container (*sql.renderNode) &{planner:0xc420761980 source:{info:0xc420349cc0 plan:0xc4206d77a0} sourceInfo:[0xc420349cc0] ivarHelper:{vars:[{Idx:0 container:0xc420275320} {Idx:0 container:<nil>}] container:0xc420275320} render:[0xc420b28240 0xc4204146b0] columns:[{Name:a Typ:{} Hidden:false Omitted:false} {Name:1 Typ:{} Hidden:false Omitted:false}] isStar:false numOriginalCols:1 ordering:{exactMatchCols:map[] ordering:[] unique:false} explain:0 curSourceRow:[] row:[] noCopy:{}}, expected (*sql.renderNode) &{planner:0xc420f5e1c8 source:{info:0xc4208b6840 plan:0xc420736630} sourceInfo:[0xc4208b6840] ivarHelper:{vars:[{Idx:0 container:0xc420ec4240} {Idx:0 container:<nil>}] container:0xc420ec4240} render:[0xc4206bf800] columns:[{Name:min(1) Typ:{} Hidden:false Omitted:false}] isStar:false numOriginalCols:1 ordering:{exactMatchCols:map[] ordering:[] unique:false} explain:0 curSourceRow:[] row:[] noCopy:{}}

goroutine 188 [running]:
github.com/cockroachdb/cockroach/pkg/util/stop.(*Stopper).Recover(0xc4206a3c20, 0x7e861b0, 0xc4202f8030)
        /Users/jordan/repo/src/github.com/cockroachdb/cockroach/pkg/util/stop/stopper.go:200 +0xb1
panic(0x53ef0a0, 0xc420238f30)
        /usr/local/Cellar/go/1.8.1/libexec/src/runtime/panic.go:489 +0x2cf
github.com/cockroachdb/cockroach/pkg/sql/pgwire.(*v3Conn).serve.func1(0xc420d05600, 0x7e861b0, 0xc420adcd20)
        /Users/jordan/repo/src/github.com/cockroachdb/cockroach/pkg/sql/pgwire/v3.go:359 +0x74
panic(0x53ef0a0, 0xc420238f30)
        /usr/local/Cellar/go/1.8.1/libexec/src/runtime/panic.go:489 +0x2cf
github.com/cockroachdb/cockroach/pkg/sql.logIfPanicking(0x7e863d0, 0xc420f15e40, 0xc420f610fe, 0xe)
        /Users/jordan/repo/src/github.com/cockroachdb/cockroach/pkg/sql/executor.go:517 +0x1e8
panic(0x53979e0, 0xc420238e60)
        /usr/local/Cellar/go/1.8.1/libexec/src/runtime/panic.go:489 +0x2cf
github.com/cockroachdb/cockroach/pkg/sql/parser.(*IndexedVarHelper).AssertSameContainer(0xc420f5f998, 0xc420b28240)
        /Users/jordan/repo/src/github.com/cockroachdb/cockroach/pkg/sql/parser/indexed_vars.go:139 +0x18c
github.com/cockroachdb/cockroach/pkg/sql.(*nameResolutionVisitor).VisitPre(0xc420f5f958, 0x6080de0, 0xc420b28240, 0xc420b28240, 0xc420b163d8, 0x4010abd)
        /Users/jordan/repo/src/github.com/cockroachdb/cockroach/pkg/sql/select_name_resolution.go:78 +0x939
github.com/cockroachdb/cockroach/pkg/sql/parser.WalkExpr(0x60707a0, 0xc420f5f958, 0x6080de0, 0xc420b28240, 0x0, 0x0, 0x0)
        /Users/jordan/repo/src/github.com/cockroachdb/cockroach/pkg/sql/parser/walk.go:555 +0x4c
github.com/cockroachdb/cockroach/pkg/sql/parser.(*FuncExpr).Walk(0xc42016b200, 0x60707a0, 0xc420f5f958, 0xc420414701, 0x6080d60)
        /Users/jordan/repo/src/github.com/cockroachdb/cockroach/pkg/sql/parser/walk.go:224 +0xa3
github.com/cockroachdb/cockroach/pkg/sql/parser.WalkExpr(0x60707a0, 0xc420f5f958, 0x6080d60, 0xc42016b200, 0x6080960, 0x8, 0xc4201f2f00)
        /Users/jordan/repo/src/github.com/cockroachdb/cockroach/pkg/sql/parser/walk.go:558 +0x15a
github.com/cockroachdb/cockroach/pkg/sql/parser.(*ComparisonExpr).Walk(0xc42045be80, 0x60707a0, 0xc420f5f958, 0x1, 0x6080760)
        /Users/jordan/repo/src/github.com/cockroachdb/cockroach/pkg/sql/parser/walk.go:177 +0x56
github.com/cockroachdb/cockroach/pkg/sql/parser.WalkExpr(0x60707a0, 0xc420f5f958, 0x6080760, 0xc42045be80, 0x1, 0x1, 0x6080d00)
        /Users/jordan/repo/src/github.com/cockroachdb/cockroach/pkg/sql/parser/walk.go:558 +0x15a
github.com/cockroachdb/cockroach/pkg/sql.(*planner).resolveNames(0xc420f5e1c8, 0x6080760, 0xc42045be80, 0xc4203f8080, 0x1, 0x1, 0xc4211b1140, 0x2, 0x2, 0x60798a0, ...)
        /Users/jordan/repo/src/github.com/cockroachdb/cockroach/pkg/sql/select_name_resolution.go:200 +0x205
github.com/cockroachdb/cockroach/pkg/sql.(*planner).analyzeExpr(0xc420f5e1c8, 0x7e861b0, 0xc4211b0750, 0x6080760, 0xc42045be80, 0xc4203f8080, 0x1, 0x1, 0xc4211b1140, 0x2, ...)
        /Users/jordan/repo/src/github.com/cockroachdb/cockroach/pkg/sql/analyze.go:1654 +0x28d
github.com/cockroachdb/cockroach/pkg/sql.(*planner).groupBy(0xc420f5e1c8, 0x7e861b0, 0xc4211b0750, 0xc4204147e0, 0xc420ec4240, 0x0, 0x0, 0x0, 0x0, 0x0)
        /Users/jordan/repo/src/github.com/cockroachdb/cockroach/pkg/sql/group.go:138 +0x1c38
github.com/cockroachdb/cockroach/pkg/sql.(*planner).SelectClause(0xc420f5e1c8, 0x7e861b0, 0xc4211b0750, 0xc4204147e0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, ...)
        /Users/jordan/repo/src/github.com/cockroachdb/cockroach/pkg/sql/render.go:281 +0x424
github.com/cockroachdb/cockroach/pkg/sql.(*planner).Select(0xc420f5e1c8, 0x7e861b0, 0xc4211b0750, 0xc42125fc80, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, ...)
        /Users/jordan/repo/src/github.com/cockroachdb/cockroach/pkg/sql/render.go:196 +0x50f
github.com/cockroachdb/cockroach/pkg/sql.(*planner).newPlan(0xc420f5e1c8, 0x7e861b0, 0xc4211b0750, 0x60811e0, 0xc42125fc80, 0x0, 0x0, 0x0, 0x0, 0x0, ...)
        /Users/jordan/repo/src/github.com/cockroachdb/cockroach/pkg/sql/plan.go:351 +0x29ae
github.com/cockroachdb/cockroach/pkg/sql.(*planner).makePlan(0xc420f5e1c8, 0x7e861b0, 0xc4211b0750, 0x60811e0, 0xc42125fc80, 0xc420b287e0, 0x1, 0x1, 0xc4208b6740, 0x5939caca, ...)
        /Users/jordan/repo/src/github.com/cockroachdb/cockroach/pkg/sql/plan.go:219 +0x8f
github.com/cockroachdb/cockroach/pkg/sql.(*Executor).execStmt(0xc420122000, 0x60811e0, 0xc42125fc80, 0xc420b287e0, 0x1, 0x1, 0xc4208b6740, 0xc420f5e1c8, 0x0, 0x0, ...)

@jordanlewis
Copy link
Member

Although the above was produced on a master build - might want to repro on a 1.0 branch as well to ensure it doesn't have to do with prepared statement caching.

@jordanlewis
Copy link
Member

I can't reproduce this on the build 2cc11e7 that the user reported the crash with. This should be fixed regardless.

@HeikoOnnebrink
Copy link

Just to make clear... the statement generally executes fine ... just during stress test with 3 concurrent workers in a 9 node cluster sometimes (1 out of 1000) execution fails (see problem description on top) up to now I see problem only on nodes where data was not inserted , will do some more tests to confirm the same ..

@jordanlewis
Copy link
Member

Thanks @HeikoOnnebrink.

The related problem on master is caused by the fact that prepare mutates the AST when producing the output plan node. This is troublesome - the prepared statement caching I introduced in #15639 assumes that the AST doesn't get modified in place by prepare, or at least not in such a way that causes it to become invalid when prepare is called on it again.

@knz
Copy link
Contributor

knz commented Jun 9, 2017

IndexedVars are modified in-place I believe (BindIfUnbound).

@jordanlewis
Copy link
Member

But in normal operation, IndexedVars should not be introduced into the parsed AST. The parsed AST should never be mutated. IndexedVars will get added to expressions, but those expressions should always be returned as copies of the original AST. Normally, we do this well.

However, there was a bug:

n.Having.Expr = typedHaving

This line mutates the original parsed AST, causing reuse of that AST to fail. This line is not necessary and should be removed. I will be submitting a PR to do this shortly.

@RaduBerinde
Copy link
Member

Nice find!

@knz
Copy link
Contributor

knz commented Jun 9, 2017

#16434 is a good fix for the problem identified in #16388 (comment) but not for this issue.

@jordanlewis
Copy link
Member

Hmm, yeah I certainly can't prove that this issue is fully solved by #16434. But the two issues are suspiciously similar, so I was planning on optimistically closing this. I wonder if there is some place where we prepare an AST more than once before the prepare caching was introduced.

@knz
Copy link
Contributor

knz commented Jun 9, 2017

not the same AST, no

@RaduBerinde
Copy link
Member

Another case when mutation of the parsed AST is a problem is when retrying statements on the server side.

@RaduBerinde
Copy link
Member

(You could instrument execParsed to retry all queries a couple of times and print out the tree every time)

@knz
Copy link
Contributor

knz commented Jun 12, 2017

From Gitter

@HeikoOnnebrink I restarted all 9 nodes with --vmodule=executor=2. Inserted 4000 rows into dbreqeusts table by connecting to node 10.96.214.113 (DC A).
I started 3 workers against node 10.96.214.113 (DC A), 10.97.176.28 (DC B) & 10.98.215.15 (DC C).
Again only the worker that connected to "remote not DC A" node where data was not inserted paniced.
Logs can be found at https://bitbucket.org/honnebrink/public-file-share/src in files panic-node-1--vmodule.log panic-node-2--vmodule.log
I leave the cluster running with the current log settings... just let me know whenever you need more test runs ..

@HeikoOnnebrink
Copy link

I created a custom build from release-1.0
This branch includes the "pkg/sql: group by mutation bugfix #16434"
Retested the problem from #16388 with 3 test runs and issue seems to be fixed with my custom build.
Question:
When will the changes that I tested today from branch release-1.0 be merged into the docker image cockroachdb/cockroach?
What is your workflow / update cycle of docker images?
Looks for me that at the moment docker tag latest = docker tag release-1.0 != git branch release-1.0
thanks for clarification

@knz
Copy link
Contributor

knz commented Jun 13, 2017

@HeikoOnnebrink I am delighted that Jordan's fix has alleviated your problem. The fix will be released in CockroachDB 1.0.3. The precise release date is not known yet, but usually we do stable releases every 2-3 weeks. We will update the Docker images at the time of release.

@jordanlewis color me impressed. I still currently have zero idea where the AST gets reused in 1.0.

@HeikoOnnebrink
Copy link

@knz just to understand it right... when 1.0.3 will be released which docker image will be updated?
latest + release-1.0 or will a new image with tag v1.0.3 added or all of them ?

@knz
Copy link
Contributor

knz commented Jun 13, 2017

I'm not sure. @benesch can you chime in?

@bdarnell
Copy link
Contributor

bdarnell commented Jun 13, 2017

Each new release will push to latest, release-1.0, and v1.0.x (but we're still working out some kinks in this process; release-1.0 has already been updated prematurely with part of what's being released this week as 1.0.2).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
C-bug Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior. S-1-stability Severe stability issues that can be fixed by upgrading, but usually don’t resolve by restarting
Projects
None yet
Development

No branches or pull requests

6 participants