Sentry: error.go:78: unexpected error from the vectorized engine: node unavailable; try another peer (1) attached stack trace -- stack trace: | github.com/cockroachdb/cockroach/pkg/sql/colflow/col... #132799

Closed
cockroach-sentry opened this issue Oct 17, 2024 · 1 comment
Labels
branch-release-24.2: Used to mark GA and release blockers, technical advisories, and bugs for 24.2
C-bug: Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior.
O-sentry: Originated from an in-the-wild panic report.
T-sql-queries: SQL Queries Team
X-blathers-triaged: blathers was able to find an owner

Comments

@cockroach-sentry
Collaborator

cockroach-sentry commented Oct 17, 2024

This issue was auto filed by Sentry. It represents a crash or reported error on a live cluster with telemetry enabled.

Sentry Link: https://cockroach-labs.sentry.io/issues/5997469397/?referrer=webhooks_plugin

Panic Message:

error.go:78: unexpected error from the vectorized engine: node unavailable; try another peer
(1) attached stack trace
  -- stack trace:
  | github.com/cockroachdb/cockroach/pkg/sql/colflow/colrpc.(*Inbox).Next.func1
  | 	github.com/cockroachdb/cockroach/pkg/sql/colflow/colrpc/inbox.go:324
  | runtime.gopanic
  | 	GOROOT/src/runtime/panic.go:770
  | github.com/cockroachdb/cockroach/pkg/sql/colexecerror.ExpectedError
  | 	github.com/cockroachdb/cockroach/pkg/sql/colexecerror/error.go:299
  | github.com/cockroachdb/cockroach/pkg/sql/colflow/colrpc.(*Inbox).Next
  | 	github.com/cockroachdb/cockroach/pkg/sql/colflow/colrpc/inbox.go:385
  | github.com/cockroachdb/cockroach/pkg/sql/colexecop.(*noopOperator).Next
  | 	github.com/cockroachdb/cockroach/pkg/sql/colexecop/operator.go:430
  | github.com/cockroachdb/cockroach/pkg/sql/colflow.(*BatchFlowCoordinator).nextAdapter
  | 	github.com/cockroachdb/cockroach/pkg/sql/colflow/flow_coordinator.go:246
  | github.com/cockroachdb/cockroach/pkg/sql/colexecerror.CatchVectorizedRuntimeError
  | 	github.com/cockroachdb/cockroach/pkg/sql/colexecerror/error.go:152
  | github.com/cockroachdb/cockroach/pkg/sql/colflow.(*BatchFlowCoordinator).next
  | 	github.com/cockroachdb/cockroach/pkg/sql/colflow/flow_coordinator.go:250
  | github.com/cockroachdb/cockroach/pkg/sql/colflow.(*BatchFlowCoordinator).Run
  | 	github.com/cockroachdb/cockroach/pkg/sql/colflow/flow_coordinator.go:282
  | github.com/cockroachdb/cockroach/pkg/sql/colflow.(*vectorizedFlow).Run
  | 	github.com/cockroachdb/cockroach/pkg/sql/colflow/vectorized_flow.go:320
  | github.com/cockroachdb/cockroach/pkg/sql.(*DistSQLPlanner).Run
  | 	github.com/cockroachdb/cockroach/pkg/sql/distsql_running.go:932
  | github.com/cockroachdb/cockroach/pkg/sql.(*DistSQLPlanner).PlanAndRun
  | 	github.com/cockroachdb/cockroach/pkg/sql/distsql_running.go:1994
  | github.com/cockroachdb/cockroach/pkg/sql.(*DistSQLPlanner).PlanAndRunAll.func3
  | 	github.com/cockroachdb/cockroach/pkg/sql/distsql_running.go:1708
  | github.com/cockroachdb/cockroach/pkg/sql.(*DistSQLPlanner).PlanAndRunAll
  | 	github.com/cockroachdb/cockroach/pkg/sql/distsql_running.go:1711
  | github.com/cockroachdb/cockroach/pkg/sql.(*connExecutor).execWithDistSQLEngine
  | 	github.com/cockroachdb/cockroach/pkg/sql/conn_executor_exec.go:2420
  | github.com/cockroachdb/cockroach/pkg/sql.(*connExecutor).dispatchToExecutionEngine
  | 	github.com/cockroachdb/cockroach/pkg/sql/conn_executor_exec.go:1967
  | github.com/cockroachdb/cockroach/pkg/sql.(*connExecutor).execStmtInOpenState
  | 	github.com/cockroachdb/cockroach/pkg/sql/conn_executor_exec.go:1174
  | github.com/cockroachdb/cockroach/pkg/sql.(*connExecutor).execStmt.func1
  | 	github.com/cockroachdb/cockroach/pkg/sql/conn_executor_exec.go:146
  | github.com/cockroachdb/cockroach/pkg/sql.(*connExecutor).execWithProfiling
  | 	github.com/cockroachdb/cockroach/pkg/sql/conn_executor_exec.go:3429
  | github.com/cockroachdb/cockroach/pkg/sql.(*connExecutor).execStmt
  | 	github.com/cockroachdb/cockroach/pkg/sql/conn_executor_exec.go:145
  | github.com/cockroachdb/cockroach/pkg/sql.(*connExecutor).execPortal
  | 	github.com/cockroachdb/cockroach/pkg/sql/conn_executor_exec.go:251
  | github.com/cockroachdb/cockroach/pkg/sql.(*connExecutor).execCmd.func2
  | 	github.com/cockroachdb/cockroach/pkg/sql/conn_executor.go:2421
  | github.com/cockroachdb/cockroach/pkg/sql.(*connExecutor).execCmd
  | 	github.com/cockroachdb/cockroach/pkg/sql/conn_executor.go:2423
  | github.com/cockroachdb/cockroach/pkg/sql.(*connExecutor).run
  | 	github.com/cockroachdb/cockroach/pkg/sql/conn_executor.go:2238
  | github.com/cockroachdb/cockroach/pkg/sql.(*Server).ServeConn
  | 	github.com/cockroachdb/cockroach/pkg/sql/conn_executor.go:963
  | github.com/cockroachdb/cockroach/pkg/sql/pgwire.(*conn).processCommands
  | 	github.com/cockroachdb/cockroach/pkg/sql/pgwire/conn.go:256
  | github.com/cockroachdb/cockroach/pkg/sql/pgwire.(*Server).serveImpl.func4
  | 	github.com/cockroachdb/cockroach/pkg/sql/pgwire/server.go:1136
  | runtime.goexit
  | 	src/runtime/asm_amd64.s:1695
Wraps: (2)
Wraps: (3) tags: [n2,client=100.64.3.253:39310,hostnossl,user=×,f×,distsql.stmt=×,distsql.gateway=2,distsql.appname=×,distsql.txn=×,streamID=×,received-error]
Wraps: (4) tags: [n×,f×,distsql.stmt=×,distsql.gateway=×,distsql.appname=×,distsql.txn=×,streamID=×,sent-error=]
Wraps: (5) assertion failure
Wraps: (6)
  | (opaque error wrapper)
  | type name: github.com/cockroachdb/errors/withstack/*withstack.withStack
  | reportable 0:
  |
  | github.com/cockroachdb/cockroach/pkg/sql/colexecerror.CatchVectorizedRuntimeError.func1
  | 	github.com/cockroachdb/cockroach/pkg/sql/colexecerror/error.go:78
  | runtime.gopanic
  | 	GOROOT/src/runtime/panic.go:770
  | github.com/cockroachdb/cockroach/pkg/sql/colexecerror.InternalError
  | 	github.com/cockroachdb/cockroach/pkg/sql/colexecerror/error.go:291
  | github.com/cockroachdb/cockroach/pkg/sql/colfetcher.(*ColIndexJoin).Next
  | 	github.com/cockroachdb/cockroach/pkg/sql/colfetcher/index_join.go:226
  | github.com/cockroachdb/cockroach/pkg/sql/colexec/colexecutils.(*CancelChecker).Next
  | 	github.com/cockroachdb/cockroach/pkg/sql/colexec/colexecutils/cancel_checker.go:59
  | github.com/cockroachdb/cockroach/pkg/sql/colexec.(*allSpooler).spool
  | 	github.com/cockroachdb/cockroach/pkg/sql/colexec/sort.go:136
  | github.com/cockroachdb/cockroach/pkg/sql/colexec.(*sortOp).Next
  | 	github.com/cockroachdb/cockroach/pkg/sql/colexec/sort.go:280
  | github.com/cockroachdb/cockroach/pkg/sql/colexec/colexecdisk.(*diskSpillerBase).Next.func1
  | 	github.com/cockroachdb/cockroach/pkg/sql/colexec/colexecdisk/disk_spiller.go:202
  | github.com/cockroachdb/cockroach/pkg/sql/colexecerror.CatchVectorizedRuntimeError
  | 	github.com/cockroachdb/cockroach/pkg/sql/colexecerror/error.go:152
  | github.com/cockroachdb/cockroach/pkg/sql/colexec/colexecdisk.(*diskSpillerBase).Next
  | 	github.com/cockroachdb/cockroach/pkg/sql/colexec/colexecdisk/disk_spiller.go:200
  | github.com/cockroachdb/cockroach/pkg/sql/colexec/colexecbase.(*simpleProjectOp).Next
  | 	github.com/cockroachdb/cockroach/pkg/sql/colexec/colexecbase/simple_project.go:124
  | github.com/cockroachdb/cockroach/pkg/sql/colexec/colexecutils.(*deselectorOp).Next
  | 	github.com/cockroachdb/cockroach/pkg/sql/colexec/colexecutils/deselector.go:53
  | github.com/cockroachdb/cockroach/pkg/sql/colflow/colrpc.(*Outbox).sendBatches.func1
  | 	github.com/cockroachdb/cockroach/pkg/sql/colflow/colrpc/outbox.go:273
  | github.com/cockroachdb/cockroach/pkg/sql/colexecerror.CatchVectorizedRuntimeError
  | 	github.com/cockroachdb/cockroach/pkg/sql/colexecerror/error.go:152
  | github.com/cockroachdb/cockroach/pkg/sql/colflow/colrpc.(*Outbox).sendBatches
  | 	github.com/cockroachdb/cockroach/pkg/sql/colflow/colrpc/outbox.go:264
  | github.com/cockroachdb/cockroach/pkg/sql/colflow/colrpc.(*Outbox).runWithStream
  | 	github.com/cockroachdb/cockroach/pkg/sql/colflow/colrpc/outbox.go:402
  | github.com/cockroachdb/cockroach/pkg/sql/colflow/colrpc.(*Outbox).Run
  | 	github.com/cockroachdb/cockroach/pkg/sql/colflow/colrpc/outbox.go:225
  | github.com/cockroachdb/cockroach/pkg/sql/colflow.(*vectorizedFlowCreator).setupRemoteOutputStream.func1
  | 	github.com/cockroachdb/cockroach/pkg/sql/colflow/vectorized_flow.go:766
  | github.com/cockroachdb/cockroach/pkg/sql/colflow.(*vectorizedFlowCreator).setupRemoteOutputStream.(*vectorizedFlowCreator).accumulateAsyncComponent.func2.1
  | 	github.com/cockroachdb/cockroach/pkg/sql/colflow/vectorized_flow.go:728
  | runtime.goexit
  | 	src/runtime/asm_amd64.s:1695
Wraps: (7) unexpected error from the vectorized engine
Wraps: (8) node unavailable; try another peer
  | -- cause hidden behind barrier
  | node unavailable; try another peer
  | (1) node unavailable; try another peer
  | Error types: (1) *kvpb.NodeUnavailableError
Error types: (1) *withstack.withStack (2) *colexecerror.notInternalError (3) *contexttags.withContext (4) *contexttags.withContext (5) *assert.withAssertionFailure (6) *errbase.opaqueWrapper (7) *errutil.withPrefix (8) *barriers.barrierErr
-- report composition:
*barriers.barrierErr...
Stacktrace:

src/runtime/asm_amd64.s#L1694-L1696

defer procWg.Done()
c.processCommands(
ctx,

reservedOwned = false // We're about to pass ownership away.
retErr = sqlServer.ServeConn(
ctx,

}(ctx, h)
return h.ex.run(ctx, s.pool, reserved, cancel)
}

var err error
if err = ex.execCmd(); err != nil {
// Both of these errors are normal ways for the connExecutor to exit.

return err
}()
// Note: we write to ex.statsCollector.phaseTimes, instead of ex.phaseTimes,

canAutoCommit := ex.implicitTxn() && tcmd.FollowedBySync
ev, payload, err = ex.execPortal(ctx, portal, portalName, stmtRes, pinfo, canAutoCommit)
return err

}
ev, payload, retErr = ex.execStmt(ctx, portal.Stmt.Statement, &portal, pinfo, stmtRes, canAutoCommit)
// For a non-pausable portal, it is considered exhausted regardless of the

}
err = ex.execWithProfiling(ctx, ast, preparedStmt, func(ctx context.Context) error {
ev, payload, err = ex.execStmtInOpenState(ctx, parserStmt, portal, pinfo, res, canAutoCommit)

} else {
err = op(ctx)
}

err = ex.execWithProfiling(ctx, ast, preparedStmt, func(ctx context.Context) error {
ev, payload, err = ex.execStmtInOpenState(ctx, parserStmt, portal, pinfo, res, canAutoCommit)
return err

} else {
if err := ex.dispatchToExecutionEngine(stmtCtx, p, res); err != nil {
stmtThresholdSpan.Finish()

ex.sessionTracing.TraceExecStart(ctx, "distributed")
stats, err := ex.execWithDistSQLEngine(
ctx, planner, stmt.AST.StatementReturnType(), res, distribute, progAtomic, distSQLProhibitedErr,

}
err = ex.server.cfg.DistSQLPlanner.PlanAndRunAll(ctx, evalCtx, planCtx, planner, recv, evalCtxFactory)
}

)
}()

defer cleanup()
dsp.PlanAndRun(
ctx, evalCtx, planCtx, planner.txn, planner.curPlan.main, recv, finishedSetupFn,

recv.expectedRowsRead = int64(physPlan.TotalEstimatedScannedRows)
dsp.Run(ctx, planCtx, txn, physPlan, recv, evalCtx, finishedSetupFn)
}

noWait := planCtx.getPortalPauseInfo() != nil
flow.Run(ctx, noWait)
}

log.VEvent(ctx, 1, "running the batch flow coordinator in the flow's goroutine")
f.batchFlowCoordinator.Run(ctx)
}

for status == execinfra.NeedMoreRows {
err := f.next()
if err != nil {

func (f *BatchFlowCoordinator) next() error {
return colexecerror.CatchVectorizedRuntimeError(f.nextAdapter)
}

}()
operation()
return retErr

func (f *BatchFlowCoordinator) nextAdapter() {
f.batch = f.input.Root.Next()
}

func (n *noopOperator) Next() coldata.Batch {
return n.Input.Next()
}

if receivedErr != nil {
colexecerror.ExpectedError(receivedErr)
}

func ExpectedError(err error) {
panic(newNotInternalError(err))
}

GOROOT/src/runtime/panic.go#L769-L771
}
err := logcrash.PanicAsError(0, panicObj)
log.VEventf(i.Ctx, 1, "Inbox encountered an error in Next: %v", err)

src/runtime/asm_amd64.s#L1694-L1696
defer wg.Done()
run(ctx, flowCtxCancel)
}()

run := func(ctx context.Context, flowCtxCancel context.CancelFunc) {
outbox.Run(
ctx,

log.VEvent(ctx, 2, "Outbox starting normal operation")
o.runWithStream(ctx, stream, flowCtxCancel, outboxCtxCancel)
log.VEvent(ctx, 2, "Outbox exiting")

terminatedGracefully, errToSend := o.sendBatches(ctx, stream, flowCtxCancel, outboxCtxCancel)
if terminatedGracefully || errToSend != nil {

}
errToSend = colexecerror.CatchVectorizedRuntimeError(func() {
o.Input.Init(o.runnerCtx)

}()
operation()
return retErr

batch := o.Input.Next()
n := batch.Length()

func (p *deselectorOp) Next() coldata.Batch {
batch := p.Input.Next()
if batch.Selection() == nil || batch.Length() == 0 {

func (d *simpleProjectOp) Next() coldata.Batch {
batch := d.Input.Next()
if batch.Length() == 0 {

var batch coldata.Batch
if err := colexecerror.CatchVectorizedRuntimeError(
func() {

}()
operation()
return retErr

func() {
batch = d.inMemoryOp.Next()
},

case sortSpooling:
p.input.spool()
p.state = sortSorting

p.spooled = true
for batch := p.Input.Next(); batch.Length() != 0; batch = p.Input.Next() {
p.bufferedTuples.AppendTuples(batch, 0 /* startIdx */, batch.Length())

c.CheckEveryCall()
return c.Input.Next()
}

); err != nil {
colexecerror.InternalError(err)
}

func InternalError(err error) {
panic(newInternalError(err))
}

GOROOT/src/runtime/panic.go#L769-L771
if code := pgerror.GetPGCode(retErr); code == pgcode.Uncategorized {
retErr = errors.NewAssertionErrorWithWrappedErrf(
retErr, "unexpected error from the vectorized engine",

src/runtime/asm_amd64.s in runtime.goexit at line 1695
pkg/sql/pgwire/server.go in pkg/sql/pgwire.(*Server).serveImpl.func4 at line 1136
pkg/sql/pgwire/conn.go in pkg/sql/pgwire.(*conn).processCommands at line 256
pkg/sql/conn_executor.go in pkg/sql.(*Server).ServeConn at line 963
pkg/sql/conn_executor.go in pkg/sql.(*connExecutor).run at line 2238
pkg/sql/conn_executor.go in pkg/sql.(*connExecutor).execCmd at line 2423
pkg/sql/conn_executor.go in pkg/sql.(*connExecutor).execCmd.func2 at line 2421
pkg/sql/conn_executor_exec.go in pkg/sql.(*connExecutor).execPortal at line 251
pkg/sql/conn_executor_exec.go in pkg/sql.(*connExecutor).execStmt at line 145
pkg/sql/conn_executor_exec.go in pkg/sql.(*connExecutor).execWithProfiling at line 3429
pkg/sql/conn_executor_exec.go in pkg/sql.(*connExecutor).execStmt.func1 at line 146
pkg/sql/conn_executor_exec.go in pkg/sql.(*connExecutor).execStmtInOpenState at line 1174
pkg/sql/conn_executor_exec.go in pkg/sql.(*connExecutor).dispatchToExecutionEngine at line 1967
pkg/sql/conn_executor_exec.go in pkg/sql.(*connExecutor).execWithDistSQLEngine at line 2420
pkg/sql/distsql_running.go in pkg/sql.(*DistSQLPlanner).PlanAndRunAll at line 1711
pkg/sql/distsql_running.go in pkg/sql.(*DistSQLPlanner).PlanAndRunAll.func3 at line 1708
pkg/sql/distsql_running.go in pkg/sql.(*DistSQLPlanner).PlanAndRun at line 1994
pkg/sql/distsql_running.go in pkg/sql.(*DistSQLPlanner).Run at line 932
pkg/sql/colflow/vectorized_flow.go in pkg/sql/colflow.(*vectorizedFlow).Run at line 320
pkg/sql/colflow/flow_coordinator.go in pkg/sql/colflow.(*BatchFlowCoordinator).Run at line 282
pkg/sql/colflow/flow_coordinator.go in pkg/sql/colflow.(*BatchFlowCoordinator).next at line 250
pkg/sql/colexecerror/error.go in pkg/sql/colexecerror.CatchVectorizedRuntimeError at line 152
pkg/sql/colflow/flow_coordinator.go in pkg/sql/colflow.(*BatchFlowCoordinator).nextAdapter at line 246
pkg/sql/colexecop/operator.go in pkg/sql/colexecop.(*noopOperator).Next at line 430
pkg/sql/colflow/colrpc/inbox.go in pkg/sql/colflow/colrpc.(*Inbox).Next at line 385
pkg/sql/colexecerror/error.go in pkg/sql/colexecerror.ExpectedError at line 299
GOROOT/src/runtime/panic.go in runtime.gopanic at line 770
pkg/sql/colflow/colrpc/inbox.go in pkg/sql/colflow/colrpc.(*Inbox).Next.func1 at line 324
src/runtime/asm_amd64.s in runtime.goexit at line 1695
pkg/sql/colflow/vectorized_flow.go in pkg/sql/colflow.(*vectorizedFlowCreator).setupRemoteOutputStream.(*vectorizedFlowCreator).accumulateAsyncComponent.func2.1 at line 728
pkg/sql/colflow/vectorized_flow.go in pkg/sql/colflow.(*vectorizedFlowCreator).setupRemoteOutputStream.func1 at line 766
pkg/sql/colflow/colrpc/outbox.go in pkg/sql/colflow/colrpc.(*Outbox).Run at line 225
pkg/sql/colflow/colrpc/outbox.go in pkg/sql/colflow/colrpc.(*Outbox).runWithStream at line 402
pkg/sql/colflow/colrpc/outbox.go in pkg/sql/colflow/colrpc.(*Outbox).sendBatches at line 264
pkg/sql/colexecerror/error.go in pkg/sql/colexecerror.CatchVectorizedRuntimeError at line 152
pkg/sql/colflow/colrpc/outbox.go in pkg/sql/colflow/colrpc.(*Outbox).sendBatches.func1 at line 273
pkg/sql/colexec/colexecutils/deselector.go in pkg/sql/colexec/colexecutils.(*deselectorOp).Next at line 53
pkg/sql/colexec/colexecbase/simple_project.go in pkg/sql/colexec/colexecbase.(*simpleProjectOp).Next at line 124
pkg/sql/colexec/colexecdisk/disk_spiller.go in pkg/sql/colexec/colexecdisk.(*diskSpillerBase).Next at line 200
pkg/sql/colexecerror/error.go in pkg/sql/colexecerror.CatchVectorizedRuntimeError at line 152
pkg/sql/colexec/colexecdisk/disk_spiller.go in pkg/sql/colexec/colexecdisk.(*diskSpillerBase).Next.func1 at line 202
pkg/sql/colexec/sort.go in pkg/sql/colexec.(*sortOp).Next at line 280
pkg/sql/colexec/sort.go in pkg/sql/colexec.(*allSpooler).spool at line 136
pkg/sql/colexec/colexecutils/cancel_checker.go in pkg/sql/colexec/colexecutils.(*CancelChecker).Next at line 59
pkg/sql/colfetcher/index_join.go in pkg/sql/colfetcher.(*ColIndexJoin).Next at line 226
pkg/sql/colexecerror/error.go in pkg/sql/colexecerror.InternalError at line 291
GOROOT/src/runtime/panic.go in runtime.gopanic at line 770
pkg/sql/colexecerror/error.go in pkg/sql/colexecerror.CatchVectorizedRuntimeError.func1 at line 78

Tags

| Tag | Value |
| --- | --- |
| Command | server |
| Environment | v24.2.3 |
| Go Version | go1.22.5 X:nocoverageredesign |
| Platform | linux amd64 |
| Distribution | CCL |
| Cockroach Release | v24.2.3 |
| Cockroach SHA | 217b43e |
| # of CPUs | 4 |
| # of Goroutines | 373 |

Jira issue: CRDB-43283

@cockroach-sentry added the branch-release-24.2, C-bug, and O-sentry labels on Oct 17, 2024
@yuzefovich added the T-sql-queries and X-blathers-triaged labels on Nov 12, 2024
@github-project-automation (bot) moved this to Triage in SQL Queries on Nov 12, 2024
@yuzefovich
Member

dup of #128649

@github-project-automation (bot) moved this from Triage to Done in SQL Queries on Nov 12, 2024