With dbt's new approach to multithreading, exceptions raised in runner.call_runner cause the thread pool to hang. I'm having a hard time finding the expected behavior of a thread pool when a worker raises an exception, but it appears that 1) the exception message is swallowed and 2) the pool ends up blocking on the next get() call.
I can think of two ways to work around this:
- methods called by call_runner should never raise
- a wrapper function around call_runner should be provided which logs exceptions but does not raise them
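A minimal sketch of the second workaround, assuming the pool is a `multiprocessing.pool.ThreadPool`. The names `safe_call` and `run_all` are hypothetical and are not part of dbt; `func` stands in for call_runner. The wrapper logs the exception and hands it back as data, so every `get()` returns instead of waiting on a result that never arrives:

```python
import logging
from multiprocessing.pool import ThreadPool

logger = logging.getLogger(__name__)


def safe_call(func, *args):
    """Hypothetical wrapper: run func, log any exception, never raise.

    Returns a (result, error) pair so the caller can inspect failures.
    """
    try:
        return func(*args), None
    except Exception as exc:
        # Log the full traceback so the error message is not swallowed.
        logger.exception("worker raised while running %r", args)
        return None, exc


def run_all(func, items, threads=4):
    """Run func over items in a thread pool, collecting (result, error) pairs."""
    pool = ThreadPool(threads)
    try:
        pending = [pool.apply_async(safe_call, (func, item)) for item in items]
        # Each get() returns promptly because safe_call never raises.
        return [p.get() for p in pending]
    finally:
        pool.close()
        pool.join()
```

The caller then decides what to do with the collected errors, for example reporting them all at the end of the run.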
This is the general idea behind the bug, but the specific situation where I hit it is related to the raise_on_first_error attribute of certain node runners. The CompileRunner has this value set to True, so errors during compilation cause an exception to be raised in the call_runner method. No other runners have raise_on_first_error enabled, so I'm only able to reproduce this bug with dbt compile (and dbt docs generate, which invokes dbt compile).
In addition to ensuring that call_runner never raises (or handling the exceptions in the pool, if that's possible), we should do something about raise_on_first_error. I'm unsure how useful this attribute is in its current incarnation, and I think I'd be ok with just removing it. The other option is to keep that attribute, and then flip some flag that ultimately terminates the pool from the main thread.
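The "flip some flag" option could look roughly like the sketch below. It is an illustration under assumptions, not dbt's implementation: `run_until_first_error` is a hypothetical name, `func` stands in for call_runner, and the flag is a `threading.Event`. Workers record the first failure and set the flag; tasks that start afterwards are skipped; the main thread shuts the pool down and only then re-raises.

```python
import threading
from multiprocessing.pool import ThreadPool


def run_until_first_error(func, items, threads=4):
    """Hypothetical sketch: stop scheduling work after the first failure."""
    stop = threading.Event()  # the flag flipped by a failing worker
    errors = []

    def worker(item):
        if stop.is_set():
            return None  # an earlier task failed, so skip this one
        try:
            return func(item)
        except Exception as exc:
            errors.append(exc)  # record the failure instead of raising in the worker
            stop.set()
            return None

    pool = ThreadPool(threads)
    try:
        pending = [pool.apply_async(worker, (item,)) for item in items]
        results = [p.get() for p in pending]
    finally:
        pool.close()
        pool.join()

    # Only the main thread raises, after the pool has been shut down.
    if errors:
        raise errors[0]
    return results
```

Because the worker itself never raises, the pool cannot block on `get()`, and the raise_on_first_error behavior is preserved at the level of the main thread.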
Results
The thread pool blocked on a get() call after an exception was raised in dbt compile.
This bug manifests when a CompilationError occurs at "compile time". Crucially though, most compile-time errors are also parse-time errors, so dbt stops before compilation ever starts. Here's an example of a model that will succeed at parse time (allowing dbt to proceed to compilation), but that will fail at compile time:
-- models/my_model.sql
{% if execute %}
select * from {{ ref('notfound') }}
{% else %}
select 1 as id
{% endif %}
Then run:
$ dbt compile