[CT-791] intermittent database connection error #129

yduan-polo · 2022-06-29T17:55:52Z

Describe the bug

We've seen intermittent database connection errors with Redshift, such as:

connection to server at "redshift cluster domain name", port 5439 failed: timeout expired

I've been experimenting with two parameters in profiles.yml:

iam_duration_seconds
keepalives_idle (set to a small value so that keepalive probes are sent more frequently and the connection stays alive)

The errors have been less frequent since. But I don't fully understand how dbt manages connections: does it open a new connection for each model, or reuse an available connection from a pool?
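For reference, here is a minimal sketch of how a keepalive setting like the one above might end up in a libpq-style connection string. It assumes the profile value is passed through to the driver as the libpq parameter `keepalives_idle`, which this thread does not confirm. The helper names `build_connect_kwargs` and `to_dsn`, the default values, and the host `example.invalid` are hypothetical and only for illustration.

```python
def build_connect_kwargs(host, port=5439, connect_timeout=10, keepalives_idle=60):
    """Collect libpq-style connection parameters in a dict.

    keepalives, keepalives_idle and connect_timeout are libpq parameter
    names; the default values here are illustrative assumptions, not
    recommendations from dbt or Redshift.
    """
    return {
        "host": host,
        "port": port,
        # Seconds to wait while establishing the connection.
        "connect_timeout": connect_timeout,
        # 1 enables TCP keepalives on the client side.
        "keepalives": 1,
        # Seconds of inactivity before the first keepalive probe is sent.
        "keepalives_idle": keepalives_idle,
    }


def to_dsn(kwargs):
    """Render the parameters as a space-separated key=value string.

    Values containing spaces would need quoting; this sketch does not
    handle that case.
    """
    return " ".join(f"{key}={value}" for key, value in kwargs.items())
```

Calling `to_dsn(build_connect_kwargs("example.invalid", keepalives_idle=30))` produces a string that a libpq-based driver could accept. A smaller `keepalives_idle` makes the client probe an idle connection sooner.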

Steps To Reproduce

No easy way to reproduce

Expected behavior

No connection error unless it's a network error.

Screenshots and log output


System information

The output of dbt --version:

installed version: 0.19.1
   latest version: 1.0.0

Your version of dbt is out of date! You can find instructions for upgrading here:
https://docs.getdbt.com/docs/installation

Plugins:
  - postgres: 0.19.1
  - redshift: 0.19.1
  - snowflake: 0.19.1
  - bigquery: 0.19.1

The operating system you're using:

Amazon Linux Docker container running on an ECS cluster

The output of python --version:

Python 3.8.10

Additional context

We connect to Redshift using IAM role authentication.


jtcohen6 · 2022-07-05T13:28:56Z

@yduan-polo Thanks for opening! We've been seeing this error more recently ourselves, in daily CI testing.

I believe that, when running on Redshift, dbt does open a new connection for each model, uses it for the duration of that model's run, and closes it once the model has finished running. While the number of connections is quite relevant to concurrency limits on Redshift clusters, I'm not sure it's the most salient detail for debugging connection timeouts.

I'm hopeful that some in-progress work adding retry logic to dbt's connection mechanism for Postgres/Redshift would help us out here. That feels like the most promising thread to keep pulling on: #96, dbt-labs/dbt-core#5022, dbt-labs/dbt-core#5432
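To illustrate the general idea of retry logic around opening a connection, here is a minimal sketch using exponential backoff. It is not the implementation from the linked issues and pull requests. The function name `connect_with_retry`, its parameters, and the choice of retryable exception types are all assumptions made for this example.

```python
import time


def connect_with_retry(
    connect,
    retries=3,
    base_delay=1.0,
    retryable=(TimeoutError, ConnectionError),
    sleep=time.sleep,
):
    """Call connect() and retry on transient failures.

    connect   -- zero-argument callable that returns a connection
    retries   -- maximum number of retries after the first attempt
    retryable -- exception types treated as transient (an assumption;
                 a real driver raises its own exception classes)
    sleep     -- injectable so the backoff can be tested without waiting
    """
    attempt = 0
    while True:
        try:
            return connect()
        except retryable:
            if attempt >= retries:
                # Out of retries: surface the last error to the caller.
                raise
            # Exponential backoff: base_delay, 2*base_delay, 4*base_delay, ...
            sleep(base_delay * (2 ** attempt))
            attempt += 1
```

With a real driver, `connect` would be a small wrapper around the driver's connect call, and `retryable` would list that driver's transient error classes. Errors outside `retryable`, such as bad credentials, propagate immediately instead of being retried.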

jtcohen6 · 2022-07-05T14:57:03Z

Going to close as a duplicate of #96, to keep the conversation centralized over there. This is definitely on our radar.

yduan-polo added the bug and triage labels Jun 29, 2022

github-actions bot changed the title from "intermittent database connection error" to "[CT-791] intermittent database connection error" Jun 29, 2022

jtcohen6 removed the triage label Jul 5, 2022

jtcohen6 added the duplicate label Jul 5, 2022

jtcohen6 closed this as not planned Jul 5, 2022
