[CT-791] intermittent database connection error #129

Closed
yduan-polo opened this issue Jun 29, 2022 · 2 comments
Labels
bug Something isn't working

Comments

yduan-polo commented Jun 29, 2022

Describe the bug

We've seen intermittent database connection errors with Redshift, such as:

connection to server at "redshift cluster domain name", port 5439 failed: timeout expired

I've been experimenting with two parameters in profiles.yml:

iam_duration_seconds
keepalives_idle (set to a small value so keepalive probes are sent more often and the connection stays alive)

The errors have been less frequent since. But I don't fully understand how dbt manages connections: does it open a new connection for each model, or reuse an available connection from a pool?
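For context on what keepalives_idle does: in libpq it sets how many seconds a connection may sit idle before TCP sends a keepalive probe. The sketch below shows the same idea on a raw socket using only the Python standard library. It is an illustration of the mechanism, not dbt's or psycopg2's code; the helper name `enable_keepalive` and the 60-second value are assumptions for the example.

```python
import socket


def enable_keepalive(sock, idle_seconds):
    """Turn on TCP keepalive and, where supported, set the idle time before the first probe."""
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    # TCP_KEEPIDLE is available on Linux; other platforms expose different option names,
    # so the idle time is only set when the constant exists.
    if hasattr(socket, "TCP_KEEPIDLE"):
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPIDLE, idle_seconds)
    return sock


# Hypothetical usage: a fresh, unconnected socket with a 60-second idle threshold.
sock = enable_keepalive(socket.socket(socket.AF_INET, socket.SOCK_STREAM), idle_seconds=60)
keepalive_on = sock.getsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE) != 0
sock.close()
```

A smaller idle value means probes start sooner, which can stop an intermediate device from dropping a quiet connection, but it would not help if the initial connection attempt itself times out.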

Steps To Reproduce

No easy way to reproduce

Expected behavior

No connection error unless it's a network error.

Screenshots and log output

If applicable, add screenshots or log output to help explain your problem.

System information

The output of dbt --version:

installed version: 0.19.1
   latest version: 1.0.0

Your version of dbt is out of date! You can find instructions for upgrading here:
https://docs.getdbt.com/docs/installation

Plugins:
  - postgres: 0.19.1
  - redshift: 0.19.1
  - snowflake: 0.19.1
  - bigquery: 0.19.1

The operating system you're using:

Amazon Linux docker container running on ecs cluster

The output of python --version:

Python 3.8.10

Additional context

We connect to Redshift using IAM role authentication.

@yduan-polo yduan-polo added bug Something isn't working triage labels Jun 29, 2022
@github-actions github-actions bot changed the title intermittent database connection error [CT-791] intermittent database connection error Jun 29, 2022
@jtcohen6 jtcohen6 removed the triage label Jul 5, 2022
Contributor

jtcohen6 commented Jul 5, 2022

@yduan-polo Thanks for opening! We've been seeing this error more recently ourselves, in daily CI testing.

I believe that, when running on Redshift, dbt does open a new connection for each model, uses it for the duration of that model's run, and closes it once the model has finished running. While the number of connections is quite relevant to concurrency limits on Redshift clusters, I'm not sure it's the most salient detail for debugging connection timeouts.

I'm hopeful that some in-progress work adding retry logic to dbt's connection mechanism for Postgres/Redshift would help us out here. That feels like the most promising thread to keep pulling on: #96, dbt-labs/dbt-core#5022, dbt-labs/dbt-core#5432
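To make the retry idea concrete, here is a minimal sketch of retrying a connection attempt with exponential backoff. It is not the implementation from the linked issues or pull requests; `connect_with_retry`, its parameters, and the default of three retries are all hypothetical names and values chosen for illustration.

```python
import time


def connect_with_retry(connect, retries=3, base_delay=1.0,
                       retryable=(OSError,), sleep=time.sleep):
    """Call connect(); on a retryable error, wait and try again.

    Makes at most `retries` additional attempts after the first one,
    doubling the wait each time (base_delay, 2*base_delay, 4*base_delay, ...).
    """
    attempt = 0
    while True:
        try:
            return connect()
        except retryable:
            if attempt >= retries:
                # Out of retries: let the last error propagate to the caller.
                raise
            sleep(base_delay * 2 ** attempt)
            attempt += 1
```

With a real driver, the `connect` argument would be a zero-argument callable that opens the connection, and `retryable` would need to name the driver's own connection error (for psycopg2, presumably `psycopg2.OperationalError`). Errors outside `retryable`, such as bad credentials surfacing as a different exception type, are not retried.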

Contributor

jtcohen6 commented Jul 5, 2022

Going to close as a duplicate of #96, to keep the conversation centralized over there. This is definitely on our radar.

@jtcohen6 jtcohen6 closed this as not planned Jul 5, 2022