Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Feature/able to retry all connections #194

Merged
merged 10 commits into from
Aug 13, 2021
Merged

Feature/able to retry all connections #194

merged 10 commits into from
Aug 13, 2021

Conversation

gregingenii
Copy link
Contributor

resolves #193

Description

Adds a 'retry_all' configuration boolean to override the behaviour that only 'retryable' connection errors are attempted again.

Help with which tests need extending or how this can be tested is greatly appreciated.

Checklist

  • [/] I have signed the CLA
  • [/] I have run this code in development and it appears to resolve the stated issue
  • This PR includes tests, or tests are not required/relevant for this PR
  • [/] I have updated the CHANGELOG.md and added information about my change to the "dbt next" section.

@cla-bot cla-bot bot added the cla:yes label Jul 22, 2021
Copy link
Contributor

@jtcohen6 jtcohen6 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the PR @gregingenii! This looks like a clever extension of the workaround we put in place, long ago, to handle long cluster startup times over HTTP.

Could you say just a bit more about:

  • The specific connection issue you see transiently
  • Which connection method you're using when you see it
  • How long/how well this fix has been working for you locally

I don't have a great sense of how we'd want to add a test for this, since I'm not sure we can reliably reproduce a transient connection issue. The code change itself looks quite straightforward, so I'm leaning toward merging it as is.

@gregingenii
Copy link
Contributor Author

Thanks for coming back @jtcohen6 . Answers to your questions:

  • The error we were seeing was the ever helpful and explanatory 'Database Error':
2021-07-19 14:57:47.225853 (MainThread): 
2021-07-19 14:57:47.225957 (MainThread): [Errno 0] Error
2021-07-19 14:57:47.226050 (MainThread): 
2021-07-19 14:57:47.226140 (MainThread): Runtime Error in test source_not_null_<test name> (models/<model>/schema.yml)
2021-07-19 14:57:47.226223 (MainThread):   Database Error
2021-07-19 14:57:47.226359 (MainThread):     failed to connect
2021-07-19 14:57:47.226446 (MainThread): 
2021-07-19 14:57:47.226534 (MainThread): Runtime Error in test source_not_null_<test name>_ (models/<model>/schema.yml)
2021-07-19 14:57:47.226614 (MainThread):   Database Error
2021-07-19 14:57:47.226686 (MainThread):     failed to connect
2021-07-19 14:57:47.226762 (MainThread): 
2021-07-19 14:57:47.226843 (MainThread): [Errno 0] Error
  • We're connecting with http
  • This fix is working well. We've been running with it for a couple of months now with no issues.

If there isn't already a testing framework for connection issues to add on to, I would like to duck implementing one if that's okay. I can also see that a couple of conflicts have arisen so I'll resolve those.

Thank you for your help.

Copy link
Contributor

@jtcohen6 jtcohen6 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Appreciate that context @gregingenii. I just tried this out locally by setting host to 0.0.0.0, it worked great.

The only slight pause I have: For the traditional retry (because a cluster is spinning up), it made sense to have a connect_timeout of 30 seconds or so. In terms of handling query transience, I imagine you'd want this to be more like 5 seconds. I just wanted to mention—I don't think it's a big deal, and I'd much rather reuse the existing parameter that serves the same basic purpose.

Two tiny comments, then this is good to go.

dbt/adapters/spark/connections.py Show resolved Hide resolved
CHANGELOG.md Outdated Show resolved Hide resolved
gregingenii and others added 2 commits August 13, 2021 17:34
Co-authored-by: Jeremy Cohen <jtcohen6@gmail.com>
Copy link
Contributor

@jtcohen6 jtcohen6 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@gregingenii Thanks for this contribution!

@jtcohen6 jtcohen6 merged commit c6817cf into dbt-labs:master Aug 13, 2021
@gregingenii gregingenii deleted the feature/able_to_retry_all_connections branch August 13, 2021 18:04
leahwicz pushed a commit that referenced this pull request Aug 20, 2021
* Code changes

* README changes

* Improve error message default

* Changelog

* Changelog corrections

* Restore accidental deletion

* Update dbt/adapters/spark/connections.py

Co-authored-by: Jeremy Cohen <jtcohen6@gmail.com>

* Add myself to Contributors

Co-authored-by: Jeremy Cohen <jtcohen6@gmail.com>
leahwicz pushed a commit that referenced this pull request Aug 20, 2021
* Code changes

* README changes

* Improve error message default

* Changelog

* Changelog corrections

* Restore accidental deletion

* Update dbt/adapters/spark/connections.py

Co-authored-by: Jeremy Cohen <jtcohen6@gmail.com>

* Add myself to Contributors

Co-authored-by: Jeremy Cohen <jtcohen6@gmail.com>
@leahwicz leahwicz mentioned this pull request Aug 20, 2021
leahwicz added a commit that referenced this pull request Aug 20, 2021
* Code changes

* README changes

* Improve error message default

* Changelog

* Changelog corrections

* Restore accidental deletion

* Update dbt/adapters/spark/connections.py

Co-authored-by: Jeremy Cohen <jtcohen6@gmail.com>

* Add myself to Contributors

Co-authored-by: Jeremy Cohen <jtcohen6@gmail.com>

Co-authored-by: gregingenii <80900458+gregingenii@users.noreply.github.com>
Co-authored-by: Jeremy Cohen <jtcohen6@gmail.com>
leahwicz added a commit that referenced this pull request Sep 17, 2021
* Show more detailed feedback when pyodbc import fails (#192)

* Use exception chaining to get more detailed feedback when pyodbc is not installed

* Remove pyodbc referenced before assignment

* Set back try except

* Add flake ignore

* Add error message to RunTimeException

Error chaining does not show the message in `dbt debug`. Therefore we
explicitly add the message to the dbt.exceptions.RunTimeException

* Update change log

Add to change log that we print the import error when pyodbc can not be imported

* Fix parenthesis in change log

* Update changelog [skip ci]

* Update changelog [skip ci]

* Update changelog [skip ci]

* Add support for ODBC Server Side Parameters (#201)

* Add support for ODBC Server Side Parameters

* Update CHANGELOG

* Feature/able to retry all connections (#194)

* Code changes

* README changes

* Improve error message default

* Changelog

* Changelog corrections

* Restore accidental deletion

* Update dbt/adapters/spark/connections.py

Co-authored-by: Jeremy Cohen <jtcohen6@gmail.com>

* Add myself to Contributors

Co-authored-by: Jeremy Cohen <jtcohen6@gmail.com>

* fixed get_columns_in_relation for open source delta table (#207)

* fixed get_columns_in_relation for open source delta table

* fixed E501 linting error and added change log

* fix issue parsing structs (#204)

* fix issue parsing structs

* include contributor in changelog

* better error explanation

Co-authored-by: Jeremy Cohen <jeremy@fishtownanalytics.com>

* Add adapter unique_field (#211)

* Add adapter unique_field

* Fix flake8. Add changelog entry

* [Snyk] Fix for 2 vulnerabilities (#214)

* fix: requirements.txt to reduce vulnerabilities


The following vulnerabilities are fixed by pinning transitive dependencies:
- https://snyk.io/vuln/SNYK-PYTHON-SQLPARSE-1584201
- https://snyk.io/vuln/SNYK-PYTHON-THRIFT-474615

* Removing Thrift conflict with versions over 12

Co-authored-by: leahwicz <60146280+leahwicz@users.noreply.github.com>

* 0.21.0b1 Release

* Bumping version to 0.21.0b1

* Bumping to 0.21.0b2

* Bumping version to 0.21.0b2

Co-authored-by: Cor <jczuurmond@protonmail.com>
Co-authored-by: Jeremy Cohen <jeremy@fishtownanalytics.com>
Co-authored-by: Jethro Nederhof <jethro@jethron.id.au>
Co-authored-by: gregingenii <80900458+gregingenii@users.noreply.github.com>
Co-authored-by: Jeremy Cohen <jtcohen6@gmail.com>
Co-authored-by: Hariharan Banukumar <hariharan.banukumar@gmail.com>
Co-authored-by: Sergio <ingscc00@gmail.com>
Co-authored-by: Snyk bot <github+bot@snyk.io>
Co-authored-by: Ian Knox <ian.knox@fishtownanalytics.com>
Co-authored-by: Leah Antkiewicz <leah.antkiewicz@dbtlabs.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Configuration setting so all connection errors are retried
2 participants