Exporter: Add retries for Search, ReadContext and Import operations when importing the resource #3202

Merged: 4 commits, Feb 2, 2024

@alexott alexott commented Feb 1, 2024

Changes

Under very high load, the Databricks backend may not respond in time, or may return specific errors, so it makes sense to retry the operation a few times.

This PR uses a "naive" implementation; I need to experiment a bit more with the retries package before adopting it.

Tests

  • make test run locally
  • relevant change in docs/ folder
  • covered with integration tests in internal/acceptance
  • relevant acceptance tests are passing
  • using Go SDK

@alexott alexott requested review from a team as code owners February 1, 2024 17:12
@alexott alexott requested review from mgyucht and removed request for a team February 1, 2024 17:12
alexott commented Feb 1, 2024

@mgyucht @tanmay-db It would be really useful to get this merged before the release...

codecov-commenter commented Feb 1, 2024

Codecov Report

Attention: 5 lines in your changes are missing coverage. Please review.

Comparison is base (d3acc7b) 83.57% compared to head (6141282) 83.58%.
Report is 4 commits behind head on main.

Additional details and impacted files

@@            Coverage Diff             @@
##             main    #3202      +/-   ##
==========================================
+ Coverage   83.57%   83.58%   +0.01%     
==========================================
  Files         168      168              
  Lines       15021    15063      +42     
==========================================
+ Hits        12554    12591      +37     
- Misses       1729     1733       +4     
- Partials      738      739       +1     
| Files | Coverage Δ |
|---|---|
| exporter/context.go | 81.46% <100.00%> (-0.15%) ⬇️ |
| exporter/model.go | 83.46% <92.30%> (+1.26%) ⬆️ |
| exporter/util.go | 80.08% <80.95%> (+0.14%) ⬆️ |

... and 3 files with indirect coverage changes

@tanmay-db

looking

@alexott alexott force-pushed the exporter-retries-under-high-load branch from b126bc0 to 6fec3ff Compare February 1, 2024 17:39
@alexott alexott changed the title Exporter: Add retries for Search, ReadContext and Import operations when importing the resource DONT MERGE YET: Exporter: Add retries for Search, ReadContext and Import operations when importing the resource Feb 1, 2024
@alexott alexott changed the title DONT MERGE YET: Exporter: Add retries for Search, ReadContext and Import operations when importing the resource Exporter: Add retries for Search, ReadContext and Import operations when importing the resource Feb 2, 2024
@alexott alexott requested a review from tanmay-db February 2, 2024 09:09
@@ -758,7 +758,7 @@ func (ic *importContext) generateAndWriteResources(sh *os.File) {
 	for i, r := range resources {
 		ic.waitGroup.Add(1)
 		resourcesChan <- r
-		if i%50 == 0 {
+		if i%500 == 0 {
Contributor

I am guessing this is because there are too many entries in the logs, so we want to minimise them?

Contributor Author

yes, it's just too much
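
To illustrate the effect of raising the interval from 50 to 500, here is a minimal sketch that counts how many progress lines a loop would emit. The helper names and the total of 15000 resources are assumptions made for illustration only; they are not taken from the exporter code.

```go
package main

import "fmt"

// shouldLogProgress reports whether a progress line would be emitted for
// item i when logging once every `every` items.
// Hypothetical helper, not part of the exporter.
func shouldLogProgress(i, every int) bool {
	return every > 0 && i%every == 0
}

// countProgressLines returns how many progress lines a loop over
// `total` items would emit with the given interval.
func countProgressLines(total, every int) int {
	n := 0
	for i := 0; i < total; i++ {
		if shouldLogProgress(i, every) {
			n++
		}
	}
	return n
}

func main() {
	total := 15000 // assumed workspace size, for illustration only
	fmt.Printf("every 50: %d lines, every 500: %d lines\n",
		countProgressLines(total, 50), countProgressLines(total, 500))
}
```

With these assumed numbers the interval change cuts the progress output by a factor of ten.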

return false
}

func runWithRetries[ERR any](runFunc func() ERR, msg string) ERR {
Contributor

The rest looks good to me. My only concern is that we could use https://github.com/databricks/databricks-sdk-go/blob/main/retries/retries.go here; otherwise, we would have to maintain this retry mechanism separately.

If the change is urgent and using retries would require major refactoring, then we can go ahead with this. What do you think? @mgyucht

Contributor

The retries package is a convenience, not a core part of the SDK. It does not expose any way to, e.g., print a message when a retry takes place, or print a specific message afterwards that includes the total number of retries that have happened, so if @alexott needs these, it's perfectly fine for him to implement his own retry logic.
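
A minimal sketch of the kind of hand-rolled retry loop discussed here, one that logs every retry together with the attempt count. The function name `retryWithLog`, the attempt limit, the delay, and the list of retryable error substrings are all assumptions for illustration; this is not the PR's `runWithRetries` implementation, whose body is not shown on this page.

```go
package main

import (
	"errors"
	"fmt"
	"log"
	"strings"
	"time"
)

// Assumed limit, for illustration only.
const maxAttempts = 5

// Assumed list of error fragments that are worth retrying.
var retryableSubstrings = []string{"context deadline exceeded", "Too Many Requests"}

// Simulated errors used by the usage example below.
var (
	errSimulatedTimeout  = errors.New("context deadline exceeded")
	errSimulatedNotFound = errors.New("resource not found")
)

// isRetryable reports whether err contains one of the retryable substrings.
func isRetryable(err error) bool {
	if err == nil {
		return false
	}
	for _, s := range retryableSubstrings {
		if strings.Contains(err.Error(), s) {
			return true
		}
	}
	return false
}

// retryWithLog calls run until it succeeds, returns a non-retryable error,
// or maxAttempts is reached. Each failed retryable attempt is logged with
// msg and the attempt number.
func retryWithLog(run func() error, msg string, delay time.Duration) error {
	var err error
	for attempt := 1; attempt <= maxAttempts; attempt++ {
		err = run()
		if !isRetryable(err) {
			return err // success (nil) or an error we should not retry
		}
		log.Printf("[INFO] %s failed (attempt %d of %d): %v", msg, attempt, maxAttempts, err)
		if attempt < maxAttempts {
			time.Sleep(delay)
		}
	}
	return fmt.Errorf("%s: giving up after %d attempts: %w", msg, maxAttempts, err)
}

func main() {
	calls := 0
	err := retryWithLog(func() error {
		calls++
		if calls < 3 {
			return errSimulatedTimeout // fail twice, then succeed
		}
		return nil
	}, "ReadContext (simulated)", 10*time.Millisecond)
	fmt.Println(calls, err)
}
```

A fixed delay keeps the sketch short; a real implementation would probably want backoff and a context for cancellation.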

Contributor

Though, @alexott, I'm curious whether adding things like WithMaxRetries or WithRetryOnErrorSubstrings would satisfy your needs.

Contributor Author

Yep, let's discuss whether we can make something generic...
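
For readers wondering what options such as `WithMaxRetries` or `WithRetryOnErrorSubstrings` might look like, here is a hypothetical sketch using Go's functional-options pattern. The option names come from the comment above, but everything else (the `retryConfig` type, the `Run` function, the default of 3 retries) is an assumption, and none of this is the API of the SDK's retries package.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// retryConfig is a hypothetical configuration type for this sketch.
type retryConfig struct {
	maxRetries int
	substrings []string
}

// Option mutates the retry configuration.
type Option func(*retryConfig)

// WithMaxRetries sets how many times a failed call may be retried.
func WithMaxRetries(n int) Option {
	return func(c *retryConfig) { c.maxRetries = n }
}

// WithRetryOnErrorSubstrings adds error fragments that trigger a retry.
func WithRetryOnErrorSubstrings(subs ...string) Option {
	return func(c *retryConfig) { c.substrings = append(c.substrings, subs...) }
}

// matches reports whether err contains any configured substring.
func (c *retryConfig) matches(err error) bool {
	for _, s := range c.substrings {
		if strings.Contains(err.Error(), s) {
			return true
		}
	}
	return false
}

// Run calls fn once, then retries up to maxRetries more times while the
// returned error matches a configured substring. It returns the number of
// calls made and the last error. Backoff is omitted for brevity.
func Run(fn func() error, opts ...Option) (int, error) {
	cfg := retryConfig{maxRetries: 3} // assumed default
	for _, o := range opts {
		o(&cfg)
	}
	calls := 0
	for {
		err := fn()
		calls++
		if err == nil || !cfg.matches(err) || calls > cfg.maxRetries {
			return calls, err
		}
	}
}

// errBusy is a simulated backend error for the usage example.
var errBusy = errors.New("Too Many Requests")

func main() {
	calls, err := Run(func() error { return errBusy },
		WithMaxRetries(2), WithRetryOnErrorSubstrings("Too Many Requests"))
	fmt.Println(calls, err)
}
```

Returning the call count is one way to let the caller print the total number of retries, which is the capability the earlier comment says is missing.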

@mgyucht mgyucht left a comment

No problem

@mgyucht mgyucht added this pull request to the merge queue Feb 2, 2024
Merged via the queue into main with commit 3f8144f Feb 2, 2024
5 checks passed
@mgyucht mgyucht deleted the exporter-retries-under-high-load branch February 2, 2024 13:49
tanmay-db added a commit that referenced this pull request Feb 5, 2024
### New Features and Improvements
* Exporter: timestamps are now added to log entries ([#3146](#3146)).
* Validate metastore id for databricks_grant and databricks_grants resources ([#3159](#3159)).
* Exporter: Skip emitting of clusters that come from more cluster sources ([#3161](#3161)).
* Fix typo in docs ([#3166](#3166)).
* Migrate cluster schema to use the go-sdk struct ([#3076](#3076)).
* Introduce Generic Settings Resource ([#2997](#2997)).
* Update actions/setup-go to v5 ([#3154](#3154)).
* Change default branch from `master` to `main` ([#3174](#3174)).
* Add .codegen.json configuration ([#3180](#3180)).
* Exporter: performance improvements for big workspaces ([#3167](#3167)).
* update ([#3192](#3192)).
* Exporter: fix generation of cluster policy resources ([#3185](#3185)).
* Fix unit test ([#3201](#3201)).
* Suppress diff should apply to new fields added in the same chained call to CustomizableSchema ([#3200](#3200)).
* Various documentation updates ([#3198](#3198)).
* Use common.Resource consistently throughout the provider ([#3193](#3193)).
* Extending customizable schema with `AtLeastOneOf`, `ExactlyOneOf`, `RequiredWith` ([#3182](#3182)).
* Fix `databricks_connection` regression when creating without owner ([#3186](#3186)).
* add test code for job task order ([#3183](#3183)).
* Allow using empty strings as job parameters ([#3158](#3158)).
* Fix notebook parameters in acceptance test ([#3205](#3205)).
* Exporter: Add retries for `Search`, `ReadContext` and `Import` operations when importing the resource ([#3202](#3202)).
* Fixed updating owners for UC resources ([#3189](#3189)).
* Adds `databricks_volumes` as data source  ([#3150](#3150)).

### Documentation Changes

### Exporter

### Internal Changes
@tanmay-db tanmay-db mentioned this pull request Feb 5, 2024
github-merge-queue bot pushed a commit that referenced this pull request Feb 6, 2024
* Release v1.35.1


* upd

* readable

* upd

* upd
@alexott alexott added the exporter TF configuration generator label Feb 7, 2024