feat: Set unique table suffix to allow parallel incremental executions #650
The general idea of using unique tmp table location is good.
@pierrebzl I left a few comments, please have a look.
dbt/include/athena/macros/materializations/models/incremental/incremental.sql
Hi @nicor88,
@Jrmyy @svdimchenko could you assist @pierrebzl?
@pierrebzl could you lint your changes? Regarding functional testing, what you proposed seems fine. It's hard to intercept the tmp table name in the test, so I'm fine with what you proposed.
Yes, sorry for this. Should be done:
(venv) dbt-athena $ pre-commit run mypy --show-diff-on-failure --color=always --all-files
mypy.....................................................................Passed
Hi @nicor88, let me know if anything is missing to go ahead with this change. Thanks!
Overall looks good ✅
dbt/include/athena/macros/materializations/models/incremental/incremental.sql
@pierrebzl there are some conflicts in the Readme due to the latest merge, could you please resolve them?
@pierrebzl could you lint again, please? I left a tiny comment; otherwise it looks good.
Sorry for that.
I had to change the regex expression format: 4a52129#diff-58cf52251bceac0bf774fa9511051801f5c8d6b34e5af08e3af56dbe271c6df8R65-R68 as it got broken by the auto-linting. The tests should pass now.
@pierrebzl thanks for fixing. I was wondering why the CI was failing, but it was a genuine failure, and I just re-triggered it.
Yes, I think we are good now 🥳
@pierrebzl great job 💯
Thanks for your support!
This PR contains the following updates:

| Package | Update | Change |
|---|---|---|
| [dbt-athena-community](https://togithub.com/dbt-athena/dbt-athena) | minor | `==1.7.2` -> `==1.8.2` |

---

### Release Notes

<details>
<summary>dbt-athena/dbt-athena (dbt-athena-community)</summary>

### [`v1.8.2`](https://togithub.com/dbt-athena/dbt-athena/releases/tag/v1.8.2)

[Compare Source](https://togithub.com/dbt-athena/dbt-athena/compare/v1.8.1...v1.8.2)

### What's Changed

#### Fixes

- fix: Add wait_random_exponential for query retries by [@svdimchenko](https://togithub.com/svdimchenko) in [https://github.com/dbt-athena/dbt-athena/pull/655](https://togithub.com/dbt-athena/dbt-athena/pull/655)
- fix: Resolve error when cloning Python models ([#645](https://togithub.com/dbt-athena/dbt-athena/issues/645)) by [@jeancochrane](https://togithub.com/jeancochrane) in [https://github.com/dbt-athena/dbt-athena/pull/651](https://togithub.com/dbt-athena/dbt-athena/pull/651)
- fix: Fixed table_type for GOVERNED tables by [@svdimchenko](https://togithub.com/svdimchenko) in [https://github.com/dbt-athena/dbt-athena/pull/661](https://togithub.com/dbt-athena/dbt-athena/pull/661)

#### Features

- feat: Set unique table suffix to allow parallel incremental executions by [@pierrebzl](https://togithub.com/pierrebzl) in [https://github.com/dbt-athena/dbt-athena/pull/650](https://togithub.com/dbt-athena/dbt-athena/pull/650)
- feat: Allow custom schema def for tmp tables generated by incremental by [@pierrebzl](https://togithub.com/pierrebzl) in [https://github.com/dbt-athena/dbt-athena/pull/659](https://togithub.com/dbt-athena/dbt-athena/pull/659)
- feat: Implement iceberg retry logic by [@svdimchenko](https://togithub.com/svdimchenko) in [https://github.com/dbt-athena/dbt-athena/pull/657](https://togithub.com/dbt-athena/dbt-athena/pull/657)

#### Dependencies

- chore: Update moto requirement from ~=5.0.7 to ~=5.0.8 by [@dependabot](https://togithub.com/dependabot) in [https://github.com/dbt-athena/dbt-athena/pull/660](https://togithub.com/dbt-athena/dbt-athena/pull/660)
- chore: Bumped version to 1.8.2 for release by [@svdimchenko](https://togithub.com/svdimchenko) in [https://github.com/dbt-athena/dbt-athena/pull/663](https://togithub.com/dbt-athena/dbt-athena/pull/663)

#### Docs

- docs: Cleanup README grammar, punctuation, and capitalisation by [@dfsnow](https://togithub.com/dfsnow) in [https://github.com/dbt-athena/dbt-athena/pull/654](https://togithub.com/dbt-athena/dbt-athena/pull/654)

#### New Contributors

- [@jeancochrane](https://togithub.com/jeancochrane) made their first contribution in [https://github.com/dbt-athena/dbt-athena/pull/651](https://togithub.com/dbt-athena/dbt-athena/pull/651)
- [@pierrebzl](https://togithub.com/pierrebzl) made their first contribution in [https://github.com/dbt-athena/dbt-athena/pull/650](https://togithub.com/dbt-athena/dbt-athena/pull/650)

**Full Changelog**: dbt-labs/dbt-athena@v1.8.1...v1.8.2

### [`v1.8.1`](https://togithub.com/dbt-athena/dbt-athena/releases/tag/v1.8.1)

[Compare Source](https://togithub.com/dbt-athena/dbt-athena/compare/v1.7.2...v1.8.1)

#### What's Changed

##### Relevant notes

⚠️ 1.8.1 version is equivalent to 1.8.0 in term of features and fixes.

You can install the changes from this release via `pip install dbt-athena-community==1.8.1`

##### Features

- feat: Add column meta to glue column parameters by [@SoumayaMauthoorMOJ](https://togithub.com/SoumayaMauthoorMOJ) in [https://github.com/dbt-athena/dbt-athena/pull/644](https://togithub.com/dbt-athena/dbt-athena/pull/644)

##### Dependencies

- chore: Update moto requirement from ~=5.0.6 to ~=5.0.7 by [@dependabot](https://togithub.com/dependabot) in [https://github.com/dbt-athena/dbt-athena/pull/648](https://togithub.com/dbt-athena/dbt-athena/pull/648)

##### Docs

- docs: Cleanup Python models section of README by [@dfsnow](https://togithub.com/dfsnow) in [https://github.com/dbt-athena/dbt-athena/pull/643](https://togithub.com/dbt-athena/dbt-athena/pull/643)

#### New Contributors

- [@dfsnow](https://togithub.com/dfsnow) made their first contribution in [https://github.com/dbt-athena/dbt-athena/pull/643](https://togithub.com/dbt-athena/dbt-athena/pull/643)

</details>

---

### Configuration

📅 **Schedule**: Branch creation - "before 4am on the first day of the month" (UTC), Automerge - At any time (no schedule defined).

🚦 **Automerge**: Disabled by config. Please merge this manually once you are satisfied.

♻ **Rebasing**: Whenever PR becomes conflicted, or you tick the rebase/retry checkbox.

🔕 **Ignore**: Close this PR and you won't be reminded about this update again.

---

- [ ] <!-- rebase-check -->If you want to rebase/retry this PR, check this box

---

This PR has been generated by [Renovate Bot](https://togithub.com/renovatebot/renovate).
Description
For some specific cases (e.g. backfilling a very large amount of data), we need to execute multiple `dbt build` runs of a specific incremental model in parallel, passing the date (or batch number) as a var argument. For example, we have a model that we run every hour using Airflow, to which we pass a date relative to the Airflow scheduler.
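To make the backfill scenario concrete, here is a minimal sketch of how one `dbt build` command per hourly batch could be generated. The model name `hourly_events`, the var name `run_date`, and the helper `backfill_commands` are hypothetical and not taken from this PR; only `--select` and `--vars` are standard dbt CLI options.

```python
import json
import shlex
from datetime import datetime, timedelta


def backfill_commands(model: str, start: datetime, hours: int) -> list[str]:
    """Build one `dbt build` command per hourly batch (illustrative helper)."""
    commands = []
    for offset in range(hours):
        run_date = start + timedelta(hours=offset)
        # dbt parses --vars as YAML; a JSON object is valid YAML.
        # `run_date` is a hypothetical var name the model would read.
        dbt_vars = json.dumps({"run_date": run_date.isoformat()})
        commands.append(
            f"dbt build --select {shlex.quote(model)} --vars {shlex.quote(dbt_vars)}"
        )
    return commands


cmds = backfill_commands("hourly_events", datetime(2024, 1, 1), 3)
for cmd in cmds:
    print(cmd)
```

Each command could then be handed to a separate Airflow task so that the batches run concurrently against the same incremental model.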
If we want to process batches of N hours in parallel using Airflow concurrency, the tmp table created by each dbt run needs to be unique. Otherwise, you end up with N inserts attempting to run with the same `__dbt_tmp` name, which creates conflicts and ultimately failures. A similar issue is open here: Tomme/dbt-athena#62
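The naming conflict and the idea behind the fix can be sketched in a few lines of Python. This is an illustration only, not the adapter's implementation: the helper `tmp_table_name`, its `unique` flag, and the `__tmp_` prefix are assumptions made for the example; only the `__dbt_tmp` suffix comes from the description above.

```python
import uuid

DEFAULT_TMP_SUFFIX = "__dbt_tmp"  # shared suffix mentioned in the description


def tmp_table_name(model_name: str, unique: bool = False) -> str:
    """Return the temporary table name for one incremental run (hypothetical helper)."""
    if not unique:
        # Every concurrent run of the same model gets the same name.
        return f"{model_name}{DEFAULT_TMP_SUFFIX}"
    # uuid4().hex contains only [0-9a-f], so it is safe inside an identifier.
    return f"{model_name}__tmp_{uuid.uuid4().hex}"


# Four parallel runs of the same model:
shared = {tmp_table_name("hourly_events") for _ in range(4)}
distinct = {tmp_table_name("hourly_events", unique=True) for _ in range(4)}
print(len(shared), len(distinct))
```

With the shared suffix all four runs target a single table name, while the per-run suffix gives each run its own tmp table, so the inserts no longer collide.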
Models used to test - Optional
Checklist
Second (incremental) run model output: