-
Notifications
You must be signed in to change notification settings - Fork 4.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
🎉 Refactor Normalization docker images and upgrade to use dbt 0.21.0 #6959
Conversation
airbyte-workers/src/main/java/io/airbyte/workers/normalization/NormalizationRunnerFactory.java
Outdated
Show resolved
Hide resolved
I'm also trying to fix Oracle integration tests (they are currently disabled)... |
airbyte-integrations/bases/base-normalization/integration_tests/test_normalization.py
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Awesome, lgtm! I think this makes a lot of sense given different dependencies between dests and the improved dx will be 👌
What's the reasoning behind the choices of mssql, mysql & oracle to be the ones to separate out? Is intended goal to move all destinations toward this structure within normalization?
@@ -90,9 +93,11 @@ if(!System.getenv().containsKey("SUB_BUILD") || System.getenv().get("SUB_BUILD") | |||
include ':airbyte-integrations:connectors:destination-oracle' | |||
include ':airbyte-integrations:connectors:destination-mssql' | |||
|
|||
//Needed by destination-bugquery |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like to think this was on purpose 😆
if DestinationType.POSTGRES.value not in destinations_to_test: | ||
destinations_to_test.append(DestinationType.POSTGRES.value) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what's the reason for enforcing postgres like this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
some test functions in that class are explicitly requiring postgres destination (but omit tests on other destinations)
airbyte/airbyte-integrations/bases/base-normalization/integration_tests/test_ephemeral.py
Line 80 in 62826f8
run_test(DestinationType.POSTGRES, 1) |
normalization for "regular" destinations is based on dbt's core "mono-repo" so we don't actually need to produce multiple docker images there, the code base is the same with the macros making the slight tweaks in syntax. Unfortunately for mssql, mysql and oracle, those are community contributed adapters to dbt, so they each live in a separate git repo and have their own dbt version dependency which may be different from each other, and not always in sync with core dbt...
|
/test connector=connectors/destination-postgres
|
/test connector=bases/base-normalization
|
…irbytehq#6959) * Split normalization docker images for some connectors with specifics dependencies * Regenerate (airbytehq#7003)
What
How
Describe the solution
Closes #2054
Closes #6872
Recommended reading order
Build related changes:
airbyte-workers/src/main/java/io/airbyte/workers/normalization/NormalizationRunnerFactory.java
airbyte-workers/src/main/java/io/airbyte/workers/normalization/DefaultNormalizationRunner.java
airbyte-integrations/bases/base-normalization/build.gradle
settings.gradle
docker-compose.build.yaml
airbyte-integrations/bases/base-normalization/Dockerfile
DX with normalization tests:
airbyte-integrations/bases/base-normalization/README.md
airbyte-integrations/bases/base-normalization/integration_tests/dbt_integration_test.py
airbyte-integrations/bases/base-normalization/integration_tests/test_normalization.py
airbyte-integrations/bases/base-normalization/integration_tests/test_ephemeral.py
Context
Docker images for normalization
We currently have additional packages to install inside normalization docker image due to:
Some of these destinations have a hard dependency on a certain dbt version (0.19.0) and can't be upgraded to more recent releases.
However, we do want to support the latest dbt versions as they come.
As the number of supported destinations for normalization grows, we may run into dbt versions conflicts.
So this PR produces a dedicated docker image for each "custom" destination (outside of dbt core support).
Docker image version
Instead of publishing a new docker image at each normalization PR, we can publish normalization images when releasing Airbyte versions. (additionally, we can run normalization in dev mode without change to codebase now)
Integration tests stability & useability
When running integration tests, we noticed flakiness in results. Maube due to concurrent runs by multiple people/CI jobs.
So a random schema where to run tests should better isolate tests.
It is now easier to select which part of normalization tests to run per destination via an env variable