✨Switch redshift staging to async mode #28619

Merged
merged 78 commits into from
Aug 14, 2023

@benmoriceau benmoriceau commented Jul 24, 2023

What

Switch the redshift staging to the async framework.

How

Use the createAsync method instead of create to obtain the AirbyteMessage consumer. This PR also adds the ability to specify a different optimal flush size for the async flush. Although this is not needed to release Redshift, it was used for some tests and is kept in this PR.
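
To illustrate the general idea (buffer records, then flush asynchronously once a configurable optimal size is reached), here is a minimal, self-contained sketch. The class `AsyncBatchingConsumer`, its constructor parameters, and the `flusher` callback are hypothetical names chosen for illustration; this is not the Airbyte `createAsync` API or the actual `AsyncFlush` implementation, and it assumes a single producer thread calling `accept`.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.function.Consumer;

// Hypothetical illustration only; not the Airbyte async framework.
public class AsyncBatchingConsumer implements AutoCloseable {

  private final long optimalBatchSizeBytes;      // configurable flush threshold (assumption)
  private final Consumer<List<String>> flusher;  // stands in for "upload batch to staging"
  private final ExecutorService executor = Executors.newFixedThreadPool(2);
  private final List<CompletableFuture<Void>> inFlight = new ArrayList<>();
  private List<String> buffer = new ArrayList<>();
  private long bufferedBytes = 0;

  public AsyncBatchingConsumer(long optimalBatchSizeBytes, Consumer<List<String>> flusher) {
    this.optimalBatchSizeBytes = optimalBatchSizeBytes;
    this.flusher = flusher;
  }

  // Buffer a record; hand the batch to a worker thread once the threshold is reached.
  public void accept(String record) {
    buffer.add(record);
    bufferedBytes += record.getBytes(StandardCharsets.UTF_8).length;
    if (bufferedBytes >= optimalBatchSizeBytes) {
      flushAsync();
    }
  }

  // Swap in a fresh buffer so the caller can keep writing while the old batch is flushed.
  private void flushAsync() {
    final List<String> batch = buffer;
    buffer = new ArrayList<>();
    bufferedBytes = 0;
    inFlight.add(CompletableFuture.runAsync(() -> flusher.accept(batch), executor));
  }

  // Flush the remainder and wait for every in-flight batch before returning.
  @Override
  public void close() {
    if (!buffer.isEmpty()) {
      flushAsync();
    }
    CompletableFuture.allOf(inFlight.toArray(new CompletableFuture[0])).join();
    executor.shutdown();
  }

  public static void main(String[] args) {
    List<List<String>> flushed = new CopyOnWriteArrayList<>();
    try (AsyncBatchingConsumer consumer = new AsyncBatchingConsumer(10, flushed::add)) {
      for (int i = 0; i < 5; i++) {
        consumer.accept("rec-" + i); // each record is 5 bytes in UTF-8
      }
    }
    int total = flushed.stream().mapToInt(List::size).sum();
    System.out.println(flushed.size() + " batches, " + total + " records");
  }
}
```

In this sketch a smaller threshold produces more, smaller batches, while a larger one holds more data in memory before each flush. That is the kind of trade-off a configurable optimal flush size lets a destination tune.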

Recommended reading order

  1. RedshiftStagingS3Destination.java
  2. remaining files are metadata.

@github-actions

Before Merging a Connector Pull Request

Wow! What a great pull request you have here! 🎉

To merge this PR, ensure the following has been done/considered for each connector added or updated:

  • PR name follows PR naming conventions
  • Breaking changes are considered. If a Breaking Change is being introduced, ensure an Airbyte engineer has created a Breaking Change Plan.
  • Connector version has been incremented in the Dockerfile and metadata.yaml according to our Semantic Versioning for Connectors guidelines
  • You've updated the connector's metadata.yaml file any other relevant changes, including a breakingChanges entry for major version bumps. See metadata.yaml docs
  • Secrets in the connector's spec are annotated with airbyte_secret
  • All documentation files are up to date. (README.md, bootstrap.md, docs.md, etc...)
  • Changelog updated in docs/integrations/<source or destination>/<name>.md with an entry for the new version. See changelog example
  • Migration guide updated in docs/integrations/<source or destination>/<name>-migrations.md with an entry for the new version, if the version is a breaking change. See migration guide example
  • If set, you've ensured the icon is present in the platform-internal repo. (Docs)

If the checklist is complete, but the CI check is failing,

  1. Check for hidden checklists in your PR description

  2. Toggle the github label checklist-action-run on/off to re-run the checklist CI.

@benmoriceau

benmoriceau commented Jul 24, 2023

/legacy-test connector=destination-redshift

🕑 destination-redshift https://github.com/airbytehq/airbyte/actions/runs/5647455583
❌ destination-redshift https://github.com/airbytehq/airbyte/actions/runs/5647455583
🐛 https://gradle.com/s/4vsvz6oqczdm4

Build Failed

Test summary info:

Could not find result summary

@benmoriceau

benmoriceau commented Jul 26, 2023

/legacy-test connector=destination-redshift

🕑 destination-redshift https://github.com/airbytehq/airbyte/actions/runs/5672127756
❌ destination-redshift https://github.com/airbytehq/airbyte/actions/runs/5672127756
🐛

@benmoriceau benmoriceau changed the title from "Switch redshif to asyn mode" to "Switch redshift to asyn mode" Jul 27, 2023
@octavia-squidington-iii

destination-redshift test report (commit d3357da880) - ❌

⏲️ Total pipeline duration: 03mn03s

Step Result
Validate airbyte-integrations/connectors/destination-redshift/metadata.yaml
Connector version semver check
Connector version increment check
QA checks
Build connector tar

🔗 View the logs here

☁️ View runs for commit in Dagger Cloud

Please note that tests are only run on PR ready for review. Please set your PR to draft mode to not flood the CI engine and upstream service on following commits.
You can run the same pipeline locally on this branch with the airbyte-ci tool with the following command

airbyte-ci connectors --name=destination-redshift test

@octavia-squidington-iii

destination-redshift test report (commit 78a70dbd58) - ❌

⏲️ Total pipeline duration: 24mn23s

Step Result
Validate airbyte-integrations/connectors/destination-redshift/metadata.yaml
Connector version semver check
Connector version increment check
QA checks
Build connector tar
Build destination-redshift docker image for platform linux/x86_64
Build airbyte/normalization-redshift:dev
./gradlew :airbyte-integrations:connectors:destination-redshift:integrationTest

@octavia-squidington-iii

destination-redshift test report (commit 7de0a1bebc) - ❌

⏲️ Total pipeline duration: 25mn50s

Step Result
Validate airbyte-integrations/connectors/destination-redshift/metadata.yaml
Connector version semver check
Connector version increment check
QA checks
Build connector tar
Build destination-redshift docker image for platform linux/x86_64
Build airbyte/normalization-redshift:dev
./gradlew :airbyte-integrations:connectors:destination-redshift:integrationTest

@octavia-squidington-iii

destination-redshift test report (commit e2b54dda80) - ❌

⏲️ Total pipeline duration: 22mn29s

Step Result
Validate airbyte-integrations/connectors/destination-redshift/metadata.yaml
Connector version semver check
Connector version increment check
QA checks
Build connector tar
Build destination-redshift docker image for platform linux/x86_64
Build airbyte/normalization-redshift:dev
./gradlew :airbyte-integrations:connectors:destination-redshift:integrationTest

@octavia-squidington-iii

destination-redshift test report (commit 0fa168cd47) - ❌

⏲️ Total pipeline duration: 81mn12s

Step Result
Validate airbyte-integrations/connectors/destination-redshift/metadata.yaml
Connector version semver check
Connector version increment check
QA checks
Build connector tar
Build destination-redshift docker image for platform linux/x86_64
Build airbyte/normalization-redshift:dev
./gradlew :airbyte-integrations:connectors:destination-redshift:integrationTest

final JsonNode s3Options = findS3Options(config);
final S3DestinationConfig s3Config = getS3DestinationConfig(s3Options);
final int numberOfFileBuffers = getNumberOfFileBuffers(s3Options);

if (numberOfFileBuffers > FileBuffer.SOFT_CAP_CONCURRENT_STREAM_IN_BUFFER) {

Contributor

We can also get rid of the file buffers option since the Async framework accounts for this. I think we should do so in a follow up PR and patch version bump. @edgao any preference?

Contributor Author

If Edward is ok, I can do it after merging this

@edgao edgao Aug 14, 2023

followup pr works for me!

…a/io/airbyte/integrations/destination/staging/AsyncFlush.java

Co-authored-by: Davin Chia <davinchia@gmail.com>
@benmoriceau benmoriceau requested review from edgao and davinchia August 14, 2023 20:25

@edgao edgao left a comment

approving with failing checks because tests are probably just busted on CI. E.g. testIncrementalSyncWithNormalizationDropOneColumn failed here, but passed locally for benoit.

@benmoriceau

/approve-and-merge reason="local test works"

@octavia-approvington

What are we doing again?
merge and squash

@octavia-approvington octavia-approvington merged commit 8d19017 into master Aug 14, 2023
@octavia-approvington octavia-approvington deleted the bmoric/async-redshift branch August 14, 2023 20:53
@octavia-squidington-iii

destination-redshift test report (commit 2da179d22c) - ❌

⏲️ Total pipeline duration: 53mn27s

Step Result
Validate airbyte-integrations/connectors/destination-redshift/metadata.yaml
Connector version semver check
Connector version increment check
QA checks
Build connector tar
Build destination-redshift docker image for platform linux/x86_64
Build airbyte/normalization-redshift:dev
./gradlew :airbyte-integrations:connectors:destination-redshift:integrationTest

@octavia-squidington-iii

destination-redshift test report (commit 3dd8a40b3d) - ❌

⏲️ Total pipeline duration: 55mn46s

Step Result
Validate airbyte-integrations/connectors/destination-redshift/metadata.yaml
Connector version semver check
Connector version increment check
QA checks
Build connector tar
Build destination-redshift docker image for platform linux/x86_64
Build airbyte/normalization-redshift:dev
./gradlew :airbyte-integrations:connectors:destination-redshift:integrationTest

harrytou pushed a commit to KYVENetwork/airbyte that referenced this pull request Sep 1, 2023
* Async snowflake

* Use async in destination implenentation

* Format

* Switch redshif to asyn mode

* Remove old unused consumer creation

* Add new version

* Fix non staging mode

* Change switcing to use the get serialized consumer

* Automated Commit - Format and Process Resources Changes

* Test

* Automated Commit - Format and Process Resources Changes

* Use method

* Test smaller buffer

* Test smaller buffer for redshift

* Automated Commit - Format and Process Resources Changes

* Bigger ratio

* Remove snowflake changes

* Implement the new interface

* Automated Commit - Format and Process Resources Changes

* push ratio to 0.8

* Smaller Optimal buffer size

* Automated Commit - Format and Process Resources Changes

* Bigger buffer

* Use a buffer of 10 Mb

* Use a buffer of 75 Mb

* Test reduce lib thread

* Add flags for remote profiler.

* Part size to match the async part size

* Part size to 100 Mb

* restore default

* Try with 1 thread

* Go back to default

* Clean up

* Bump version

* Restore gradle

* Re-add vm capture

* Test reduce allowed buffer size

* Use all the memory available

* only 3 threads for the lib

* Automated Commit - Format and Process Resources Changes

* test with 1

* Automated Commit - Format and Process Resources Changes

* Add local log ling.

* Do not use all RAM for heap.

* Fix build

* Clean up

* Clean up

* Update airbyte-integrations/bases/bases-destination-jdbc/src/main/java/io/airbyte/integrations/destination/staging/AsyncFlush.java

Co-authored-by: Davin Chia <davinchia@gmail.com>

* Automated Commit - Format and Process Resources Changes

---------

Co-authored-by: Davin Chia <davinchia@gmail.com>
Co-authored-by: benmoriceau <benmoriceau@users.noreply.github.com>
6 participants