Destination Azure Blob Storage: Added BufferedOutputStream to fix block count issue and improve performance #9190
Branch force-pushed from 1e266f0 to 88f0311.
…d printwriter to disable autoflush as well.
…is pulled for backwards compatibility. Added title for account key.
Branch force-pushed from 88f0311 to ddcd67c.
Thanks for your contribution @bmatticus! Just added a few comments to clarify.
cc: @marcosmarxm
airbyte-integrations/connectors/destination-azure-blob-storage/src/main/resources/spec.json
Hi @bmatticus, thanks for this improvement! @etsybaev, can I ask you to review this please?
@bmatticus could you please bump the connector version in these files too:
airbyte-config/init/src/main/resources/seed/destination_definitions.yaml
airbyte-config/init/src/main/resources/config/STANDARD_DESTINATION_DEFINITION/b4c5d105-31fd-4817-96b6-cb923bfc04cb.json
I confirm acceptance tests are passing 👏
Co-authored-by: Augustin <augustin@airbyte.io>
/test connector=connectors/destination-azure-blob-storage
I thought the issue might be caused by merging from the forked branch, so I created the same branch in the Airbyte repo and ran the tests one more time. Link to tests run:
Yeah, looking at the branch there, it still has the old default. I'll double-check, but I did a fresh pull on my branch and tested it a short while ago, and it passed. I assume it's a rollover here, though. The NPEs appear to be associated with passing tests, so without digging in further I can only assume they are expected. Those tests appear to exercise invalid keys and similar cases, so this may be expected behavior.
Actually, it appears the test cases may be invalid for those: some of them have had a default storage config added for tests, and these have not. This is not new, but I'm happy to update them here.
I pulled the latest code and merged master into my side branch (#9289). I confirm acceptance tests are passing. @etsybaev, let me know if @bmatticus's changes look good to you and I'll publish the connector and merge.
LGTM from a static code check
@alafanechere approved, thanks
@misteryeo / @sherifnada, do you mind doing a final review, as this PR changes this connector's spec? On my side, I confirm acceptance tests are passing and the image was published, so it is ready to merge.
Approval specifically for the spec.json modifications, didn't look at the code
airbyte-integrations/connectors/destination-azure-blob-storage/src/main/resources/spec.json
…/src/main/resources/spec.json Co-authored-by: Sherif A. Nada <snadalive@gmail.com>
Thank you for your contribution, @bmatticus! I published the connectors; it's now ready to use.
What
Fix for issue 5980: the Azure Blob Storage destination crashes after 50,000 blocks.
How
Altered the PrintWriter in both the CSV and JSONL writers to disable autoflushing, and wrapped the blob's OutputStream in a BufferedOutputStream, as recommended by Azure. Added a setting to the spec to make the buffer size adjustable; it defaults to 100MB for backwards compatibility. Also added the title for the account key, which was missing from the current version of the spec.
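To illustrate the effect of this change, here is a minimal sketch. It is not the connector's actual code. It assumes a hypothetical `CountingOutputStream` standing in for the blob's output stream, and it assumes that every write or flush reaching that stream may become a separate block on the Azure side. All class and method names below are made up for the example.

```java
import java.io.BufferedOutputStream;
import java.io.OutputStream;
import java.io.OutputStreamWriter;
import java.io.PrintWriter;
import java.nio.charset.StandardCharsets;

class BufferedWriteDemo {

    /** Hypothetical stand-in for the blob stream: counts what actually reaches it. */
    static final class CountingOutputStream extends OutputStream {
        long writeCalls;
        long flushCalls;
        long bytes;

        @Override
        public void write(int b) {
            writeCalls++;
            bytes++;
        }

        @Override
        public void write(byte[] b, int off, int len) {
            writeCalls++;
            bytes += len;
        }

        @Override
        public void flush() {
            flushCalls++;
        }
    }

    /** Writes one JSON line per record and returns the counters of the underlying sink. */
    static CountingOutputStream writeRecords(int records, boolean buffered,
                                             boolean autoFlush, int bufferSize) {
        CountingOutputStream sink = new CountingOutputStream();
        // Optionally place a BufferedOutputStream between the writer and the sink.
        OutputStream out = buffered ? new BufferedOutputStream(sink, bufferSize) : sink;
        try (PrintWriter writer = new PrintWriter(
                new OutputStreamWriter(out, StandardCharsets.UTF_8), autoFlush)) {
            for (int i = 0; i < records; i++) {
                // With autoFlush enabled, println flushes down to the sink on every record.
                writer.println("{\"id\":" + i + "}");
            }
        }
        return sink;
    }

    public static void main(String[] args) {
        CountingOutputStream before = writeRecords(10_000, false, true, 64 * 1024);
        CountingOutputStream after = writeRecords(10_000, true, false, 64 * 1024);
        System.out.println("autoflush, unbuffered: writes=" + before.writeCalls
                + " flushes=" + before.flushCalls + " bytes=" + before.bytes);
        System.out.println("no autoflush, buffered: writes=" + after.writeCalls
                + " flushes=" + after.flushCalls + " bytes=" + after.bytes);
    }
}
```

Running `main` prints the counters for both configurations. The buffered, non-autoflushing writer should reach the underlying stream far less often for the same number of bytes, which is why it stays well away from a per-blob block limit.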
🚨 User Impact 🚨
No impact expected; I tested with existing connections on an instance to be sure they still worked as expected. There will be a slight increase in memory footprint, as the default buffer is 100MB. The impact may vary with the implementation and with how many sources are being pulled at a time.
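As a rough way to reason about the memory impact, the sketch below multiplies the buffer size by the number of streams open at once. It assumes each open stream holds its own full-size buffer, which is an assumption for illustration and not something confirmed in this PR. The class and method names are hypothetical.

```java
class BufferMemoryEstimate {

    /**
     * Estimates buffer memory in bytes, assuming one buffer of bufferSizeMb
     * per concurrently open stream (an assumption, not connector behavior).
     */
    static long estimateBufferBytes(int bufferSizeMb, int concurrentStreams) {
        return (long) bufferSizeMb * 1024L * 1024L * concurrentStreams;
    }

    public static void main(String[] args) {
        // Default 100MB buffer with four streams written at the same time.
        System.out.println(estimateBufferBytes(100, 4) + " bytes");
    }
}
```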
IntegrationTest Results
Pre-merge Checklist
Expand the relevant checklist and delete the others.
Updating a connector
Community member or Airbyter
airbyte_secret
./gradlew :airbyte-integrations:connectors:<name>:integrationTest
README.md
bootstrap.md. See description and examples
docs/integrations/<source or destination>/<name>.md including changelog. See changelog example
Airbyter
If this is a community PR, the Airbyte engineer reviewing this PR is responsible for the below items.
/test connector=connectors/<name> command is passing.
/publish command described here