Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add async blob read and download support using multiple streams #9592

Merged

Conversation

kotwanikunal
Copy link
Member

Description

Implementation details

The request flow can be visualized using the following diagram:
Flow-MultiPart-FlowChart

Related Issues

Partially resolves #9031

Check List

  • New functionality includes testing.
    • All tests pass
  • New functionality has been documented.
    • New functionality has javadoc added
  • Commits are signed per the DCO using --signoff
  • Commit changes are listed out in CHANGELOG.md file (See: Changelog)

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

@github-actions
Copy link
Contributor

Compatibility status:

Checks if related components are compatible with change 6a7e26c

Incompatible components

Skipped components

Compatible components

@github-actions
Copy link
Contributor

Gradle Check (Jenkins) Run Completed with:

@kotwanikunal kotwanikunal force-pushed the multipart-downloads branch 2 times, most recently from 1250157 to 62fbe0d Compare August 28, 2023 23:00
@github-actions
Copy link
Contributor

Gradle Check (Jenkins) Run Completed with:

@kotwanikunal
Copy link
Member Author

Gradle Check (Jenkins) Run Completed with:

Force pushed in quick succession.

@github-actions
Copy link
Contributor

Compatibility status:

Checks if related components are compatible with change f77ed82

Incompatible components

Incompatible components: [https://github.com/opensearch-project/cross-cluster-replication.git]

Skipped components

Compatible components

Compatible components: [https://github.com/opensearch-project/security.git, https://github.com/opensearch-project/alerting.git, https://github.com/opensearch-project/index-management.git, https://github.com/opensearch-project/sql.git, https://github.com/opensearch-project/anomaly-detection.git, https://github.com/opensearch-project/job-scheduler.git, https://github.com/opensearch-project/asynchronous-search.git, https://github.com/opensearch-project/observability.git, https://github.com/opensearch-project/common-utils.git, https://github.com/opensearch-project/k-nn.git, https://github.com/opensearch-project/reporting.git, https://github.com/opensearch-project/geospatial.git, https://github.com/opensearch-project/ml-commons.git, https://github.com/opensearch-project/notifications.git, https://github.com/opensearch-project/performance-analyzer.git, https://github.com/opensearch-project/performance-analyzer-rca.git, https://github.com/opensearch-project/neural-search.git, https://github.com/opensearch-project/security-analytics.git, https://github.com/opensearch-project/opensearch-oci-object-storage.git]

@github-actions
Copy link
Contributor

Compatibility status:

Checks if related components are compatible with change f85700b

Incompatible components

Incompatible components: [https://github.com/opensearch-project/cross-cluster-replication.git]

Skipped components

Compatible components

Compatible components: [https://github.com/opensearch-project/security.git, https://github.com/opensearch-project/alerting.git, https://github.com/opensearch-project/index-management.git, https://github.com/opensearch-project/anomaly-detection.git, https://github.com/opensearch-project/sql.git, https://github.com/opensearch-project/job-scheduler.git, https://github.com/opensearch-project/asynchronous-search.git, https://github.com/opensearch-project/observability.git, https://github.com/opensearch-project/common-utils.git, https://github.com/opensearch-project/k-nn.git, https://github.com/opensearch-project/reporting.git, https://github.com/opensearch-project/geospatial.git, https://github.com/opensearch-project/ml-commons.git, https://github.com/opensearch-project/notifications.git, https://github.com/opensearch-project/performance-analyzer.git, https://github.com/opensearch-project/performance-analyzer-rca.git, https://github.com/opensearch-project/neural-search.git, https://github.com/opensearch-project/security-analytics.git, https://github.com/opensearch-project/opensearch-oci-object-storage.git]

@github-actions
Copy link
Contributor

Gradle Check (Jenkins) Run Completed with:

  • RESULT: UNSTABLE ❕
  • TEST FAILURES:
      2 org.opensearch.search.SearchTimeoutIT.testSimpleTimeout {p0={"search.concurrent_segment_search.enabled":"false"}}
      1 org.opensearch.search.SearchTimeoutIT.testSimpleTimeout {p0={"search.concurrent_segment_search.enabled":"true"}}

@github-actions
Copy link
Contributor

Gradle Check (Jenkins) Run Completed with:

@kotwanikunal
Copy link
Member Author

The connection to jenkins went down.

@github-actions
Copy link
Contributor

Compatibility status:

Checks if related components are compatible with change 3ebea21

Incompatible components

Incompatible components: [https://github.com/opensearch-project/cross-cluster-replication.git]

Skipped components

Compatible components

Compatible components: [https://github.com/opensearch-project/geospatial.git, https://github.com/opensearch-project/security.git, https://github.com/opensearch-project/notifications.git, https://github.com/opensearch-project/neural-search.git, https://github.com/opensearch-project/index-management.git, https://github.com/opensearch-project/security-analytics.git, https://github.com/opensearch-project/sql.git, https://github.com/opensearch-project/job-scheduler.git, https://github.com/opensearch-project/observability.git, https://github.com/opensearch-project/opensearch-oci-object-storage.git, https://github.com/opensearch-project/k-nn.git, https://github.com/opensearch-project/alerting.git, https://github.com/opensearch-project/anomaly-detection.git, https://github.com/opensearch-project/performance-analyzer.git, https://github.com/opensearch-project/asynchronous-search.git, https://github.com/opensearch-project/ml-commons.git, https://github.com/opensearch-project/performance-analyzer-rca.git, https://github.com/opensearch-project/common-utils.git, https://github.com/opensearch-project/reporting.git]

@github-actions
Copy link
Contributor

Gradle Check (Jenkins) Run Completed with:

  • RESULT: UNSTABLE ❕
  • TEST FAILURES:
      2 org.opensearch.index.shard.RemoteStoreRefreshListenerTests.classMethod
      1 org.opensearch.index.shard.RemoteStoreRefreshListenerTests.testReplicaPromotion
      1 org.opensearch.index.shard.RemoteIndexShardTests.testNRTReplicaWithRemoteStorePromotedAsPrimaryRefreshCommit

Signed-off-by: Kunal Kotwani <kkotwani@amazon.com>
@github-actions
Copy link
Contributor

Compatibility status:

Checks if related components are compatible with change 39c3920

Incompatible components

Incompatible components: [https://github.com/opensearch-project/cross-cluster-replication.git]

Skipped components

Compatible components

Compatible components: [https://github.com/opensearch-project/security.git, https://github.com/opensearch-project/alerting.git, https://github.com/opensearch-project/index-management.git, https://github.com/opensearch-project/anomaly-detection.git, https://github.com/opensearch-project/sql.git, https://github.com/opensearch-project/job-scheduler.git, https://github.com/opensearch-project/asynchronous-search.git, https://github.com/opensearch-project/observability.git, https://github.com/opensearch-project/common-utils.git, https://github.com/opensearch-project/k-nn.git, https://github.com/opensearch-project/reporting.git, https://github.com/opensearch-project/geospatial.git, https://github.com/opensearch-project/ml-commons.git, https://github.com/opensearch-project/notifications.git, https://github.com/opensearch-project/performance-analyzer.git, https://github.com/opensearch-project/neural-search.git, https://github.com/opensearch-project/performance-analyzer-rca.git, https://github.com/opensearch-project/security-analytics.git, https://github.com/opensearch-project/opensearch-oci-object-storage.git]

@github-actions
Copy link
Contributor

github-actions bot commented Sep 1, 2023

Gradle Check (Jenkins) Run Completed with:

@kotwanikunal kotwanikunal merged commit 6765b16 into opensearch-project:main Sep 1, 2023
12 checks passed
Comment on lines +65 to +69
int bytesRead;
while ((bytesRead = inputStream.read(buffer)) != -1) {
Channels.writeToChannel(buffer, 0, bytesRead, outputFileChannel, streamOffset);
streamOffset += bytesRead;
}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Assuming this isn't doing DIRECT_IO, it might thrash page cache for encrypted data transfers? Can we instead use FileChannel#transferFrom

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

transferFrom does not support repositioning the channel beyond the file length. With multi threaded operation and a new file, we do not have the complete file at hand. We might have to resort using individual files and stitching logic in that case.

@kotwanikunal kotwanikunal added the backport 2.x Backport to 2.x branch label Sep 1, 2023
opensearch-trigger-bot bot pushed a commit that referenced this pull request Sep 1, 2023
Signed-off-by: Kunal Kotwani <kkotwani@amazon.com>
(cherry picked from commit 6765b16)
Signed-off-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
andrross pushed a commit that referenced this pull request Sep 1, 2023
… (#9696)

(cherry picked from commit 6765b16)

Signed-off-by: Kunal Kotwani <kkotwani@amazon.com>
Signed-off-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
}
}

void processFailure(Exception e) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What happens if the JVM crashes before the clean up gets triggered? Can you please write an IT to ensure that this case doesn't lead to corruption?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good call out. Should we write to a file with a different name (i.e. add something like a ".multipart" suffix) and rename to the actual filename upon completion?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added an issue here: #9784

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @Bukhtawar and @andrross!

kaushalmahi12 pushed a commit to kaushalmahi12/OpenSearch that referenced this pull request Sep 12, 2023
…search-project#9592)

Signed-off-by: Kunal Kotwani <kkotwani@amazon.com>
Signed-off-by: Kaushal Kumar <ravi.kaushal97@gmail.com>
brusic pushed a commit to brusic/OpenSearch that referenced this pull request Sep 25, 2023
…search-project#9592)

Signed-off-by: Kunal Kotwani <kkotwani@amazon.com>
Signed-off-by: Ivan Brusic <ivan.brusic@flocksafety.com>
shiv0408 pushed a commit to Gaurav614/OpenSearch that referenced this pull request Apr 25, 2024
…search-project#9592)

Signed-off-by: Kunal Kotwani <kkotwani@amazon.com>
Signed-off-by: Shivansh Arora <hishiv@amazon.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backport 2.x Backport to 2.x branch
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[Remote Store] [Repository Download Enhancement] Implement the enhanced download mechanism
4 participants