Load ArraySchema in parallel to listing fragments #2061
Merged
Shelnutt2
merged 1 commit into
dev
from
sethshelnutt/ch5118/parallelize-loading-array-schema-and-listing
Jan 29, 2021
This pull request has been linked to Clubhouse Story #5118: Parallelize loading array schema and listing of fragments/loading of fragment metadata.
26d69c5
to
cd55764
Compare
joe-maley
reviewed
Jan 29, 2021
cd55764
to
213fe27
Compare
For `array_open_for_reads`, we can list the fragments in parallel with loading the array schema, because listing the fragments and loading the fragment metadata do not require the array schema. Loading everything in parallel can save 100-300 milliseconds of open time for S3-based arrays.
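The idea in the description can be sketched in plain C++ as follows. This is a minimal illustration, not the TileDB implementation: `ArraySchema`, `FragmentMetadata`, `OpenedArray`, `load_array_schema`, `list_and_load_fragments`, and `open_for_reads` are hypothetical stand-ins, the sleeps only simulate remote latency, and `std::async` is used in place of the project's thread pool.

```cpp
#include <chrono>
#include <future>
#include <string>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical stand-ins for the real storage-manager types.
struct ArraySchema { std::string name; };
struct FragmentMetadata { std::string uri; };
struct OpenedArray {
  ArraySchema schema;
  std::vector<FragmentMetadata> fragments;
};

// Simulates a slow remote read of the array schema.
ArraySchema load_array_schema(const std::string& array_uri) {
  std::this_thread::sleep_for(std::chrono::milliseconds(50));
  return ArraySchema{array_uri};
}

// Simulates listing fragments and loading their metadata.
// Note that it never touches the schema.
std::vector<FragmentMetadata> list_and_load_fragments(
    const std::string& array_uri) {
  std::this_thread::sleep_for(std::chrono::milliseconds(50));
  std::vector<FragmentMetadata> fragments;
  fragments.push_back(FragmentMetadata{array_uri + "/frag_1"});
  fragments.push_back(FragmentMetadata{array_uri + "/frag_2"});
  return fragments;
}

OpenedArray open_for_reads(const std::string& array_uri) {
  // Start the schema load on another thread...
  auto schema_future =
      std::async(std::launch::async, load_array_schema, array_uri);
  // ...while the calling thread lists and loads the fragments.
  auto fragments = list_and_load_fragments(array_uri);
  // get() blocks until the schema task has finished.
  return OpenedArray{schema_future.get(), std::move(fragments)};
}
```

Because the two steps are independent, the total wait in this sketch is roughly the longer of the two simulated reads rather than their sum.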
213fe27
to
1246e41
Compare
ihnorton
approved these changes
Jan 29, 2021
joe-maley
approved these changes
Jan 29, 2021
ihnorton
added a commit
that referenced
this pull request
Feb 5, 2021
Invert the tasking order for async loading established in #2061: now load the fragments in a task and the array schema on the main thread. This fixes several mutex ownership issues that surfaced on Windows.
(1) We need to wait on the async task in all return clauses. Otherwise, if the context owning the storage manager is destroyed without joining the async thread, Windows exits with an error due to attempting to destruct a locked mutex.
(2) We cannot call open_array->mtx_unlock() on the main thread after running the array_open call asynchronously on the thread pool (which locks open_array.mtx_), because locking and unlocking a mutex on different threads is undefined behavior and a runtime error on Windows.
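Point (1) of this commit message, waiting on the async task on every return path, might look like the sketch below. It is only an illustration under stated assumptions: `Status`, `WaitOnExit`, `list_fragments`, `load_schema`, and `open_for_reads` are hypothetical names, and `std::async` stands in for the project's thread pool. A future returned by `std::async` already blocks in its destructor, so the guard here mainly makes the intent explicit and would also cover futures from other sources.

```cpp
#include <future>
#include <string>
#include <vector>

enum class Status { Ok, Error };

// Hypothetical helper: waits for the task when the enclosing scope exits,
// no matter which return statement is taken.
class WaitOnExit {
 public:
  explicit WaitOnExit(std::future<std::vector<std::string>>& task)
      : task_(task) {}
  ~WaitOnExit() {
    if (task_.valid()) task_.wait();
  }
  WaitOnExit(const WaitOnExit&) = delete;
  WaitOnExit& operator=(const WaitOnExit&) = delete;

 private:
  std::future<std::vector<std::string>>& task_;
};

// Hypothetical fragment listing, run in the async task.
std::vector<std::string> list_fragments(const std::string& array_uri) {
  return {array_uri + "/frag_1", array_uri + "/frag_2"};
}

// Hypothetical schema load on the calling thread; fails for an empty URI.
Status load_schema(const std::string& array_uri, std::string* schema) {
  if (array_uri.empty()) return Status::Error;
  *schema = "schema:" + array_uri;
  return Status::Ok;
}

Status open_for_reads(const std::string& array_uri,
                      std::string* schema,
                      std::vector<std::string>* fragments) {
  // Fragments load in a task; the schema loads on the calling thread.
  auto task = std::async(std::launch::async, list_fragments, array_uri);
  WaitOnExit wait_guard(task);  // destroyed before `task` on every return

  if (load_schema(array_uri, schema) != Status::Ok)
    return Status::Error;  // early return: the guard still waits

  *fragments = task.get();  // the future is no longer valid after get()
  return Status::Ok;
}
```

Using a scope guard rather than a `wait()` call before each `return` means a return path added later cannot forget the join.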
ihnorton
added a commit
to ihnorton/TileDB
that referenced
this pull request
Feb 5, 2021
Invert the tasking order for async loading established in TileDB-Inc#2061: now load the fragments in a task and the array schema on the main thread. This fixes several mutex ownership issues that surfaced on Windows.
(1) We need to wait on the async task in all return clauses. Otherwise, if the context owning the storage manager is destroyed without joining the async thread, Windows exits with an error due to attempting to destruct a locked mutex.
(2) We cannot call open_array->mtx_unlock() on the main thread after running the array_open call asynchronously on the thread pool (which locks open_array.mtx_), because locking and unlocking a mutex on different threads is undefined behavior and a runtime error on Windows.
joe-maley
pushed a commit
that referenced
this pull request
Feb 8, 2021
* Add test for array open and free with invalid URI
* sm::array_open_for_reads: fetch fragments async instead of schema
Invert the tasking order for async loading established in #2061: now load the fragments in a task and the array schema on the main thread. This fixes several mutex ownership issues that surfaced on Windows.
(1) We need to wait on the async task in all return clauses. Otherwise, if the context owning the storage manager is destroyed without joining the async thread, Windows exits with an error due to attempting to destruct a locked mutex.
(2) We cannot call open_array->mtx_unlock() on the main thread after running the array_open call asynchronously on the thread pool (which locks open_array.mtx_), because locking and unlocking a mutex on different threads is undefined behavior and a runtime error on Windows.
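Point (2) above, keeping a mutex's lock and unlock on the same thread, can be illustrated with a small sketch. The names `OpenArray`, `load_fragments`, and `open_and_count` are hypothetical and `std::async` again stands in for the thread pool; the only claim is the general C++ rule that a `std::mutex` must be unlocked by the thread that locked it.

```cpp
#include <cstddef>
#include <functional>
#include <future>
#include <mutex>
#include <string>
#include <vector>

// Hypothetical stand-in for an open-array entry guarded by a mutex.
struct OpenArray {
  std::mutex mtx;
  std::vector<std::string> fragments;
};

// Runs on the worker thread. The lock_guard acquires and releases the
// mutex inside this function, so both operations happen on one thread.
void load_fragments(OpenArray& open_array, const std::string& array_uri) {
  std::lock_guard<std::mutex> lock(open_array.mtx);
  open_array.fragments.push_back(array_uri + "/frag_1");
}

std::size_t open_and_count(const std::string& array_uri) {
  OpenArray open_array;
  auto task = std::async(std::launch::async, load_fragments,
                         std::ref(open_array), array_uri);
  // Join before open_array, and the mutex inside it, can be destroyed.
  task.wait();
  // The calling thread takes its own lock; it never unlocks a mutex
  // that another thread locked.
  std::lock_guard<std::mutex> lock(open_array.mtx);
  return open_array.fragments.size();
}
```

Scoped locks such as `std::lock_guard` make it hard to split a lock and its unlock across threads by accident, which is the failure mode the commit message describes.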
github-actions bot
pushed a commit
that referenced
this pull request
Feb 8, 2021
* Add test for array open and free with invalid URI
* sm::array_open_for_reads: fetch fragments async instead of schema
Invert the tasking order for async loading established in #2061: now load the fragments in a task and the array schema on the main thread. This fixes several mutex ownership issues that surfaced on Windows.
(1) We need to wait on the async task in all return clauses. Otherwise, if the context owning the storage manager is destroyed without joining the async thread, Windows exits with an error due to attempting to destruct a locked mutex.
(2) We cannot call open_array->mtx_unlock() on the main thread after running the array_open call asynchronously on the thread pool (which locks open_array.mtx_), because locking and unlocking a mutex on different threads is undefined behavior and a runtime error on Windows.
joe-maley
pushed a commit
that referenced
this pull request
Feb 8, 2021
* Add test for array open and free with invalid URI
* sm::array_open_for_reads: fetch fragments async instead of schema
Invert the tasking order for async loading established in #2061: now load the fragments in a task and the array schema on the main thread. This fixes several mutex ownership issues that surfaced on Windows.
(1) We need to wait on the async task in all return clauses. Otherwise, if the context owning the storage manager is destroyed without joining the async thread, Windows exits with an error due to attempting to destruct a locked mutex.
(2) We cannot call open_array->mtx_unlock() on the main thread after running the array_open call asynchronously on the thread pool (which locks open_array.mtx_), because locking and unlocking a mutex on different threads is undefined behavior and a runtime error on Windows.
Co-authored-by: Isaiah Norton <ihnorton@users.noreply.github.com>