[WIP] Improved robustness of concurrent schema updates #3186
Codecov Report
Patch and project coverage have no change.
Additional details and impacted files
@@ Coverage Diff @@
## develop #3186 +/- ##
========================================
Coverage 15.56% 15.56%
========================================
Files 564 564
Lines 69319 69319
Branches 681 681
========================================
Hits 10791 10791
Misses 58528 58528
Still orienting, but LGTM. One question -- theoretically, race conditions are still possible since the dataset doc is not locked between `reload()` and the subsequent schema update (not sure exactly where that happens)?
Yes, theoretically there could still be issues if two processes update the schema within the few lines of code between the reload and the subsequent schema update. I'd like us to properly address this issue as per #3187; this is just an attempt to place a band-aid on the most likely case.
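To make the remaining window concrete, here is a minimal sketch in plain Python. It does not use FiftyOne or MongoDB: the dict `db`, the helpers `reload_fields` and `write_fields`, and the field names are all hypothetical stand-ins. The first function replays the bad interleaving by hand; the second shows one generic way to close the window by making reload + write a single critical section. The `threading.Lock` only works within one process, so it illustrates the idea rather than what #3187 proposes.

```python
import copy
import threading


def reload_fields(db):
    """Read a fresh copy of the stored list (hypothetical stand-in for a reload)."""
    return copy.deepcopy(db["sample_fields"])


def write_fields(db, fields):
    """Overwrite the stored list wholesale (hypothetical stand-in for a save)."""
    db["sample_fields"] = copy.deepcopy(fields)


def interleaved_update():
    """Both writers reload before either writes, so one update is lost."""
    db = {"sample_fields": ["filepath"]}
    fields_a = reload_fields(db)  # writer A reloads
    fields_b = reload_fields(db)  # writer B reloads inside A's window
    fields_a.append("ground_truth")
    write_fields(db, fields_a)
    fields_b.append("predictions")
    write_fields(db, fields_b)  # overwrites A's change
    return db["sample_fields"]


def locked_update():
    """Reload + write under one lock, so no writer can slip in between."""
    db = {"sample_fields": ["filepath"]}
    lock = threading.Lock()

    def add_field(name):
        with lock:
            fields = reload_fields(db)
            fields.append(name)
            write_fields(db, fields)

    threads = [
        threading.Thread(target=add_field, args=(name,))
        for name in ("ground_truth", "predictions")
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    # Thread order is not deterministic, so sort before returning
    return sorted(db["sample_fields"])


print(interleaved_update())  # ['filepath', 'predictions']
print(locked_update())       # ['filepath', 'ground_truth', 'predictions']
```

Reloading immediately before the write shrinks the window to a few lines, which is why the patch helps in practice, but only an atomic or locked update removes it.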
a55af08 to 244a959
* add frame cases to dynamic label tests * embedded frame label fixes * linting
Minor typo fixes
Adding merge_sample() method
* only matches * db_field bug and path fix
* don't throw error * add debug msg
…into release/v0.21.3
* only matches * db_field bug and path fix * base image sample tests * cleanup * exclude bug * rm onlyMatch * frame and dynamic tests * adding coverage * keypoints fixes * cleanup * base image sample tests * cleanup * exclude bug * rm onlyMatch * frame and dynamic tests * adding coverage * keypoints fixes * cleanup * tweaks * exclude no only matches
Closing in favor of #3308.
As explained in the `@todo`, this is a bit of a hack to specifically avoid issues when concurrently modifying a dataset's schema. However, the underlying problem still exists whenever list fields other than `DatasetDocument.sample_fields` and `DatasetDocument.frame_fields` are concurrently edited without first reloading.

I think it makes sense to go ahead and merge this particular patch because schema updates are by far the most likely case where concurrent list edits may arise, since in many workflows `dataset` objects may be held in memory for long periods of time without being reloaded, unlike samples, which are generally only loaded and modified on demand.
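The failure mode described above can be sketched with a toy model. This is an illustration only, not FiftyOne's implementation: `DatasetHandle`, the dict `db`, and the field names are hypothetical, and the "database" is just an in-process dict whose list field is overwritten wholesale on save.

```python
import copy


class DatasetHandle:
    """Toy long-lived in-memory copy of a stored document (hypothetical)."""

    def __init__(self, db):
        self._db = db
        self.reload()

    def reload(self):
        # Refresh the in-memory list from the stored document
        self.sample_fields = copy.deepcopy(self._db["sample_fields"])

    def add_field(self, name, reload_first):
        if reload_first:
            self.reload()
        self.sample_fields.append(name)
        # Save by overwriting the whole list, as a stale writer would
        self._db["sample_fields"] = copy.deepcopy(self.sample_fields)


def simulate(reload_first):
    db = {"sample_fields": ["filepath", "tags"]}
    a = DatasetHandle(db)  # both handles are loaded up front...
    b = DatasetHandle(db)  # ...and then held in memory
    a.add_field("ground_truth", reload_first)
    b.add_field("predictions", reload_first)
    return db["sample_fields"]


print(simulate(reload_first=False))
# ['filepath', 'tags', 'predictions']
print(simulate(reload_first=True))
# ['filepath', 'tags', 'ground_truth', 'predictions']
```

Without the reload, handle `b` writes back a list that never saw `ground_truth`, so that field is silently dropped. Reloading first avoids this in the sequential case shown here, and the same reasoning applies to any other list field that is edited from a stale copy.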