Table Metadata Lock by long populate calls #1030
Comments
Solutions 1 and 3 definitely seem the most desirable. Decoding, spike sorting, and LFP extraction can all potentially have long-running computations, particularly with our longer recordings.
It's not a full solution, but as a temporary measure we could also decrease the timeout on processes to at least reduce how long pending declare/drop exclusive locks block table access.
@CBroz1 @samuelbray32 I think we need to prioritize this for at least the most problematic tables (DLC and SpikeSorting, I assume). In the meantime, what options do we have to decrease the lockout time?
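One possible form of the timeout stopgap mentioned above: MySQL's `lock_wait_timeout` system variable sets how many seconds a session waits for a metadata lock before giving up. This is a minimal sketch, assuming a connection object that exposes a `query(sql)` method (as DataJoint connections do). The helper name `shorten_lock_wait` and the 60-second default are hypothetical, not values agreed on in this thread.

```python
def shorten_lock_wait(connection, seconds=60):
    """Limit how long this session waits for a metadata lock.

    `connection` is assumed to expose `query(sql)`; `seconds` is an
    illustrative value, not a recommendation from the thread.
    """
    seconds = int(seconds)
    if seconds < 1:
        raise ValueError("lock_wait_timeout must be at least 1 second")
    statement = f"SET SESSION lock_wait_timeout = {seconds}"
    connection.query(statement)
    return statement
```

The setting only affects the session that issues it, so it would need to run in the session declaring the table. A shorter wait makes the pending exclusive request leave the queue sooner; it does not remove the underlying conflict.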
Adding symptomatic error from blocked table declaration to improve future searchability of the issue.
Describe the bug
- `User 1` executes a long-running populate call (e.g. spike sorting on a long interval with limited compute resources)
  - When populate begins, it starts a transaction with this query in datajoint:
    `self.query("START TRANSACTION WITH CONSISTENT SNAPSHOT")`
  - The transaction holds a shared lock on `AnalysisNwbfile`
- `User 2` attempts to declare a compute table with a reference to `AnalysisNwbfile`
  - Requires an exclusive lock on `AnalysisNwbfile` to alter it with the new fk-ref
  - Blocked by `User 1`'s shared lock
- `User 3+` attempts to access data via `fetch_nwb`
  - Requires a shared lock on `AnalysisNwbfile`
  - Blocked by `User 2`'s pending lock for table declaration
  - The `fetch_nwb` call stalls until `User 2`'s pending lock for table declaration times out and exits the queue

Symptoms
- Stalled `fetch_nwb` calls
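The blocking sequence between the three users can be reproduced with a toy model. This is a sketch only: `ToyMetadataLock` is a hypothetical class, and it uses a plain first-come queue, which is a simplification of MySQL's actual metadata lock scheduling. It is enough to show why a pending exclusive request stalls later shared requests.

```python
from collections import deque


class ToyMetadataLock:
    """Toy model of one table's metadata lock (not MySQL's real implementation)."""

    def __init__(self):
        self.holders = []       # (user, mode) pairs currently granted
        self.waiting = deque()  # (user, mode) pairs queued in arrival order

    def request(self, user, mode):
        """Ask for a 'shared' or 'exclusive' lock; return True if granted now."""
        if self._can_grant(mode) and not self.waiting:
            self.holders.append((user, mode))
            return True
        self.waiting.append((user, mode))
        return False

    def _can_grant(self, mode):
        if mode == "exclusive":
            return not self.holders
        return all(held == "shared" for _, held in self.holders)

    def leave(self, user):
        """Drop a user's lock or pending request (release or timeout), then
        grant whatever the head of the queue now allows."""
        self.holders = [h for h in self.holders if h[0] != user]
        self.waiting = deque(w for w in self.waiting if w[0] != user)
        while self.waiting and self._can_grant(self.waiting[0][1]):
            self.holders.append(self.waiting.popleft())


lock = ToyMetadataLock()
granted_1 = lock.request("User 1", "shared")     # long populate transaction
granted_2 = lock.request("User 2", "exclusive")  # table declaration, must wait
granted_3 = lock.request("User 3", "shared")     # fetch_nwb, queues behind User 2
lock.leave("User 2")                             # declaration times out
user_3_running = ("User 3", "shared") in lock.holders
```

In this model `User 3` is compatible with `User 1`'s shared lock the whole time, yet only proceeds once `User 2`'s pending request leaves the queue.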
Diagnostic Tools
This error was easiest to debug at the MySQL level. Useful queries are provided here for future reference.
- Watch active SQL processes
  - Requires SQL admin privileges to view other users
- Get information on processes holding/pending locks on a table
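For readers who need a starting point, the two diagnostics could take roughly the following form. These are not necessarily the queries the author ran. The sketch assumes MySQL 5.7 or later with the `performance_schema` metadata lock instrument enabled, and a DB-API cursor that uses `%s` placeholders (as `pymysql` does). The function names are hypothetical.

```python
def processlist_query():
    """Statement listing every active connection and what it is running."""
    return "SHOW FULL PROCESSLIST"


def metadata_lock_query():
    """Parameterized statement joining metadata locks to their owning sessions.

    Uses the performance_schema tables shipped with MySQL 5.7+; the
    metadata-lock instrument must be enabled (it is by default in 8.0).
    """
    return (
        "SELECT t.PROCESSLIST_ID, t.PROCESSLIST_USER, t.PROCESSLIST_TIME, "
        "ml.LOCK_TYPE, ml.LOCK_STATUS "
        "FROM performance_schema.metadata_locks AS ml "
        "JOIN performance_schema.threads AS t "
        "ON ml.OWNER_THREAD_ID = t.THREAD_ID "
        "WHERE ml.OBJECT_SCHEMA = %s AND ml.OBJECT_NAME = %s"
    )


def lock_report(cursor, schema, table):
    """Run the lock query on a DB-API cursor and return all rows."""
    cursor.execute(metadata_lock_query(), (schema, table))
    return cursor.fetchall()
```

Rows whose `LOCK_STATUS` is `PENDING` are waiting requests, and rows marked `GRANTED` identify the sessions they are waiting on. The schema and table names passed to `lock_report` are the MySQL names, which differ from the Python class name.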
Looking Forward
This issue arose from a particularly long (~24+ hour) populate call on a single row. However, factors that increase the likelihood of such an event include:
- References to `AnalysisNwbfile` in user custom tables (`AnalysisNwbfile` currently has ~180 children), increasing the odds of conflicting lock requests

Potential Redresses
1. Precompute results
- Compute results in `populate` prior to calling make.
- Could build on a custom `populate` call in the mixin class to improve parallelization functionality (Non-daemon parallel populate #1001)
- Move computation from the `make` function to a new `compute` function
- Store the results in a `TempAnalysisNwbfile` table.
- `make()` would need to be passed a key to this table and would be responsible for moving the data into `AnalysisNwbfile`
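The `compute`/`make` split might look roughly like the following. This is a plain-Python sketch under stated assumptions, not DataJoint or Spyglass code: `ToyPipelineTable`, its dictionaries, and the timing context manager are hypothetical stand-ins. The point is that the slow step runs with no transaction open, so locks are held only for the brief move into the final table.

```python
import contextlib
import time


class ToyPipelineTable:
    """Stand-in for a computed table; not the DataJoint or Spyglass API."""

    def __init__(self):
        self.temp_results = {}   # plays the role of TempAnalysisNwbfile
        self.final_results = {}  # plays the role of AnalysisNwbfile
        self.seconds_in_transaction = 0.0

    @contextlib.contextmanager
    def transaction(self):
        """Time how long the (pretend) transaction stays open."""
        start = time.perf_counter()
        try:
            yield
        finally:
            self.seconds_in_transaction += time.perf_counter() - start

    def compute(self, key):
        """Long-running step, run with no transaction open."""
        time.sleep(0.05)  # placeholder for sorting, decoding, etc.
        self.temp_results[key] = f"result for {key}"
        return key

    def make(self, temp_key):
        """Short step: move the stored result into the final table."""
        self.final_results[temp_key] = self.temp_results.pop(temp_key)

    def populate(self, keys):
        for key in keys:
            temp_key = self.compute(key)  # no locks held here
            with self.transaction():      # locks held only for the move
                self.make(temp_key)


table = ToyPipelineTable()
table.populate(["session_a", "session_b"])
```

After the run, `table.seconds_in_transaction` stays far below the time spent in `compute`, which is the property the proposal is after.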
2. Database scheduling
3. Reduce foreign key references to a single table
- Create `AnalysisNwbfile_{pipeline}` tables
- Make `AnalysisNwbfile` a mixin class.
  - Users do not work with `AnalysisNwbfile` as directly, so this solution might avoid the concerns raised by similar ideas for `IntervalList`
  - Related: `IntervalList` cautious insert #943
- `AnalysisNwbfileKachery`
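A rough illustration of the per-pipeline idea, in plain Python rather than DataJoint: shared behaviour lives in a mixin, and each pipeline gets its own `AnalysisNwbfile_{pipeline}` class, so foreign key references are spread across several tables instead of one. The mixin name, the factory function, and the pipeline and child table names are all hypothetical.

```python
class AnalysisFileMixin:
    """Shared behaviour for every per-pipeline file table (hypothetical)."""

    pipeline = None

    def __init__(self):
        self.children = []  # names of tables holding a foreign key to this one

    def add_child(self, table_name):
        self.children.append(table_name)


def make_pipeline_table(pipeline):
    """Build a class named AnalysisNwbfile_<pipeline> from the mixin."""
    return type(
        f"AnalysisNwbfile_{pipeline}",
        (AnalysisFileMixin,),
        {"pipeline": pipeline},
    )


# Illustrative pipeline names, not a list taken from Spyglass.
tables = {name: make_pipeline_table(name)() for name in ("spikesorting", "lfp")}
tables["spikesorting"].add_child("CustomSortingTable")
tables["lfp"].add_child("CustomLfpTable")
```

With this layout, declaring a new child of the spike sorting file table would request an exclusive lock only on that table, leaving readers of the other pipelines' tables unaffected.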