-
Notifications
You must be signed in to change notification settings - Fork 871
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Management of record locks #3631
Comments
This could be also a timeout problem. Could you explain better how it happens? |
i do not have any idea what cause about this issue. It will happen but not too often. anyway. i have upgrade orientdb to latest version. let me try the latest version and let you know if i facing this same issue again. thanks |
Maybe relative to this #3593 |
@ivca the following query cause the OTimeoutException: Can not lock record for 2000 ms. seems record is deadlocked by other record. but it wont' happen too often. anyidea? BEGIN |
checked. the issue is related to storage.record.lockTimeout. |
yes, if it fix this issue so no problems with this. |
so what is the right value to be adjusted for locktimeout. it is not too good if i set the locktimeout to a greater/max value..right? |
public int next(OrientBaseGraph graph, String class) {
String query = String.format(
"update (select from osequence where classname = '%s' ) increment value = 1 RETURN after @this lock record", class) ;
Iterable<OrientElement> iterable = graph.command(new OCommandSQL(query)).execute();
return iterable.iterator().next().getProperty("value");
} i'm gettingCaused by: com.orientechnologies.common.concur.OTimeoutException: Can not lock record for 2000 ms. seems record is deadlocked by other record if i call the above API in a multi threaded environment . I spawn a thread say 20 times and my run method is public void run() {
ograph = factory.getTx();
createGraph(oHelper.next( ograph , "Test"));
ograph.commit();
ograph.shutdown();
} the first thread gets the sequencer but the next n number of threads fail with the exception as in the bug . i need immediate solution , please help |
@anmoldeep0123 Why locking the record? |
@lvca need to lock the record or I end up in concurrent modification exception, in multi threaded environment, and we do not want to catch the exception and try the default locking behaviour, I think the optimistic locking. |
The same happened to me ( version 2.0.5). I use heavily LOCK RECORD and have never seen that error before with older OrientDB. All my queries are optimized&fast mostly by RID com.orientechnologies.orient.core.exception.ODatabaseException: Error on retrieving record #46:916 (cluster: XXXXX)
at com.orientechnologies.orient.core.db.document.ODatabaseDocumentTx.executeReadRecord(ODatabaseDocumentTx.java:1605)
at com.orientechnologies.orient.core.tx.OTransactionNoTx.loadRecord(OTransactionNoTx.java:80)
at com.orientechnologies.orient.core.db.document.ODatabaseDocumentTx.load(ODatabaseDocumentTx.java:1437)
at com.orientechnologies.orient.core.db.document.ODatabaseDocumentTx.load(ODatabaseDocumentTx.java:117)
at com.orientechnologies.orient.core.iterator.OIdentifiableIterator.readCurrentRecord(OIdentifiableIterator.java:278)
at com.orientechnologies.orient.core.iterator.ORecordIteratorClusters.hasNext(ORecordIteratorClusters.java:147)
at com.orientechnologies.orient.core.sql.OCommandExecutorSQLSelect.fetchFromTarget(OCommandExecutorSQLSelect.java:1300)
at com.orientechnologies.orient.core.sql.OCommandExecutorSQLSelect.executeSearch(OCommandExecutorSQLSelect.java:433)
at com.orientechnologies.orient.core.sql.OCommandExecutorSQLSelect.execute(OCommandExecutorSQLSelect.java:391)
at com.orientechnologies.orient.core.sql.OCommandExecutorSQLDelegate.execute(OCommandExecutorSQLDelegate.java:64)
at com.orientechnologies.orient.core.storage.impl.local.OAbstractPaginatedStorage.executeCommand(OAbstractPaginatedStorage.java:1189)
at com.orientechnologies.orient.core.storage.impl.local.OAbstractPaginatedStorage.command(OAbstractPaginatedStorage.java:1178)
at com.orientechnologies.orient.core.sql.query.OSQLQuery.run(OSQLQuery.java:71)
at com.orientechnologies.orient.core.query.OQueryAbstract.execute(OQueryAbstract.java:33)
at com.orientechnologies.orient.core.db.document.ODatabaseDocumentTx.query(ODatabaseDocumentTx.java:630)
at com.orientechnologies.orient.core.sql.OCommandExecutorSQLUpdate.execute(OCommandExecutorSQLUpdate.java:214)
at com.orientechnologies.orient.core.sql.OCommandExecutorSQLDelegate.execute(OCommandExecutorSQLDelegate.java:64)
at com.orientechnologies.orient.core.storage.impl.local.OAbstractPaginatedStorage.executeCommand(OAbstractPaginatedStorage.java:1189)
at com.orientechnologies.orient.core.storage.impl.local.OAbstractPaginatedStorage.command(OAbstractPaginatedStorage.java:1178)
at com.orientechnologies.orient.core.command.OCommandRequestTextAbstract.execute(OCommandRequestTextAbstract.java:63)
at com.orientechnologies.orient.server.network.protocol.binary.ONetworkProtocolBinary.command(ONetworkProtocolBinary.java:1178)
at com.orientechnologies.orient.server.network.protocol.binary.ONetworkProtocolBinary.executeRequest(ONetworkProtocolBinary.java:385)
at com.orientechnologies.orient.server.network.protocol.binary.OBinaryNetworkProtocolAbstract.execute(OBinaryNetworkProtocolAbstract.java:220)
at com.orientechnologies.common.thread.OSoftThread.run(OSoftThread.java:69)
Caused by: com.orientechnologies.common.concur.OTimeoutException: Can not lock record for 2000 ms. seems record is deadlocked by other record
at com.orientechnologies.orient.core.storage.impl.local.OAbstractPaginatedStorage.acquireWriteLock(OAbstractPaginatedStorage.java:1293)
at com.orientechnologies.orient.core.tx.OTransactionAbstract.lockRecord(OTransactionAbstract.java:118)
at com.orientechnologies.orient.core.id.ORecordId.lock(ORecordId.java:282)
at com.orientechnologies.orient.core.storage.impl.local.OAbstractPaginatedStorage.lockRecord(OAbstractPaginatedStorage.java:1790)
at com.orientechnologies.orient.core.storage.impl.local.OAbstractPaginatedStorage.readRecord(OAbstractPaginatedStorage.java:1430)
at com.orientechnologies.orient.core.storage.impl.local.OAbstractPaginatedStorage.readRecord(OAbstractPaginatedStorage.java:694)
at com.orientechnologies.orient.core.db.document.ODatabaseDocumentTx.executeReadRecord(ODatabaseDocumentTx.java:1572)
... 23 more
`` |
Maybe on highly concurrent clients the default value of 2 seconds it's too low. Could you try setting this value higher? Put this in last section of config/orientdb-server-config.xml file. Below an example with 10secs:
|
Ok, small update. This was a first time when I started to use UPDATE with following syntax UPDATE Customer SET FormNotificationStage = 2 RETURN AFTER LOCK RECORD WHERE FormNotificationStage=1 LIMIT 1 It gets a record from a waiting list by many remote clients (but in that case ONE). Usually I do UPDATE RID SET .... LOCK RECORD and so on. Strange but deadlocks appeared on record 0:1 ?! |
One more info. There is no records matching WHERE criteria (even no records in Customer). It's enough to get stuck with infinite lock. |
Update: orientdb> connect remote:localhost/X root root
Connecting to database [remote:localhost/X] with user 'root'...OK
orientdb {db=X}> UPDATE V SET NotExistingField = 2 RETURN AFTER LOCK RECORD WHERE NotExistingField=1 LIMIT 1
Updated record(s) '[]' in 0,751000 sec(s).
orientdb {db=X}> AFTER IT, LAUNCH CONSOLE 2 orientdb> connect remote:localhost/X root root
Connecting to database [remote:localhost/X] with user 'root'...
orientdb>
Error: com.orientechnologies.common.exception.OException: Error on creation of shared resource
Error: com.orientechnologies.orient.core.exception.ORecordNotFoundException: The record with id '#0:1' not found
Error: com.orientechnologies.orient.core.exception.ODatabaseException: Error on retrieving record #0:1 (cluster: internal)
Error: com.orientechnologies.orient.core.exception.ODatabaseException: Error on retrieving record #0:1 (cluster: internal)
Error: com.orientechnologies.common.concur.OTimeoutException: Can not lock record for 2000 ms. seems record is deadlocked by other record
|
I spent couple of hours analyzing record locking system. Maybe I don't fully understand it (so excuse me if I am wrong) but I strongly advice to review it. Let me share some of my concerns I hope it would be useful to build better OrientDB. 1. UPDATE with LOCK RECORD
UPDATE 10MILLION_OF_RECORD_CLASS SET NotExistingField = 2 RETURN AFTER LOCK RECORD WHERE OnlyOneRecordMatchThisExp=1 10 milions of time EX lock will be acquired. Shouldn't a shared lock be upgraded to EX when filtering matches? http://sqlblog.com/blogs/kalen_delaney/archive/2009/11/13/update-locks.aspx "While SQL Server is searching, it shouldn’t need to acquire an EXCLUSIVE lock; it only needs the EXCLUSIVE lock when the data to be changed is found" 2. EX locks are not fully released?Second my big concern is method unlockRecord Even if it's correct from some reason (but I believe it's not) it should be better commented why EX is omitted from releasing locks. Someone can blindly believe that it does no matter what kind of strategy is used. I checked it and in endResponse of binary protocol I could easily find in storage write locks not released during UPDATE LOCK RECORD operation. I believe I shouldn't find any acquired lock in locks buckets of PaginatedStorage. 3. EX lock acquired twiceDuring UPDATE operation EX lock is acquired and released twice with "kind of break" among locks. First time here within assignTarget (acquired and released) Second time here (acquired and released) I am not sure what would happen if another transaction would try to do the same interfering during break. 4. LOCKS BUCKETS in the future can generate deadlocks in transactionsOrientDB uses lock bucket based on hash of the RecordID. It's not predictable for end user to modify records in order guaranteeing deadlock avoidance. It might be a problem when locks are kept across transaction scope. 5. DEADLOCK definitionI am not sure if all issues should be classified as a deadlock. Lock acquiring timeout is much better. Since OrientDB keeps lock only for one record classical deadlock is rather not possible unless another locked internal resources are involved in different order. |
#3841 may refer to mentioned issues |
This library would be really good instead of ReentrantReadWriteLock |
Guys, If you like it I may clean it and prepare PR. |
Hi, implementation of record locks was changed in 2.0 rc1 . Before this version we had situation when UPDATE SQL expressions left some records unlocked. Could you double check now whether you still have this issue ? |
About your comments some of them really worth of considering. I mean 4 and 5. For 5-th item we are going to create lock manager to avoid deadlocks in tx. |
About items from 1 to 3 could you update links to the code so we may discuss them farther. |
About 4-th item we will use map based lock manager. |
About update lock it was not supported since 2013 do not think it is good idea to use not supported library but we may borrow idea and implement our own lock. |
The main concern is about OAbstractPaginatedStorage.unlockRecord() As you see it does nothing for all lockingStrategy different than DEFAULT. I found the place where it is released and understood why but it may be error prone in the future. It would be good to use locking mechanism which calling sequence is immune to missing lock releasing like callInLock() After fixing #3841 I didnt catch any problem like this. As far as I remember line acquired lock once for initializing iterator when UPDATE CLASS is being executed. was a second place when lock was acquired (because of Reetrant) but released only once. |
yes , you are right. such approach is fragile but we work on mechanics when if run everything in tx mode and rollback/commit of tx will unlock all locked records. |
Fixed in issue #3641 . |
I'm very regret that we still have this issue for version 2.1-rc2. can you guys fix this issue again |
which one ? |
more precisely in which circumstances you have this issue . Because according to @kowalot this issue is fixed. |
the issue is still the same for #3641 that i have lock timeout and com.orientechnologies.common.concur.lock.OLockException: Cannot unlock a never acquired lock. thanks. |
FYI. make sure you try to run your script at parallel way like what i have mentioned in #3641. I bet you will have this issue if you run any update/select query in parallel. |
do you have any idea about following issue?
caused by: com.orientechnologies.common.concur.OTimeoutException: Can not lock record for 2000 ms. seems record is deadlocked by other record
The text was updated successfully, but these errors were encountered: