DEADLOCK in distributed Mode after: Found null entry in ridbag with rid=#245:-13 #9081

Closed
jonsalvas opened this issue Dec 3, 2019 · 1 comment

@jonsalvas

OrientDB Version: 3.0.25

Java Version: OrientDB Docker Image

OS: OrientDB Docker Image

Expected behavior

Vertices and their edges can be updated as usual in distributed mode.

Actual behavior

  1. OConcurrentModificationExceptions are thrown for the record #119:2: com.orientechnologies.orient.core.exception.OConcurrentModificationException: Cannot UPDATE the record #119:2 because the version is not the latest. Probably you are updating an old record or it has been modified by another user (db=v73 your=v72)
  2. Our retry logic retries the transaction 5 times. After retrying 2 times we receive:
    [OHazelcastPlugin]Exception 2FE1AD2D in storage plocal:/orientdb/databases/xxx: 3.0.25 - Veloce (build 2a229d5, branch UNKNOWN)
    com.orientechnologies.orient.core.exception.OSerializationException: Found null entry in ridbag with rid=#245:-13
    DB name="xxx"
  3. From then on the database is completely blocked. Even if we cut the load from the database and only insert one single vertex, we receive:
    com.orientechnologies.orient.server.distributed.task.ODistributedOperationException: quorum of '3' not reached, responses: [node: xxx-odb-orientdb-0 success,concurrent modification record (node xxx-odb-orientdb-2): #119:2 database version: 109 transaction version: 108,concurrent modification record (node xxx-odb-orientdb-1): #119:2 database version: 109 transaction version: 108]
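
A retry loop of the kind mentioned in step 2 could be sketched roughly as follows. This is only an illustration under stated assumptions, not the reporter's actual code and not an OrientDB API: `RetrySketch`, `retryOnConflict`, and `VersionConflictException` are hypothetical names, the exception class merely stands in for `OConcurrentModificationException`, and the limit of 5 is read here as the total number of attempts.

```java
import java.util.function.Supplier;

public class RetrySketch {
    /** Hypothetical stand-in for OConcurrentModificationException. */
    static class VersionConflictException extends RuntimeException {
        VersionConflictException(String message) {
            super(message);
        }
    }

    /**
     * Runs the transaction body up to maxAttempts times, retrying only on a
     * version conflict and waiting a little longer before each new attempt.
     */
    static <T> T retryOnConflict(int maxAttempts, long baseBackoffMillis, Supplier<T> tx) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        VersionConflictException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return tx.get();
            } catch (VersionConflictException e) {
                last = e;
                if (attempt < maxAttempts) {
                    try {
                        // Linear backoff so concurrent writers are less likely to collide again.
                        Thread.sleep(baseBackoffMillis * attempt);
                    } catch (InterruptedException ie) {
                        Thread.currentThread().interrupt();
                        throw e;
                    }
                }
            }
        }
        // All attempts hit a conflict: surface the last one to the caller.
        throw last;
    }

    public static void main(String[] args) {
        int[] calls = {0};
        // Simulated transaction: conflicts twice, then succeeds on the third attempt.
        String result = retryOnConflict(5, 10, () -> {
            if (++calls[0] < 3) {
                throw new VersionConflictException("db=v73 your=v72");
            }
            return "committed";
        });
        System.out.println(result + " after " + calls[0] + " attempts");
    }
}
```

A loop like this only retries version conflicts. Any other failure, such as the OSerializationException above, propagates to the caller immediately.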

Steps to reproduce

I don't have a minimal example to reproduce the issue, but I can reproduce it on our database. I can provide a heapdump or stacktrace of the database in the blocked state if required. Please let me know if you need more info.

@jonsalvas jonsalvas changed the title OrientDB in distributed Mode: Found null entry in ridbag with rid=#245:-13 DEADLOCK in distributed Mode: Found null entry in ridbag with rid=#245:-13 Dec 3, 2019
@jonsalvas jonsalvas changed the title DEADLOCK in distributed Mode: Found null entry in ridbag with rid=#245:-13 DEADLOCK in distributed Mode after: Found null entry in ridbag with rid=#245:-13 Dec 3, 2019
@wolf4ood
Member

Hi @jonsalvas

Yes, the heap dump and stack trace would be helpful, as well as a dump of the record that triggers the exception (in this case #119:2), taken from all the nodes.

You can send them to e.risa@sap.com.

Thanks

@tglman tglman self-assigned this Jun 1, 2020
@tglman tglman added the bug label Jun 1, 2020
@tglman tglman added this to the 3.0.x milestone Jun 1, 2020