4 node cluster - fail to deploy DB over the network on startup #8994
Comments
I took a thread dump while the transfer seemed to be stuck. Is this relevant? "OrientDB SyncDatabase node=node_name db=db_name@15422" prio=5 tid=0x107 nid=NA waiting
This issue is preventing us from deploying to production, any comment appreciated.
13 days and no comment - should I assume no one is looking at submitted issues? Is there any other way to get help?
Hi, many of these problems have been fixed in more recent versions. If you can update, this is quite likely fixed.
OrientDB Version: 3.0.23
Java Version: 11.0.2
OS: Windows 10
Expected behavior
4 node cluster, all nodes are master for all data, embedded in a Servlet container (Jetty).
All 4 nodes should join the cluster
Actual behavior
The DB fails to deploy, usually on the 3rd or 4th node and usually at the 2nd chunk (out of 6).
The receiving node logs the following exception:
java.io.EOFException: Unexpected end of ZLIB input stream
at java.base/java.util.zip.InflaterInputStream.fill(InflaterInputStream.java:245)
at java.base/java.util.zip.InflaterInputStream.read(InflaterInputStream.java:159)
at java.base/java.util.zip.ZipInputStream.read(ZipInputStream.java:195)
at com.orientechnologies.common.io.OIOUtils.copyStream(OIOUtils.java:205)
at com.orientechnologies.orient.core.compression.impl.OZIPCompressionUtil.extractFile(OZIPCompressionUtil.java:97)
at com.orientechnologies.orient.core.compression.impl.OZIPCompressionUtil.uncompressDirectory(OZIPCompressionUtil.java:83)
at com.orientechnologies.orient.core.storage.disk.OLocalPaginatedStorage.restore(OLocalPaginatedStorage.java:294)
at com.orientechnologies.orient.core.db.OrientDBEmbedded.restore(OrientDBEmbedded.java:418)
at com.orientechnologies.orient.server.distributed.impl.ODistributedAbstractPlugin$6.call(ODistributedAbstractPlugin.java:1991)
at com.orientechnologies.orient.server.distributed.impl.ODistributedAbstractPlugin$6.call(ODistributedAbstractPlugin.java:1930)
at com.orientechnologies.orient.server.distributed.impl.ODistributedAbstractPlugin.executeInDistributedDatabaseLock(ODistributedAbstractPlugin.java:1770)
at com.orientechnologies.orient.server.distributed.impl.ODistributedAbstractPlugin.installDatabaseOnLocalNode(ODistributedAbstractPlugin.java:1930)
at com.orientechnologies.orient.server.distributed.impl.ODistributedAbstractPlugin.installDatabaseFromNetwork(ODistributedAbstractPlugin.java:1597)
at com.orientechnologies.orient.server.distributed.impl.ODistributedAbstractPlugin.requestDatabaseFullSync(ODistributedAbstractPlugin.java:1418)
at com.orientechnologies.orient.server.distributed.impl.ODistributedAbstractPlugin.requestFullDatabase(ODistributedAbstractPlugin.java:1100)
at com.orientechnologies.orient.server.distributed.impl.ODistributedAbstractPlugin$3.call(ODistributedAbstractPlugin.java:997)
at com.orientechnologies.orient.server.distributed.impl.ODistributedAbstractPlugin$3.call(ODistributedAbstractPlugin.java:948)
at com.orientechnologies.orient.server.distributed.impl.ODistributedAbstractPlugin.executeInDistributedDatabaseLock(ODistributedAbstractPlugin.java:1770)
at com.orientechnologies.orient.server.distributed.impl.ODistributedAbstractPlugin.installDatabase(ODistributedAbstractPlugin.java:947)
at com.orientechnologies.orient.server.hazelcast.OHazelcastPlugin.installNewDatabasesFromCluster(OHazelcastPlugin.java:1439)
at com.orientechnologies.orient.server.hazelcast.OHazelcastPlugin.startup(OHazelcastPlugin.java:300)
at com.orientechnologies.orient.server.OServer.registerPlugins(OServer.java:1194)
at com.orientechnologies.orient.server.OServer.activate(OServer.java:469)
com.orientechnologies.orient.core.exception.ODatabaseException: Cannot create database 'db_name'
at com.orientechnologies.orient.core.db.OrientDBEmbedded.restore(OrientDBEmbedded.java:424)
at com.orientechnologies.orient.server.distributed.impl.ODistributedAbstractPlugin$6.call(ODistributedAbstractPlugin.java:1991)
at com.orientechnologies.orient.server.distributed.impl.ODistributedAbstractPlugin$6.call(ODistributedAbstractPlugin.java:1930)
at com.orientechnologies.orient.server.distributed.impl.ODistributedAbstractPlugin.executeInDistributedDatabaseLock(ODistributedAbstractPlugin.java:1770)
at com.orientechnologies.orient.server.distributed.impl.ODistributedAbstractPlugin.installDatabaseOnLocalNode(ODistributedAbstractPlugin.java:1930)
at com.orientechnologies.orient.server.distributed.impl.ODistributedAbstractPlugin.installDatabaseFromNetwork(ODistributedAbstractPlugin.java:1597)
at com.orientechnologies.orient.server.distributed.impl.ODistributedAbstractPlugin.requestDatabaseFullSync(ODistributedAbstractPlugin.java:1418)
at com.orientechnologies.orient.server.distributed.impl.ODistributedAbstractPlugin.requestFullDatabase(ODistributedAbstractPlugin.java:1100)
at com.orientechnologies.orient.server.distributed.impl.ODistributedAbstractPlugin$3.call(ODistributedAbstractPlugin.java:997)
at com.orientechnologies.orient.server.distributed.impl.ODistributedAbstractPlugin$3.call(ODistributedAbstractPlugin.java:948)
at com.orientechnologies.orient.server.distributed.impl.ODistributedAbstractPlugin.executeInDistributedDatabaseLock(ODistributedAbstractPlugin.java:1770)
at com.orientechnologies.orient.server.distributed.impl.ODistributedAbstractPlugin.installDatabase(ODistributedAbstractPlugin.java:947)
at com.orientechnologies.orient.server.hazelcast.OHazelcastPlugin.installNewDatabasesFromCluster(OHazelcastPlugin.java:1439)
at com.orientechnologies.orient.server.hazelcast.OHazelcastPlugin.startup(OHazelcastPlugin.java:300)
at com.orientechnologies.orient.server.OServer.registerPlugins(OServer.java:1194)
at com.orientechnologies.orient.server.OServer.activate(OServer.java:469)
Caused by: java.lang.RuntimeException: java.io.EOFException: Unexpected end of ZLIB input stream
at com.orientechnologies.orient.core.storage.impl.local.OAbstractPaginatedStorage.logAndPrepareForRethrow(OAbstractPaginatedStorage.java:5918)
at com.orientechnologies.orient.core.storage.disk.OLocalPaginatedStorage.restore(OLocalPaginatedStorage.java:330)
at com.orientechnologies.orient.core.db.OrientDBEmbedded.restore(OrientDBEmbedded.java:418)
... 88 more
Caused by: java.io.EOFException: Unexpected end of ZLIB input stream
at java.base/java.util.zip.InflaterInputStream.fill(InflaterInputStream.java:245)
at java.base/java.util.zip.InflaterInputStream.read(InflaterInputStream.java:159)
at java.base/java.util.zip.ZipInputStream.read(ZipInputStream.java:195)
at com.orientechnologies.common.io.OIOUtils.copyStream(OIOUtils.java:205)
at com.orientechnologies.orient.core.compression.impl.OZIPCompressionUtil.extractFile(OZIPCompressionUtil.java:97)
at com.orientechnologies.orient.core.compression.impl.OZIPCompressionUtil.uncompressDirectory(OZIPCompressionUtil.java:83)
at com.orientechnologies.orient.core.storage.disk.OLocalPaginatedStorage.restore(OLocalPaginatedStorage.java:294)
... 89 more
On the sending node there is no error.
After this error, the node that sent the data stays in BACKUP status and doesn't recover. The receiving node stays in SYNCHRONIZING status and doesn't recover either.
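For context on the exception itself: `java.util.zip` raises this EOFException when a ZIP stream ends before the compressed data of an entry is complete. The sketch below reproduces that with plain JDK classes only. It is an illustration under assumptions, not the OrientDB code path: the class name `TruncatedZipDemo`, the entry name `chunk.bin`, and the choice to cut the archive in half are all made up, and it says nothing about why the stream ends early in the cluster.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.EOFException;
import java.io.IOException;
import java.util.Arrays;
import java.util.Random;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

// Hypothetical demo class; not part of OrientDB.
class TruncatedZipDemo {

    // Compress the payload into a single-entry ZIP archive held in memory.
    static byte[] zip(byte[] payload) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (ZipOutputStream zos = new ZipOutputStream(out)) {
            zos.putNextEntry(new ZipEntry("chunk.bin")); // entry name is made up
            zos.write(payload);
            zos.closeEntry();
        }
        return out.toByteArray();
    }

    // Read the first entry back with a plain copy loop.
    static byte[] unzip(byte[] archive) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (ZipInputStream zis = new ZipInputStream(new ByteArrayInputStream(archive))) {
            if (zis.getNextEntry() == null) {
                throw new IOException("archive has no entries");
            }
            byte[] buffer = new byte[8192];
            int n;
            while ((n = zis.read(buffer)) != -1) {
                out.write(buffer, 0, n);
            }
        }
        return out.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        // Random bytes barely compress, so the archive stays large enough to cut mid-entry.
        byte[] payload = new byte[64 * 1024];
        new Random(42).nextBytes(payload);

        byte[] archive = zip(payload);
        System.out.println("intact round trip ok: " + Arrays.equals(payload, unzip(archive)));

        // Simulate a transfer that delivered only the first half of the archive.
        byte[] truncated = Arrays.copyOf(archive, archive.length / 2);
        try {
            unzip(truncated);
            System.out.println("no error (unexpected)");
        } catch (EOFException e) {
            System.out.println("truncated archive: " + e);
        }
    }
}
```

If the same pattern applies here, the receiving node would be restoring from data that is shorter than what the sending node produced, which would also fit the sending node reporting no error. That is an inference from the stack trace, not something confirmed.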
Steps to reproduce
I can privately share the database files that reproduce the issue.
It seems to happen more often on nodes with fewer processors: always with 2 processors, less frequently with 4. I don't have a 4 node setup with more than 4 processors, so I can't tell beyond that.