
Import operations failing in distributed mode #9073

Closed
tacho opened this issue Nov 25, 2019 · 5 comments
tacho commented Nov 25, 2019

OrientDB Version: 3.0.25

Java Version: 11.0.5+10-LTS

OS: CentOS Linux 7.7.1908

Expected behavior

When running a 3 node distributed cluster with one master node and two replicas, everything should work fine.

Actual behavior

Import operations fail with
Code: 500, Content: com.orientechnologies.orient.server.distributed.ODistributedException: Quorum (1) cannot be reached on server 'odb-staging0' database 'TestDB' because it is major than available nodes (0)
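The check implied by the error message can be sketched as follows. This is a hypothetical illustration, not OrientDB's actual code: a distributed write succeeds only if the number of nodes available for the request is at least the configured quorum.

```python
# Hypothetical sketch (not OrientDB's implementation) of the quorum check
# the error message describes: the operation fails when the configured
# quorum exceeds the number of available nodes.
def can_reach_quorum(quorum: int, available_nodes: int) -> bool:
    return available_nodes >= quorum

# The error reports quorum=1 but 0 available nodes, so even the minimum
# possible quorum fails: the coordinator is apparently not counting any
# node, including itself, as available for the import.
print(can_reach_quorum(1, 0))  # False
print(can_reach_quorum(1, 1))  # True
```

The notable part of the error is not the quorum value (1 is the minimum) but the available-node count of 0, even though all three servers report ONLINE below.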

Some other operations, such as ALTER CLASS, also fail intermittently in distributed mode, although they succeed every time in single-node mode.

The server status is:

Nov 25 07:07:37 odb-staging0 server.sh[24944]: +-----+------+------------------+----------------+----------------+-------------------+--------------------+-------------------------+
Nov 25 07:07:37 odb-staging0 server.sh[24944]: |Conns|Status|Name              |Binary          |HTTP            |StartedOn          |UsedMemory          |Databases                |
Nov 25 07:07:37 odb-staging0 server.sh[24944]: +-----+------+------------------+----------------+----------------+-------------------+--------------------+-------------------------+
Nov 25 07:07:37 odb-staging0 server.sh[24944]: |4    |ONLINE|odb-staging1      |10.165.195.21...|10.165.195.21...|2019-11-24 19:38...|720.09MB/22.00GB ...|TestDB=ONLINE (REPLICA)  |
Nov 25 07:07:37 odb-staging0 server.sh[24944]: |4    |ONLINE|odb-staging2      |10.165.195.22...|10.165.195.22...|2019-11-24 19:40...|3.56GB/22.00GB (1...|TestDB=ONLINE (REPLICA)  |
Nov 25 07:07:37 odb-staging0 server.sh[24944]: |5    |ONLINE|odb-staging0(*)(@)|10.165.195.20...|10.165.195.20...|2019-11-24 19:31...|2.81GB/22.00GB (1...|TestDB=ONLINE (MASTER)   |
Nov 25 07:07:37 odb-staging0 server.sh[24944]: +-----+------+------------------+----------------+----------------+-------------------+--------------------+-------------------------+

Steps to reproduce

[root@odb-staging0 config]# cat default-distributed-db-config.json
{
  "autoDeploy": true,
  "readQuorum": 1,
  "writeQuorum": 1, // Also tried with "majority"
  "executionMode": "undefined",
  "readYourWrites": true,
  "newNodeStrategy": "static", // Also tried with "dynamic"
  "servers": {
    "odb-staging0": "master",
    "odb-staging1": "replica",
    "odb-staging2": "replica"
  },
  "clusters": {
    "internal": {
    },
    "*": {
      "servers": ["<NEW_NODE>"]
    }
  }
}
[root@odb-staging0 config]#
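For reference, the "majority" value that was also tried for `writeQuorum` is conventionally derived as follows. This is a hedged sketch based on OrientDB's documented behavior that REPLICA servers do not count toward the write quorum; it is not taken from this cluster's runtime state.

```python
# Hedged sketch: conventional "majority" quorum computation. Per the
# OrientDB distributed docs, only MASTER servers count toward the write
# quorum, so with 1 master and 2 replicas (as in the config above) the
# majority is computed over a single node (assumption from the docs).
def majority_quorum(master_count: int) -> int:
    return master_count // 2 + 1

print(majority_quorum(1))  # 1 -> matches the "Quorum (1)" in the error
print(majority_quorum(3))  # 2 -> what a 3-master cluster would require
```

With one master, "majority" and the explicit `"writeQuorum": 1` are therefore equivalent, which is consistent with both settings producing the same failure.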
From orientdb-server-config.xml (the Hazelcast plugin handler):

        <handler class="com.orientechnologies.orient.server.hazelcast.OHazelcastPlugin">
            <parameters>
                <parameter value="odb-staging0" name="nodeName"/>
                <parameter value="true" name="enabled"/>
                <parameter value="${ORIENTDB_HOME}/config/default-distributed-db-config.json" name="configuration.db.default"/>
                <parameter value="${ORIENTDB_HOME}/config/hazelcast.xml" name="configuration.hazelcast"/>
            </parameters>
        </handler>

From hazelcast.xml (the network join section):

        <port auto-increment="false">2434</port>
        <join>
            <multicast enabled="false">
            </multicast>
            <tcp-ip enabled="true">
                <member>10.165.195.20</member>
                <member>10.165.195.21</member>
                <member>10.165.195.22</member>
            </tcp-ip>
        </join>
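When a node reports ONLINE but still counts 0 available peers, a basic sanity check is that every member listed in the `<tcp-ip>` join section accepts connections on the configured Hazelcast port. The helper below is a hypothetical troubleshooting script (not part of OrientDB); the member IPs and port 2434 are taken from the config above.

```python
# Hypothetical helper: verify each Hazelcast member from the <tcp-ip>
# join section accepts TCP connections on the configured port (2434).
import socket

MEMBERS = ["10.165.195.20", "10.165.195.21", "10.165.195.22"]
PORT = 2434  # <port auto-increment="false">2434</port> from hazelcast.xml

def is_reachable(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

if __name__ == "__main__":
    for host in MEMBERS:
        state = "reachable" if is_reachable(host, PORT) else "UNREACHABLE"
        print(f"{host}:{PORT} {state}")
```

This only rules out basic connectivity problems; it does not exercise the Hazelcast protocol itself.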
tglman commented Dec 10, 2019

Hi @tacho,

Do you have any errors in the server log when you run the import? Could you share the server log with us? Also, if you have a specific import that reproduces the problem, it would be great if you could share it.

For the ALTER CLASS problem, could you open a separate issue specifying the kind of ALTER you are trying to run (the SQL query would be good)?

The current info is good, but it only identifies the environment, not the problem.

Regards

tacho commented Dec 10, 2019

Hello,

I'll have to reprovision the test cluster I had in order to give you server logs, but if I recall correctly there were no messages related to this in them, just the client error that I pasted above.

Any database import was failing, but I'll try to prepare a small one as a test case.

tacho commented Dec 11, 2019

odb-gh-9073.tar.gz

Here's an archive with the server logs, a sample DB and all three nodes' configurations.

The password hashes inside are with obviously insecure passwords, as this is a disposable test cluster.

As you'll see, there wasn't any error in the server log w.r.t. the database import, and console.sh gave me this:

Connecting to database [remote:10.184.243.3/testdb1] with user 'admin'...OK
orientdb {db=testdb1}> import database /Users/tacho/Downloads/testdb1.gz

Importing database database /Users/tacho/Downloads/testdb1.gz...
Error: com.orientechnologies.orient.core.exception.OStorageException: Error sending import request
	DB name="testdb1"

Error: com.orientechnologies.orient.core.db.tool.ODatabaseExportException: Error on importing database 'testdb1' from file: /tmp/import15340545639861068269testdb1.gz

Error: com.orientechnologies.orient.server.distributed.ODistributedException: Quorum (1) cannot be reached on server 'odb-rpltest0' database 'testdb1' because it is major than available nodes (0)

orientdb {db=testdb1}>

tacho commented Dec 11, 2019

For the record, the logs are from 3.0.25, but I just upgraded the test "cluster" to 3.0.26 and the same error occurs.

tglman commented Dec 13, 2019

Hi @tacho,

Thank you for the details. I was able to reproduce and fix the problem; the fix will be released in the next hotfix, 3.0.27.

bye

@tglman tglman added this to the 3.0.27 milestone Jan 2, 2020