
Support load balancing between several tarantool instances #246

Open
Totktonada opened this issue Nov 20, 2019 · 1 comment
Labels: enhancement, feature

Comments

@Totktonada
Member

We now have a cluster client that supports failover, but it does not allow balancing load between several Tarantool instances. Users often ask for this, however.

However, we provide certain guarantees: for example, that sync requests will be executed in order, or that JDBC batch updates (as well as async requests) will be executed in order under certain conditions: mainly, that the memtx engine is used and the batch contains no DDL requests.

We should not break those guarantees for the existing single-instance and cluster clients: at the very least, load balancing should be enabled by an explicit option, and we should clearly state which assumptions fail when requests are balanced across instances.

@Totktonada added the enhancement and feature labels on Nov 20, 2019
@Totktonada
Member Author

A raw idea, based on what @nicktorwald shared about PostgreSQL:

For example, an application can create two connection pools: one data source for writes, another for reads. The write pool limits connections to the master node only:

jdbc:postgresql://node1,node2,node3/accounting?targetServerType=master

And the read pool balances connections between slave nodes, but also allows connections to the master if no slaves are available:

jdbc:postgresql://node1,node2,node3/accounting?targetServerType=preferSlave&loadBalanceHosts=true

If a slave fails, all remaining slaves in the list will be tried first. If no slaves are available, the master will be tried. If all of the servers are marked as "can't connect" in the cache, an attempt will be made to connect to all of the hosts in the URL in order.
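The read-pool behavior above can be sketched as a host selector: round-robin over the slaves, skipping hosts marked unreachable, with the master as the fallback. This is a minimal illustration of the semantics, not the connector's actual API; all names here (ReadHostSelector, pickReadHost) are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical selector mirroring preferSlave + loadBalanceHosts semantics:
// balance reads over slaves, fall back to the master when no slave is up.
class ReadHostSelector {
    private final String master;
    private final List<String> slaves;
    private final AtomicInteger next = new AtomicInteger();

    ReadHostSelector(String master, List<String> slaves) {
        this.master = master;
        this.slaves = new ArrayList<>(slaves);
    }

    // 'down' is the set of hosts currently marked "can't connect".
    String pickReadHost(Set<String> down) {
        for (int i = 0; i < slaves.size(); i++) {
            // Round-robin cursor; floorMod keeps the index valid on overflow.
            String host = slaves.get(Math.floorMod(next.getAndIncrement(), slaves.size()));
            if (!down.contains(host)) {
                return host;
            }
        }
        // No slave available: allow the read to go to the master.
        return master;
    }
}
```

A real implementation would also need to refresh the "down" set, e.g. from connection failures or cluster discovery.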

The cluster discovery contract can be extended to report which instances are ready to accept write requests and which are not (using a second return value). An option (in the jdbc: URI or in the configuration) can control whether a particular connection instance supports write requests. If we get ER_READONLY, we should trigger cluster discovery and retry the request if it turns out that the current instance has become read-only while another one has become writable. Perhaps after read-only session support appears there will be another way to detect this case.
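The retry-on-ER_READONLY idea can be sketched as follows. Everything here is hypothetical (the Discovery interface, ReadOnlyException, executeWrite): on a read-only failure we re-run discovery and retry once if a different instance is now writable.

```java
import java.util.function.Function;

class ReadOnlyRetry {
    // Hypothetical view of the extended discovery contract:
    // yields the instance currently believed to accept writes.
    interface Discovery {
        String currentWritableInstance();
    }

    // Stand-in for the connector surfacing ER_READONLY.
    static class ReadOnlyException extends RuntimeException {}

    // Execute a write; on ER_READONLY, rediscover and retry once
    // if the writable instance has changed, otherwise propagate.
    static <R> R executeWrite(Discovery discovery, Function<String, R> write) {
        String instance = discovery.currentWritableInstance();
        try {
            return write.apply(instance);
        } catch (ReadOnlyException e) {
            String fresh = discovery.currentWritableInstance();
            if (!fresh.equals(instance)) {
                return write.apply(fresh);
            }
            throw e;
        }
    }
}
```

A single retry keeps the sketch simple; a production version would need to bound retries and consider the ordering guarantees discussed above.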

Open questions:

  • How do we differentiate application-level 'writes' and 'reads' when they are just calls, so as to send them to a particular instance? Like vshard's callro and callrw methods? BTW, JDBC provides separate methods for executing queries versus all other requests (DML / DDL).
  • How should application code raise an error like ER_READONLY when a 'write' call/eval request arrives on a read-only instance: just use box.error(box.error.READONLY), or allow registering a user-defined error in the connector (see also this issue re application-level errors)?
  • How should writes be balanced? Should they be balanced at all? Should they be sent as a batch when consistency matters? If so, should we implement a batch API for the native (non-JDBC) cluster client?
  • The same questions apply to the consistency of reads.
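The first open question, a vshard-style split of the call API into explicit read and write entry points, could look like the sketch below. The method names are borrowed from vshard; the class and its routing are purely illustrative, not the connector's actual API.

```java
import java.util.concurrent.CompletableFuture;

// Illustrative client with separate read/write entry points, so routing
// needs no inspection of the call itself: callro may go to a replica,
// callrw must go to the master.
class BalancedClient {
    private final String master;
    private final String replica;

    BalancedClient(String master, String replica) {
        this.master = master;
        this.replica = replica;
    }

    // Read call: routed to a read-capable instance (here, the replica).
    // The result just records where the call was routed, for illustration.
    CompletableFuture<String> callro(String function) {
        return CompletableFuture.completedFuture(replica + ":" + function);
    }

    // Write call: routed only to the write-capable (master) instance.
    CompletableFuture<String> callrw(String function) {
        return CompletableFuture.completedFuture(master + ":" + function);
    }
}
```

Making the read/write distinction explicit in the API sidesteps the need to classify arbitrary call/eval payloads on the client side.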

These are just scattered thoughts, written down here so they are not forgotten.
