Impossible to add a second node to the Stolon cluster [SOLVED] #168
Comments
@afiskon I need the logs of the leader sentinel (the one on archlinux1). For now I can only guess that the leader sentinel cannot talk to the keeper on the second node (by default on port 5431). Firewalled?
@sgotti OK, it was my mistake, not a bug. I forgot to specify one of the options. I would like to ask you a few more questions:
Unfortunately the documentation currently doesn't answer these questions. Perhaps you could create a FAQ.md file or something like that.
@afiskon Glad it worked!
Currently the proxy redirects all requests to the master. Its primary purpose is to avoid connections to a partitioned master, and it does this by just passing the data through (it's a layer-4 proxy; no content inspection is done). #132 is a feature request for using the proxy for standbys as well, but it's low on the priority list.
I'm not sure if I got your question right. Currently you can only have async or sync standbys per cluster and not a mix of them.
It depends on your architecture and where the different stolon clusters are located. A good choice is to connect to a consul kv store placed in the same "location" as your stolon cluster. So if you have multiple stolon clusters in the same "location" you can use only one store (the paths are based on the stolon cluster name, so they won't conflict). Obviously, if your store goes down all the clusters will be affected.
Yes, we are lacking some docs (there are also some upcoming changes that will change a lot of things to add additional features, so the current docs will be reworked a bit).
@sgotti thanks a lot for your reply!
It's not a big problem in practice since [ see https://github.com//issues/132#issuecomment-255372997 ]. Which leads me to one more little question: by any chance, does Stolon use Consul as a DNS server as well? Unfortunately I didn't find a way to list all DNS records stored in Consul to figure it out by myself.
OK, basically sync mode means:
According to PostgreSQL documentation ( https://www.postgresql.org/docs/9.5/static/runtime-config-replication.html ) :
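For reference, the sync mode discussed here revolves around two postgresql.conf settings described in those docs (values below are illustrative):

```
# Commits wait for the named standby (matched by its application_name)
# to confirm it has received the WAL.
synchronous_standby_names = 'stolon_standby'
synchronous_commit = on
```

With these set, a commit does not return success to the client until the synchronous standby has acknowledged the WAL, which is what makes failover to that standby lossless.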
Let's say postgres1 dies, postgres2 is a synchronous standby, and postgres3 is a bit out of sync (doesn't have a few of the latest WAL records). Does Stolon guarantee that postgres2 will be the new master, or is there a 50% chance that it will be postgres2 and a 50% chance that it will be postgres3?
Well, I think you could just copy-paste this discussion into a FAQ section :) IMO these are very important details everyone should be aware of. Or I'll just send a corresponding PR a bit later.
Currently stolon doesn't need to resolve any DNS names for internal communication (sentinel -> keeper, stolonctl -> sentinel), but the communication style will probably change. Resolving names for connecting to a store (like etcd) is delegated to the client and so to the default Go resolver (it depends on whether it is compiled with or without cgo). Or did you mean registering a service in consul? In that case, the store (etcd/consul) is only used as a k/v store.
This is automatically handled by stolon if you set synchronous replication https://github.com/sorintlab/stolon/blob/master/doc/syncrepl.md
Currently it tries to find the best standby: the one with the xlog location nearest to the master's latest known xlog location. If a master is down there's no way to know its latest xlog position (stolon gets and saves it at some interval), so there's no way to guarantee that the standby is not behind, only that the best of the available standbys will be chosen. An option that could be added in the future (I haven't had time) is to specify a maximum lag. But I think it's quite impossible to guarantee no data loss in some situations.
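The selection rule just described (pick the standby with the highest known xlog position, optionally rejecting any standby whose lag exceeds a maximum) can be sketched as follows. The types, field names, and the maxLag option are illustrative, not stolon's actual code:

```go
package main

import "fmt"

// Keeper holds the last reported xlog (WAL) position of a standby.
// Illustrative type; stolon's real state model differs.
type Keeper struct {
	ID      string
	XLogPos uint64 // monotonically increasing WAL location
}

// bestStandby returns the standby with the highest reported xlog
// position, skipping any whose lag behind the master's last known
// position exceeds maxLag (the hypothetical "maximum lag" option
// mentioned above).
func bestStandby(standbys []Keeper, masterPos, maxLag uint64) (Keeper, bool) {
	var best Keeper
	found := false
	for _, s := range standbys {
		// Skip standbys that are too far behind; a standby reported
		// ahead of the stale master position is always acceptable.
		if s.XLogPos < masterPos && masterPos-s.XLogPos > maxLag {
			continue
		}
		if !found || s.XLogPos > best.XLogPos {
			best = s
			found = true
		}
	}
	return best, found
}

func main() {
	standbys := []Keeper{
		{"postgres2", 1000}, // in sync with the last known master position
		{"postgres3", 900},  // missing some recent WAL
	}
	b, ok := bestStandby(standbys, 1000, 50)
	fmt.Println(b.ID, ok) // postgres2 true
}
```

Note the caveat from the comment above: masterPos is only the last position the sentinel managed to save, so even this rule cannot prove the chosen standby holds every committed transaction.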
It's impossible in PostgreSQL <= 9.5, but in 9.6 replication to a quorum was added (see the 9.6 docs for synchronous_standby_names). It guarantees that in case of a netsplit there is at least one synchronous standby among the nodes in the majority. (Naturally, nothing will help when the cluster splits into three equal parts, i.e. a double netsplit. To handle this you need a so-called AP solution, like Riak, so it's not our case.) It would be very nice to have quorum synchronous replication support in Stolon. The lack of it basically means that sometimes Stolon can lose a few recent changes. Thanks again for your insightful answers!
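For the record, the 9.6 feature referred to here is the multiple-synchronous-standbys syntax: commits wait for the first N standbys in a priority list (the explicit ANY quorum form arrived later, in PostgreSQL 10). Standby names below are illustrative:

```
# PostgreSQL 9.6: wait for acknowledgement from the first 2 of these
# 3 standbys, in priority order, matched by application_name.
synchronous_standby_names = '2 (postgres2, postgres3, postgres4)'
```

With two acknowledging standbys out of three, any majority partition after a single netsplit is guaranteed to contain at least one fully synchronous standby.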
Stolon version: 0.3.0 (same issue on master branch) - built from source
Consul version: 0.7.0 (default Arch Linux package)
PostgreSQL version: 9.5.4 (default Arch Linux package)
Go version: 1.7.1 (default Arch Linux package)
I've created 3 virtual machines using Virtual Box. VMs are in 10.0.3.0/24 network - 10.0.3.7 (archlinux1), 10.0.3.8 (archlinux2) and 10.0.3.9 (the last one is not used below).
Steps to reproduce.
On archlinux1:
Single node configuration works as expected.
On archlinux2:
sentinel output:
keeper output:
keeper --debug output: http://afiskon.ru/s/21/a5c6ef5ab5_keeper.txt
sentinel --debug output: http://afiskon.ru/s/96/5affa784dc_sentinel.txt
Any advice?