Swarm Mode #28
Although broadly on the right path, I'm afraid I have various concerns about this approach: it carries security risks, particularly with the use of a single network rather than two; it won't detect cluster failure, which is particularly important when operating a multi-server Docker Swarm; it uses the global strategy, which would lead to multiple conflicting containers being started and corrupting data on a multi-server Swarm; much of the approach can be simplified by use of `docker stack deploy`. I realise that the problem you were having with healthchecks was almost certainly because of how Swarm routes traffic: for healthchecked services, containers are only routable through the overlay network once they have a passing healthcheck, meaning that the cluster can never stabilise and bootstrap properly (in contrast to Docker Compose, which by default operates differently).
I've put together an example Docker Swarm config which should solve all these problems. Note, however, that it has certain caveats, including: the order of the subnets allocated to the 2 networks is critical (`db_a` before `db_b`); the node executed on must be a manager; all containers are assumed to live on a single host (since there is no `docker service exec`); a second deployment is needed in order to restore the healthchecks after the initialisation (although this could be worked around); the node is expected to be tagged with `grp=dbxl` (e.g. `docker node update --label-add grp=dbxl`). As I say, it's not possible to give a single deployment which works for everyone (not least because there's no requirement to run 2 Coordinators and 2 Datanodes; you could easily run 3 Coordinators and 8 Datanodes, using these images, or something else entirely). However, it should be enough to get you started (since you can simply adapt the new `bin/init-eg-swarm` script and `docker-compose.swarm.yml`), and should overcome the various risks above. I was successful just now bootstrapping a Postgres-XL cluster on Swarm using Docker 18.09.7, including with healthchecks, and the two networks, with secure defaults.
Please see 9f5f61c, and let me know how you get on.
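As a rough illustration of two of the caveats above (running on a manager, and tagging the node with `grp=dbxl`), a preflight step might look like the sketch below. The function names are hypothetical and are not taken from `bin/init-eg-swarm`; `DOCKER_CMD` simply mirrors the variable used in the PR's script.

```shell
# Sketch only: preflight checks before deploying the example stack.
# Function names are hypothetical; DOCKER_CMD defaults to the real CLI.
DOCKER_CMD=${DOCKER_CMD:-docker}

# Succeeds only if the local Docker Engine is a Swarm manager.
is_swarm_manager() {
  [ "$(${DOCKER_CMD} info --format '{{ .Swarm.ControlAvailable }}')" = "true" ]
}

# Tags a node with grp=dbxl so that placement constraints can match it.
label_dbxl_node() {
  local node="$1"
  ${DOCKER_CMD} node update --label-add grp=dbxl "$node"
}

# Refuses to continue on a worker node, then labels the given node.
preflight() {
  local node="$1"
  if ! is_swarm_manager; then
    echo "error: run this on a Swarm manager node" >&2
    return 1
  fi
  label_dbxl_node "$node"
}
```

Calling `preflight <node-name>` on a manager would label that node; on a worker it stops with an error before changing anything.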
CONTAINER_ID=$(${DOCKER_CMD} inspect --format '{{ .Status.ContainerStatus.ContainerID }}' $TASK_ID)
TASK_NAME=swarm_exec_${RANDOM}

TASK_ID=$(${DOCKER_CMD} service create \
None of this is necessary in recent Docker versions, since you can use Docker Stack, e.g. `docker stack deploy --compose-file tmp.yml STACK_NAME`.
OK, great, thank you. I will try it now; I didn't realise there was a better way. I found this code in this thread: https://www.reddit.com/r/docker/comments/a5kbte/run_docker_exec_over_a_docker_swarm/
The method in the Reddit link requires that the Docker Engine is accessible remotely via a port. This isn't the default, and indeed is rather dangerous, unless secured very carefully. The default is to bind to a socket; hence, I'm pretty sure this wouldn't work.
--detach \
--name=${TASK_NAME} \
--restart-condition=none \
--mount type=bind,source=/var/run/docker.sock,target=/var/run/docker.sock \
I don't understand this. Why are you binding the Docker socket? Do you not have Docker installed and accessible in the host? If so, no socket-binding should be necessary.
It creates a temporary container on a foreign host. This is the workaround I have used before to exec into a container on a foreign Swarm node, as I wasn't aware there was a better way.
Aha, interesting… I'll try to think about this more carefully at some point. I'll admit I haven't really had much use for execution of temporary commands on remote Swarm nodes, so far; usually, I just control everything from manager nodes, and make simple services or even images for anything else. Given your explanation, it might well be that this method is fine. Certainly, it would relax some of the caveats in my own script, at the expense of loss of immediate status feedback, and indeed of guaranteeing the commands are even executed. I'll admit, when I prepare Postgres-XL on a Swarm, I don't use this method at all; I simply paste in the SQL clustering commands manually, after checking the `pg_hba.conf` files on the Coordinators and Datanodes. This round of work we've been doing is nice, though, in supplying automated setup examples, so I am pleased you asked. :)
Another method, of course, would be to override the bootstrapping entrypoints with your own setup code. If you placed your files to be executed in Docker configs, then they would get auto-distributed to the nodes, even for remote worker nodes. I feel this is a little out-of-scope for an example, though (although it likely wouldn't be too hard).
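A minimal sketch of the Docker configs idea, assuming a setup script already exists locally. The function name, config name, and target path are all hypothetical, and this is not how the repository's own scripts work; it only shows the two CLI steps involved (`docker config create`, then `docker service update --config-add`).

```shell
# Sketch only: distribute a setup script to a service via Docker configs.
DOCKER_CMD=${DOCKER_CMD:-docker}

# Stores a local file as a Swarm config and attaches it to a service.
# Swarm then delivers the file to whichever node runs the service's task.
# Arguments: service name, config name, local file, target path in container.
attach_setup_config() {
  local service="$1" config_name="$2" local_file="$3" target="$4"
  ${DOCKER_CMD} config create "$config_name" "$local_file" &&
    ${DOCKER_CMD} service update \
      --config-add "source=${config_name},target=${target},mode=0555" \
      "$service"
}
```

The entrypoint override itself would still need to be set on the service (or in the compose file) so that the delivered script is actually executed.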
done

${DOCKER_CMD} service logs --raw ${TASK_ID}
${DOCKER_CMD} service rm ${TASK_ID} > /dev/null
It's not actually necessary to destroy the service, but admittedly it depends on the approach.
- db_gtm_1:/var/lib/postgresql
networks:
- db
# healthcheck:
I very much recommend running with healthchecks; Postgres-XL doesn't always detect unhealthy clusters very well (at least, this used to be the case a year or so ago), and it's possible for a cluster to seem up and healthy, but to fail. The recent healthchecks work I did detects and handles this automatically, restarting nodes within a Postgres-XL cluster until it becomes stable.
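To see what a healthcheck is reporting while a cluster settles, one option is to poll the container's health status. The sketch below is an assumption-laden helper (the name `wait_healthy` is hypothetical) and must run on the node hosting the container; the container ID can be found from the task, as the PR's script does with `.Status.ContainerStatus.ContainerID`.

```shell
# Sketch only: wait for a container's healthcheck to report "healthy".
DOCKER_CMD=${DOCKER_CMD:-docker}

# Arguments: container ID or name, maximum number of checks (default 30),
# seconds to sleep between checks (default 2).
wait_healthy() {
  local container="$1" attempts="${2:-30}" delay="${3:-2}" status="" i=0
  while [ "$i" -lt "$attempts" ]; do
    status=$(${DOCKER_CMD} inspect --format '{{ .State.Health.Status }}' "$container")
    if [ "$status" = "healthy" ]; then
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  echo "error: $container still '$status' after $attempts checks" >&2
  return 1
}
```

This only observes the status; any restarting of unhealthy nodes is left to the image's own healthcheck handling and to Swarm.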
# healthcheck:
# test: ["CMD", "docker-healthcheck-gtm"]
deploy:
mode: global
This doesn't seem right, to me. If you have a multi-node Swarm cluster, this will cause multiple deployments of the services, and since they require the data directory and can only run one copy of the service at once, it will almost certainly cause data corruption and a very broken cluster.
You are correct. I misunderstood global mode. I was only trying to limit replicas to be safe, but I realise now that `--replicas=1` would do what I was expecting global to do. I have only tested it on a single-node swarm so far, but will test on a multi-node one as soon as it's working properly on one.
`--replicas=1` would indeed do what you expect; however, it's the default, so in fact you don't need it. You don't actually need the constraints section at all; I just include it because I'm presuming you're running non-trivial clusters (i.e. greater than 1-node). However, if I were actually to use this, I'd likely change the constraints to constrain to `dbxl_coord_1` etc., so the containers would only ever be launched on nodes containing the data volume. Usually, this would constrain each container to be on a specific node, although of course if you had shared storage, it could also allow for safe failover of a specific Coordinator or Datanode whilst still assuming the 1-replica. Again, I think this is likely a bit out-of-scope for this, especially as it would require you to have a backend shared storage solution configured separately. Perfectly possible, though.
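One way such a per-node constraint could be expressed from the CLI is sketched below. It assumes `dbxl_coord_1` is used as a node label key with the value `true`, which is an illustrative choice rather than anything defined in the repository, and the function name is hypothetical.

```shell
# Sketch only: pin one service to the node that holds its data volume.
DOCKER_CMD=${DOCKER_CMD:-docker}

# Labels the node, then adds a matching placement constraint to the service.
# Arguments: node name, service name, label key (e.g. dbxl_coord_1).
pin_service_to_node() {
  local node="$1" service="$2" label="$3"
  ${DOCKER_CMD} node update --label-add "${label}=true" "$node" &&
    ${DOCKER_CMD} service update \
      --constraint-add "node.labels.${label} == true" \
      "$service"
}
```

With shared storage, the same label could be added to several nodes so that the single replica may fail over between them.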
- PG_HOST=0.0.0.0
- PG_NODE=data_2
- PG_PORT=5432
image: pavouk0/postgres-xl:latest
Remember `latest` is for testing only; production should pin a specific tag.
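A small guard like the following could catch an unpinned image before a production deploy. It is only a sketch: the function name is hypothetical, and the tag used in the usage line is a placeholder, not a real published tag.

```shell
# Sketch only: reject image references that are untagged or use :latest.
# Returns 0 for an explicit non-latest tag or a digest, 1 otherwise.
is_pinned_image() {
  local ref="$1" name
  case "$ref" in
    *@sha256:*) return 0 ;;   # pinned by digest
  esac
  name=${ref##*/}             # drop registry/namespace, which may hold host:port
  case "$name" in
    *:latest) return 1 ;;     # explicitly floating
    *:?*)     return 0 ;;     # some explicit tag
    *)        return 1 ;;     # no tag, so Docker would resolve it to latest
  esac
}
```

For example, `is_pinned_image pavouk0/postgres-xl:example-tag || exit 1` at the top of a deploy script would stop the run if the tag were missing or `latest`.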
Ok thanks will do.
image: pavouk0/postgres-xl:latest
command: docker-cmd-data
entrypoint: docker-entrypoint-data
depends_on:
Note that if you used Stack deploy, this would be ignored.
db_data_1: {}
db_data_2: {}
networks:
db:
This creates a huge security risk. Since the Postgres-XL cluster doesn't support authentication for inter-node communications, it needs to use the `trust` method in `pg_hba.conf`. But using the same network for both cluster communication and access into the cluster from outside means that any roles and passwords you set will be silently ignored, and anything will be allowed direct access from anywhere. Not only that, but this creates a cluster-corruption risk, since non-Postgres-XL services could talk directly to the Datanodes, rather than being forced to go through the Coordinators. This in turn means that Postgres-XL would not maintain its metadata, and the cluster would be highly likely to become corrupted.
OK I didn't realise this. That is a big problem. Thanks for the clarification.
networks:
db:
driver: overlay
attachable: true
No need for `attachable`, unless you're deploying as a standalone cluster that other Swarm services connect to. If attachable is desired, make sure it's for the Coordinator network only (`db_b`), not the inter-cluster network (`db_a`). These networks are named in this order deliberately: because of how Docker Swarm sets up default routes within the container, any other order will result in `db_b` being the default for Coordinators, leading to an authentication failure since they're not routing through the `trust` backend network. This can also happen even with Stack deploy, in some cases; safest is in fact to create the networks manually, or to check the networks to ensure that `db_a` has a lower IP subnet than `db_b` (given how the default healthchecks work, which rely on this, so as no longer to require variables passing in the network values).
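Creating the networks manually with an ordering check could look something like this sketch. The helper names are hypothetical, the subnets passed in are up to the operator, and a compose file deployed afterwards would presumably need to reference the two networks as external rather than defining them itself.

```shell
# Sketch only: create db_a and db_b so that db_a gets the lower subnet.
DOCKER_CMD=${DOCKER_CMD:-docker}

# Converts the network address of an IPv4 CIDR (e.g. 10.0.1.0/24) to an integer.
subnet_to_int() {
  local addr="${1%/*}" old_ifs="$IFS"
  IFS=.
  set -- $addr            # split the dotted quad into $1..$4
  IFS="$old_ifs"
  echo $(( $1 * 16777216 + $2 * 65536 + $3 * 256 + $4 ))
}

# Refuses to create anything unless db_a's subnet sorts below db_b's.
# Arguments: subnet for db_a, subnet for db_b (both in CIDR form).
create_dbxl_networks() {
  local subnet_a="$1" subnet_b="$2"
  if [ "$(subnet_to_int "$subnet_a")" -ge "$(subnet_to_int "$subnet_b")" ]; then
    echo "error: db_a subnet must be lower than db_b subnet" >&2
    return 1
  fi
  ${DOCKER_CMD} network create --driver overlay --subnet "$subnet_a" db_a &&
    ${DOCKER_CMD} network create --driver overlay --subnet "$subnet_b" db_b
}
```

For instance, `create_dbxl_networks 10.0.1.0/24 10.0.2.0/24` would proceed, whereas swapping the two arguments would be rejected before any network is created.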
Yes, you are correct about this. I was just using attachable as I have another container that I was running psql from to test, but I definitely wouldn't run it like this in production.
`attachable` is fine, if this is your use-case. But it's not necessary if you have services launched in the same stack as your Postgres-XL cluster. If you'd rather configure separately, though, or are setting up a single Postgres-XL cluster for multiple services, then `attachable` is fine, or perhaps even publishing ports, if you're sure you're behind a properly-restricted firewall (or perhaps have multiple NICs, even). In the latter case, however, be very careful to only expose the Coordinator-only (`db_b`) overlay network; otherwise, you'll grant public access without any authentication, as noted above for `pg_hba.conf` `trust` vs `md5` (if you're doing this, I highly recommend you test it carefully, too, before going live, in case there's a bug in this program somewhere; no warranty, etc. etc. etc. :) ).
Closing as part of #27.
Hi,
So I've been running some tests on creating it using a stack, like we discussed in #27. I have managed to get it working, and when I manually exec into the containers they are working. However, the healthchecks do not seem to work. I will have a look again, but so that you can test it at some point if possible, here is how to run it.
Run
docker stack deploy --compose-file docker-compose.image.swarm.yml postgresxl
then run
init-eg-swarm postgresxl
I was having issues with two overlay networks, so I am just using one for the time being.