
gzclient seems to connect to server incorrectly when running multiple servers on multiple master URI #3153

Closed
VeerachartZMP opened this issue Dec 28, 2021 · 5 comments

@VeerachartZMP

Symptoms

When starting multiple Gazebo instances with different GAZEBO_MASTER_URI values, I noticed that the client of the second instance usually shows the visuals from the first instance's gzserver.

My system

Ubuntu 20.04, Gazebo 11.9.1 (installed by deb package), Nvidia GTX 1070

How to check

I made the attached worlds (multi_server_problem.zip) with the same ray sensor, camera sensor, and a cube. The only difference between the worlds is the distance of the cube from the sensors: in server11345.world the closest face of the cube is 2 m from the sensor, and in server11346.world it is 4 m. The scripts set up GAZEBO_MASTER_URI and start each process; a sketch of what they roughly do follows the step list below.
Steps (run each in one terminal):

  1. ./run_11345.sh -- this starts a server with server11345.world at http://localhost:11345
  2. ./run_11346.sh -- this starts a server with server11346.world at http://localhost:11346
  3. ./echo_11345.sh -- this echoes the ray sensor's topic on http://localhost:11345, and the distance should print ~2 m.
  4. ./echo_11346.sh -- this echoes the ray sensor's topic on http://localhost:11346, and the distance should print ~4 m.
  5. ./client_11345.sh -- this starts a gzclient on http://localhost:11345 and should connect to server11345.world
  6. ./client_11346.sh -- this starts a gzclient on http://localhost:11346 and should connect to server11346.world

The worlds also save the camera images into images_11345 and images_11346, respectively (at 1 Hz), so don't forget to stop the servers to avoid saving too many images.
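
For reference, here is a minimal sketch of what the 11346 scripts are assumed to do (the attached zip has the exact contents; the ray sensor topic name depends on the world and model names, so it is left as a placeholder):

```sh
# Run in one terminal (sketch of run_11346.sh): bind this shell to the
# second Gazebo master, then start the server.
export GAZEBO_MASTER_URI=http://localhost:11346
gzserver --verbose server11346.world

# Run in another terminal (sketch of echo_11346.sh):
export GAZEBO_MASTER_URI=http://localhost:11346
gz topic -l                        # list topics on this master to find the ray sensor topic
# gz topic -e <ray_sensor_topic>   # echoing it should print ranges of ~4 m

# Run in a third terminal (sketch of client_11346.sh):
export GAZEBO_MASTER_URI=http://localhost:11346
gzclient --verbose
```

The 11345 scripts are assumed to be identical except for the port and world file.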

Outcomes

The echoes from steps 3 and 4 should be correct, and the images in the two folders should also differ, showing that the worlds are correct on the server side. However, both clients will show the visual of the box at 2.5 m (its closest face 2 m from the origin), while the pose text reads 2.5 m in the 11345 world and 4.5 m in the 11346 world. The visualization of the rays should be correct in both clients, meaning that in one client the rays go beyond the box. It also doesn't seem to matter whether server 11345 or 11346 is started first: the clients show the visuals from server 11345 (Gazebo's default URI).

Saved image from server11345.world (box's face 2 m from the camera)

Saved image from server11346.world (box's face 4 m from the camera)

Left: 11345 client, Right: 11346 client

Analysis

I have been using this method to run multiple servers simultaneously for a while, and my guess is that PR #3121 introduced this behavior (not 100% sure, but it seems probable since it changed how the server and client connect).

I took a quick look at that PR and noticed that it uses an Ignition Transport service for the world's scene info, with the service name fixed as /scene_info. As I understand it, Gazebo transport's topics are scoped (when topics have the ~/ prefix) and can't be seen across different masters even if the world names are the same. Ignition Transport's topics and services, on the other hand, are visible across masters. So the client may be getting its scene information from the wrong service. It's a bit of a surprise that advertising the same service name doesn't produce an error, though.
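
A quick way to see whether the service really is shared, assuming the ign command-line tools are installed, is to list Ignition Transport services from a shell while both servers are running; if only a single /scene_info shows up, both clients are asking the same endpoint for their scene:

```sh
# With both gzservers running, list the Ignition Transport services
# visible in the default partition.
ign service -l | grep scene_info

# If supported by your ign-transport version, inspect who advertises it:
# ign service -i -s /scene_info
```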

And if the connection is set up to allow Ignition Transport over the network, the services will be visible to other PCs on the network as well. If this is correct, the problem may also occur when more than one PC on the same Ignition network runs Gazebo at the same time.

My idea is to add the master URI as a scope for the service.

@shonigmann

+1 - I have also been frustrated by this issue recently and I appreciate knowing that I'm not alone! Would love a fix/workaround if there is one.

@VeerachartZMP
Author

I think putting Ignition services and topics into a namespace that includes the master URI and port number could help isolate each gazebo, gzserver, and gzclient instance (this would apply to all other Ignition services and messages too). I don't know if Ignition has an easy way to do this.

If the fix is just for the scene connection between server and client, adding the master URI and port number to the Scene message may help: when the client gets the reply messages, it would only use the one matching its URI and port number and ignore the rest.

@scpeters
Member

I think you may need to set the IGN_PARTITION environment variable to a unique value. See the "Partition and namespaces" section of the Ignition Transport tutorials for more information.

Let me know if that is enough to resolve this issue.
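
A minimal sketch of that workaround using the scripts from this issue (the partition names here are arbitrary; what matters is that each server/client pair shares one partition that no other pair uses):

```sh
# Terminal 1: first server (launch its gzclient the same way, with the same two variables)
export GAZEBO_MASTER_URI=http://localhost:11345
export IGN_PARTITION=gazebo_11345    # keeps Ignition Transport traffic separate per instance
gzserver --verbose server11345.world

# Terminal 2: second server -- note the different partition name
export GAZEBO_MASTER_URI=http://localhost:11346
export IGN_PARTITION=gazebo_11346
gzserver --verbose server11346.world
```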

@shonigmann

The original issue was a bit sporadic for our group, but we were able to successfully spin up 10 or so Gazebo instances simultaneously without any issues by setting IGN_PARTITION. I'm not sure of a more robust way to verify this is a catch-all fix, but we're satisfied with @scpeters' suggested fix so far. We'll update if we run into any issues in the future.
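
For anyone who finds this later, a rough sketch of launching several isolated instances this way (the port range, partition names, and world file are illustrative, not our exact setup):

```sh
#!/bin/bash
# Launch several gzserver instances, each with its own master URI
# and its own Ignition Transport partition.
for i in $(seq 0 9); do
  port=$((11345 + i))
  GAZEBO_MASTER_URI=http://localhost:${port} \
  IGN_PARTITION=gazebo_${port} \
  gzserver --verbose my_world.world &   # my_world.world is a placeholder
done
wait
```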

@scpeters
Member

scpeters commented Apr 1, 2022

ok, I'll close this for now, but please reopen if the issue persists

@scpeters scpeters closed this as completed Apr 1, 2022