Cannot SSH into DRBL server from a client machine #28

Open
4bluegen2s opened this issue Apr 3, 2023 · 3 comments

Comments

@4bluegen2s

I have an MPI cluster I do some work on at home.

My DRBL server is hosted as a VM on my NAS and is more powerful than my compute nodes. Ideally, I want it to join the cluster once it has served the clients their OS and file system.

I found that when I run
mpiexec -np 12 -hostfile /mirror/cluster_machines python /mirror/approx_pi.py
I get the following:
[proxy:0:0@cn1] HYDU_sock_connect (utils/sock/sock.c:145): unable to connect from "cn1" to "drbl" (connection refused)
[proxy:0:0@cn1] main (pm/pmiserv/pmip.c:183): unable to connect to server drbl at port 42597 (check for firewalls!)
[proxy:0:1@cn2] HYDU_sock_connect (utils/sock/sock.c:145): unable to connect from "cn2" to "drbl" (connection refused)
[proxy:0:1@cn2] main (pm/pmiserv/pmip.c:183): unable to connect to server drbl at port 42597 (check for firewalls!)

But when I run it on cn1 alone, using just cn1 and cn2 as workers, they talk to each other and run fine.
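A "connection refused" from cn1 and cn2 to drbl can be narrowed down by checking whether a plain TCP connection from a client to the server succeeds at all. The following is a minimal sketch, not part of the original report: `can_connect` is a hypothetical helper, and the port to probe is an assumption, since Hydra chose port 42597 dynamically for that one run.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to (host, port) succeeds."""
    try:
        # create_connection resolves the name and attempts a TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False
```

Run on cn1, a call such as `can_connect("drbl", 22)` would show whether the ssh port on the name "drbl" is reachable from that client. Probing 22 is only an assumed example; any port with a known listener on the server would do.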

I also noticed I can ssh into cn1. I get the prompt:
node@cn1:~$
and from there I can ssh into cn2, and get the correct prompt:
node@cn2:~$
but if I try to ssh into drbl from a client, the login appears to succeed, yet the prompt stays the same as that of the client I logged in from. Additionally, when I type "exit" to leave the ssh session, it says:
logout
Connection to drbl closed.
although it doesn't seem to have ever really logged in.
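One possible explanation for an unchanged prompt, not confirmed anywhere in this thread, is that the name "drbl" resolves on the client to a loopback address, so the ssh session lands on the client itself. A minimal sketch for checking this follows; `is_loopback` and `resolve` are hypothetical helper names introduced only for illustration.

```python
import ipaddress
import socket


def is_loopback(address: str) -> bool:
    """Return True if the IP address string is a loopback address (127.0.0.0/8 or ::1)."""
    # Drop an IPv6 zone suffix such as "%eth0" before parsing.
    return ipaddress.ip_address(address.split("%")[0]).is_loopback


def resolve(host: str) -> list:
    """Return the distinct IP addresses the local resolver gives for host."""
    infos = socket.getaddrinfo(host, None)
    # Each entry is (family, type, proto, canonname, sockaddr); sockaddr[0] is the IP.
    return sorted({info[4][0] for info in infos})
```

On cn1, `[(a, is_loopback(a)) for a in resolve("drbl")]` would list each address the client associates with the server name and flag any that point back at the client itself.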

Can someone help me with the firewall settings so I can add my DRBL server into my cluster?

@stevenshiau
Owner

You did not mention how you configured your DRBL server; it could be in Full DRBL mode, DRBL-SSI mode, etc.
In addition, is your issue that you cannot log in from cn1/cn2 to your DRBL server, or something else? I am actually confused.

Steven

@4bluegen2s
Author

It is set up in Full DRBL mode.
You are correct: the issue is when I try to log in from cn1/cn2 to the DRBL server.

@stevenshiau
Owner

I cannot reproduce this issue on my Debian Bullseye server. I can log in to its DRBL client via the console and then ssh into the DRBL server.
Make sure you have set your client to "remote-linux-gra" or "remote-linux-txt" when running "sudo dcs".
BTW, which GNU/Linux distribution did you configure for your DRBL server? It would be better if you could run "drbl-bug-report" and share the generated file.
In addition, if you can, please give the unstable DRBL release a try, i.e., drbl 5.2.9.

Steven
