added :localhost argument to :erl_epmd.names() #179

Status: Open, wants to merge 1 commit into main

AdSkipper1337

I use this library in one of my projects. Today I got a new dedicated server from the German IONOS cloud and tried to run an Elixir mix release of one of my applications directly on it (both the build machine and the server are Ubuntu 22.04 LTS, amd64).
It did not work. While debugging, I noticed that :erl_epmd.names was causing the bug.
The call hangs for ~30 seconds before returning {:error, :address}, which causes libcluster to crash the application. I tried installing the Erlang libraries on the server, but that did not fix anything.
However, when I added the :localhost atom as the argument to the :erl_epmd.names() call, the issue went away and everything now works as intended. This is very strange, because localhost is supposed to be the default value.
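For readers who want to try the difference themselves, here is a minimal sketch of the call with an explicit host. The module and function names (EpmdNamesSketch, local_names) are hypothetical and this is not libcluster's actual code; it only assumes the standard :erl_epmd.names/1 call from OTP.

```elixir
defmodule EpmdNamesSketch do
  # Hypothetical helper for illustration only; not part of libcluster.
  # Queries epmd on the given host (default :localhost) and returns the
  # registered node names as strings, or the error tuple unchanged.
  def local_names(host \\ :localhost) do
    case :erl_epmd.names(host) do
      {:ok, names} ->
        # Each entry is {name_as_charlist, port}.
        {:ok, for({name, _port} <- names, do: List.to_string(name))}

      {:error, reason} ->
        # {:error, :address} means epmd could not be reached at that address.
        {:error, reason}
    end
  end
end
```

Calling EpmdNamesSketch.local_names() in an iex session started with --sname should list that node if epmd is reachable on localhost. Handling the error tuple instead of pattern matching on {:ok, _} also avoids a crash when the lookup fails.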
I am not able to dig deeper into the issue than this surface-level analysis, as I did not find anything online and I am not an expert on low-level Erlang internals.
However, this is a working fix; I tried it in other environments and it works as expected. I want to stress that I have never had this issue on other servers, so perhaps it is caused by unusual server defaults, which are a thing with IONOS.
If you have any other questions about this issue, feel free to ask and I will help in any way I can. For my own deployment I have written a custom Cluster.Strategy, so it works either way for me, but I hope this fix helps someone else avoid a similar issue.

Cheers

@aedwardg

I just ran into this also (almost 3 years later).
I had a working Phoenix app that I've been starting locally with iex --sname nodeone -S mix phx.server for the past two years.
I just upgraded my OS to macOS Sequoia (M1 Pro chip) today and it suddenly stopped working.
I tried everything imaginable, but nothing worked until I added this. Not sure why upgrading macOS would make this necessary, though.

@petermueller

petermueller commented May 27, 2025

I've similarly run into this (unrelated, hey @aedwardg!), but in my case it was a combination of :inet.gethostname, macOS's frustrating DNS + VPN implementation, and Tailscale. I was getting the same error but without the 30s wait, since the name was resolving to a different IP that wasn't running epmd.

My root cause analysis of what brought me here is below, but I am also in favor of merging this, or at least making it configurable to allow an override. I would not have expected the cause of my error to be an attempt to cluster over the network with a co-worker's laptop that has the same name 😬


For future people:

So :erl_epmd.names/0 defaults to calling :inet.gethostname/0, which for me returned ~c"Mac" (macOS's default hostname, I think), and :inet.getaddr(~c"Mac", :inet) was returning a Tailscale IP address (not mine) for someone else's host named "mac" (Tailscale mapped mine to "mac-1"). Basically, because the LocalEpmd strategy doesn't pass :localhost, it relies on :erl_epmd's fallback of using whatever your local DNS resolution order is. Because of how Apple has implemented VPNs, forcing the VPN provider to update the DNS too, and because MagicDNS puts Tailscale's DNS ahead of "localdomain", LocalEpmd ends up attempting to cluster with someone on Tailscale who has the same local hostname 😅

I had previously changed my hostname, but due to weirdness in macOS, even though the OS GUI had reported my hostname as updated weeks ago (across multiple restarts), hostname still showed "Mac". I think Tailscale has since updated its macOS implementation to work around this as well.
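The resolution chain described above can be checked from iex with something like the following sketch. HostnameCheckSketch, report, and loopback? are hypothetical names introduced here for illustration; the only library calls assumed are :inet.gethostname/0, :inet.getaddr/2, and :inet.ntoa/1 from OTP.

```elixir
defmodule HostnameCheckSketch do
  # Hypothetical diagnostic helper; not part of libcluster or OTP.
  # Reports what the local hostname is and which IPv4 address it
  # resolves to, alongside the address for "localhost".
  def report do
    {:ok, hostname} = :inet.gethostname()

    %{
      hostname: List.to_string(hostname),
      resolved: resolve(hostname),
      localhost: resolve(~c"localhost")
    }
  end

  # True only for dotted IPv4 strings in the 127.0.0.0/8 range.
  def loopback?("127." <> _rest), do: true
  def loopback?(_other), do: false

  # Returns the IPv4 address as a string, or the error tuple unchanged.
  defp resolve(host) do
    case :inet.getaddr(host, :inet) do
      {:ok, ip} -> ip |> :inet.ntoa() |> List.to_string()
      {:error, reason} -> {:error, reason}
    end
  end
end
```

A non-loopback value under :resolved is not necessarily wrong, since it may simply be your own LAN or VPN address. An address you do not recognize as yours, though, is a hint that the hostname is resolving to another machine, as in the Tailscale case above.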

So if you're on macOS 15.x (Sequoia) and seeing errors about LocalEpmd failing with {:error, :address}, and you have a VPN, or lots of computers on your network, check that you don't have a name conflict.
