Services published using python-zeroconf or systemd-resolved are not resolved #182

Closed
hrzlgnm opened this issue Mar 17, 2024 · 17 comments · Fixed by #185

Comments

hrzlgnm (Contributor) commented Mar 17, 2024

Example python code to publish a service:

from zeroconf import Zeroconf, ServiceInfo
from socket import gethostname

zeroconf = Zeroconf()
service_type = "_workstation._tcp.local."

zeroconf.register_service(
    ServiceInfo(
        service_type,
        f"worky-station.{service_type}",
        4848,
        server=f"{gethostname()}.local",
    )
)

try:
    input("Press enter to exit...\n\n")
finally:
    zeroconf.close()

Running avahi-browse -tpr _workstation._tcp on the same Linux machine yields resolved results

+;eno33554984;IPv4;worky-station;Workstation;local
+;eno16777736;IPv4;worky-station;Workstation;local
+;lo;IPv4;worky-station;Workstation;local
=;eno33554984;IPv4;worky-station;Workstation;local;void-vm.local;192.168.178.76;4848;
=;eno16777736;IPv4;worky-station;Workstation;local;void-vm.local;192.168.73.130;4848;
=;lo;IPv4;worky-station;Workstation;local;void-vm.local;127.0.0.1;4848;

Example program I used to test resolving with mdns-sd:

use mdns_sd::{ServiceDaemon, ServiceEvent};

fn main() {
    let mdns = ServiceDaemon::new().expect("Failed to create daemon");
    // Keep the service type in a variable so it can be reused for stop_browse.
    let service_type = "_workstation._tcp.local.";
    let receiver = mdns.browse(service_type).expect("Failed to browse");
    let mut search_done = false;
    while let Ok(event) = receiver.recv() {
        match event {
            ServiceEvent::ServiceResolved(info) => {
                println!(
                    "Resolved a new service: {} host: {} port: {} IP: {:?} TXT properties: {:?}",
                    info.get_fullname(),
                    info.get_hostname(),
                    info.get_port(),
                    info.get_addresses(),
                    info.get_properties(),
                );
            }
            ServiceEvent::SearchStarted(_service) => {
                // Stop browsing once a second SearchStarted event arrives.
                if search_done {
                    mdns.stop_browse(service_type).expect("Failed to stop browsing");
                }
                search_done = true;
            }
            ServiceEvent::SearchStopped(_service) => {
                break;
            }
            _ => {}
        }
    }
    mdns.shutdown().unwrap();
}

I also noticed that when I publish with avahi-publish-service worky-station _workstation._tcp. 4848, the service resolves successfully with the Rust example program above.

keepsimple1 (Owner) commented

I tried it locally. It seems that Python zeroconf did not send or respond with address records (TYPE_A or TYPE_AAAA). I haven't had a chance to find out whether or how avahi-browse uses other means to resolve the address for the host name.

The example query program in mdns-sd shows that it found the instance but couldn't resolve it fully (due to the missing address records):

$ cargo run --example query _workstation._tcp
   Compiling mdns-sd v0.10.4 (/Users/hanxu/work/mdns-sd)
    Finished dev [unoptimized + debuginfo] target(s) in 3.29s
     Running `target/debug/examples/query _workstation._tcp`
At 191.994µs : SearchStarted("_workstation._tcp.local. on addrs [192.168.0.108, fe80::1, fe80::f884:fdff:fe05:b1ff, fe80::e20:39ca:3827:464, fe80::f071:231b:9e7:14fa, fe80::8646:3c7b:acfb:2d5c, fe80::10c7:e8af:585a:448f, fe80::8949:ca7e:9b05:10c4, fe80::ce81:b1c:bd2c:69e]")
At 110.160432ms : ServiceFound("_workstation._tcp.local.", "worky-station._workstation._tcp.local.")
<snip>

keepsimple1 (Owner) commented Mar 18, 2024

I opened PR #183 with some debugging code and found that Python zeroconf actually includes an NSEC record that shows the lack of IPv4 and IPv6 addresses.
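
For readers unfamiliar with NSEC records, here is a minimal sketch of how the type bit map in an NSEC record (the format defined in RFC 4034, section 4.1.2) can be checked for the A and AAAA types. The function name `nsec_has_type` is hypothetical and this is not the debugging code from the PR; it only illustrates how the "lack of addresses" can be read from the record.

```rust
/// DNS record type codes for IPv4 and IPv6 address records.
const TYPE_A: u16 = 1;
const TYPE_AAAA: u16 = 28;

/// Returns true if `rr_type` is marked as present in an NSEC type bit map.
/// Hypothetical helper for illustration; not part of mdns-sd.
fn nsec_has_type(bitmap: &[u8], rr_type: u16) -> bool {
    let wanted_window = (rr_type >> 8) as u8;
    let byte_index = ((rr_type & 0xff) / 8) as usize;
    // Bits are numbered from the most significant bit of each byte.
    let mask = 0x80u8 >> (rr_type % 8);

    let mut rest = bitmap;
    // Each block is: window number, bitmap length (1..=32), bitmap bytes.
    while rest.len() >= 2 {
        let (window, len) = (rest[0], rest[1] as usize);
        if len == 0 || len > 32 || rest.len() < 2 + len {
            return false; // malformed block
        }
        let bits = &rest[2..2 + len];
        if window == wanted_window {
            return byte_index < len && bits[byte_index] & mask != 0;
        }
        rest = &rest[2 + len..];
    }
    false
}

/// True when the bit map lists neither A nor AAAA.
fn nsec_shows_no_addresses(bitmap: &[u8]) -> bool {
    !nsec_has_type(bitmap, TYPE_A) && !nsec_has_type(bitmap, TYPE_AAAA)
}
```

With a bit map that only lists TXT and SRV (bytes `00 05 00 00 80 00 40`), `nsec_shows_no_addresses` returns true, which corresponds to the situation described in this comment.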

hrzlgnm (Contributor, Author) commented Mar 18, 2024

My guess is, avahi is using the associated server name from the SRV record to resolve addresses when A and AAAA records are not present.

hrzlgnm (Contributor, Author) commented Mar 18, 2024

Out of interest, I grepped through the Avahi code, and it seems that the name in the TYPE_SRV record is used to resolve the TYPE_A or TYPE_AAAA address. See https://github.com/avahi/avahi/blob/v0.8/avahi-core/resolve-service.c#L221 and following.
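
To make the idea concrete, here is a rough sketch of what a follow-up address query for the SRV target host looks like on the wire, using the standard DNS message layout from RFC 1035. The function names `encode_name` and `build_address_query` are hypothetical; this is neither Avahi's nor mdns-sd's code, and it only builds the packet bytes without sending them.

```rust
/// DNS record type and class codes.
const TYPE_A: u16 = 1;
const TYPE_AAAA: u16 = 28;
const CLASS_IN: u16 = 1;

/// Encode a host name such as "void-vm.local." as length-prefixed DNS labels.
fn encode_name(name: &str) -> Option<Vec<u8>> {
    let mut out = Vec::new();
    for label in name.trim_end_matches('.').split('.') {
        // Each label must be non-empty and at most 63 bytes long.
        if label.is_empty() || label.len() > 63 {
            return None;
        }
        out.push(label.len() as u8);
        out.extend_from_slice(label.as_bytes());
    }
    out.push(0); // the root label terminates the name
    Some(out)
}

/// Build one query message asking for both A and AAAA records of the
/// host named in an SRV record (`srv_target`).
fn build_address_query(srv_target: &str) -> Option<Vec<u8>> {
    let name = encode_name(srv_target)?;
    let mut msg = Vec::new();
    msg.extend_from_slice(&0u16.to_be_bytes()); // ID: zero for multicast queries
    msg.extend_from_slice(&0u16.to_be_bytes()); // flags: standard query
    msg.extend_from_slice(&2u16.to_be_bytes()); // QDCOUNT: two questions
    msg.extend_from_slice(&[0u8; 6]); // ANCOUNT, NSCOUNT, ARCOUNT are all zero
    for qtype in [TYPE_A, TYPE_AAAA] {
        msg.extend_from_slice(&name);
        msg.extend_from_slice(&qtype.to_be_bytes());
        msg.extend_from_slice(&CLASS_IN.to_be_bytes());
    }
    Some(msg)
}
```

Calling `build_address_query("void-vm.local.")` yields the bytes that a resolver would send to the mDNS multicast group to ask for the host's addresses.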

keepsimple1 (Owner) commented Mar 18, 2024

Yes, I suspected the same. I updated PR #183 to use regular lookups (std::net) to resolve the address when we detect no address and the NSEC record shows that the instance explicitly says it has no addresses.

In my testing, the PR's patch is able to resolve your original Python zeroconf instance.

Running cargo run --example query _workstation._tcp gives:

At 238.854773ms: Resolved a new service: worky-station._workstation._tcp.local. host: MBP-9.local. port: 4848 IP: {127.0.0.1, fe80::10c7:e8af:585a:448f, fe80::1, ::1, 192.168.0.108} TXT properties: TxtProperties { properties: [] }

P.S. There is one potential issue with my current patch: to_socket_addrs is a blocking call, so it causes delays when the hostname lookup fails. I'm looking for ways to optimize this. Let me know whether the current patch works for you. Thanks.
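
One possible way to bound the delay, sketched here under the assumption that a helper thread per lookup is acceptable, is to run the blocking call on its own thread and wait for the result with a timeout. The helper `lookup_with_timeout` is hypothetical and not part of the patch or of mdns-sd; it uses only the standard library.

```rust
use std::net::{IpAddr, ToSocketAddrs};
use std::sync::mpsc;
use std::thread;
use std::time::Duration;

/// Resolve `hostname` with the system resolver, giving up after `timeout`.
/// Hypothetical helper for illustration only.
fn lookup_with_timeout(hostname: &str, port: u16, timeout: Duration) -> Option<Vec<IpAddr>> {
    let (tx, rx) = mpsc::channel();
    // to_socket_addrs expects a "host:port" string.
    let target = format!("{}:{}", hostname, port);

    // The blocking call runs on a helper thread so the caller can stop waiting.
    thread::spawn(move || {
        let result = target
            .to_socket_addrs()
            .map(|addrs| addrs.map(|a| a.ip()).collect::<Vec<IpAddr>>());
        // The receiver may already have timed out; ignore a failed send.
        let _ = tx.send(result);
    });

    match rx.recv_timeout(timeout) {
        Ok(Ok(addrs)) if !addrs.is_empty() => Some(addrs),
        _ => None, // lookup failed, returned nothing, or timed out
    }
}
```

A call such as `lookup_with_timeout("void-vm.local.", 4848, Duration::from_millis(500))` returns `None` once the timeout passes. Note that the helper thread itself keeps running until the system resolver returns, so this limits the caller's wait but not the work done in the background.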

hrzlgnm (Contributor, Author) commented Mar 19, 2024

Thanks for looking into this. I tried out the debug-resolve branch; for me it works only on Windows. Unfortunately, it does not work on Linux. I guess we also need to send those queries via mDNS.

PS: I don't have a Mac.

hrzlgnm (Contributor, Author) commented Mar 19, 2024

I can provide network traces of avahi resolving this, if you like.

hrzlgnm (Contributor, Author) commented Mar 19, 2024

avahi-resolve.zip
Here is a network trace of the case where Avahi resolves the service running on Linux.

hrzlgnm (Contributor, Author) commented Mar 19, 2024

I've created a minor PR against your #183 branch that makes it work on Linux; a test on Windows will follow soon (tm).

hrzlgnm (Contributor, Author) commented Mar 19, 2024

When I run this on Linux, with the Python program on the same host, I get:

At 120.686µs : SearchStarted("_workstation._tcp.local. on addrs [fe80::f141:2ded:3ab3:9970, 192.168.122.79]")
At 112.859681ms : ServiceFound("_workstation._tcp.local.", "worky-station._workstation._tcp.local.")
At 113.153913ms: Resolved a new service: worky-station._workstation._tcp.local. host: void-vm.local. port: 4848 IP: {192.168.122.79} TXT properties: TxtProperties { properties: [] }
At 113.167118ms: Resolved a new service: worky-station._workstation._tcp.local. host: void-vm.local. port: 4848 IP: {fe80::f141:2ded:3ab3:9970, 192.168.122.79} TXT properties: TxtProperties { properties: [] }

hrzlgnm (Contributor, Author) commented Mar 19, 2024

I also got results on Windows while the program was running in a Linux VM:

At 186µs : SearchStarted("_workstation._tcp.local. on addrs [192.168.178.25, 192.168.49.1, fe80::2b19:d118:3bdc:d9e8, 2003:e8:bf3e:3f00:e073:5eb2:2493:93a, fe80::e5b8:2dca:ee23:5df4, 2003:e8:bf3e:3f00:17e2:ac9b:ef52:7729, fe80::f900:996d:50d1:8349, 192.168.73.1]")
At 1.4355ms : ServiceFound("_workstation._tcp.local.", "homeassistant [07ec3e0c8c864037bbd53d1ef63a9d3c]._workstation._tcp.local.")
At 66.9015ms : ServiceFound("_workstation._tcp.local.", "vm-worky-station._workstation._tcp.local.")
At 68.5139ms: Resolved a new service: vm-worky-station._workstation._tcp.local. host: void-vm.local. port: 4848 IP: {192.168.178.76} TXT properties: TxtProperties { properties: [] }
At 68.6539ms: Resolved a new service: vm-worky-station._workstation._tcp.local. host: void-vm.local. port: 4848 IP: {192.168.178.76, 192.168.73.130} TXT properties: TxtProperties { properties: [] }
At 69.0566ms: Resolved a new service: vm-worky-station._workstation._tcp.local. host: void-vm.local. port: 4848 IP: {192.168.73.130, 192.168.178.76, 2003:e8:bf3e:3f00:938d:8ea8:42c0:f758} TXT properties: TxtProperties { properties: [] }

PS: Ignore the homeassistant [...] entry there; that's probably a bug in a Home Assistant add-on advertising internal Docker addresses, which cannot be reached anyway.

hrzlgnm (Contributor, Author) commented Mar 19, 2024

Actually, I was wrong about this "homeassistant [...]" entry: the Home Assistant operating system has published this record using systemd-resolved since home-assistant/operating-system@25a0dd3.
Avahi is able to resolve those as well, and it sends both a multicast and a unicast query simultaneously, as can be seen in the network trace above for the python-zeroconf test.
Perhaps sending a multicast query alone is not enough.

hrzlgnm (Contributor, Author) commented Mar 19, 2024

I've updated my PR #184 once again; it now skips the NSEC check entirely, and I'm also able to resolve the homeassistant instance:

At 503.610675ms: Resolved a new service: homeassistant [07ec3e0c8c864037bbd53d1ef63a9d3c]._workstation._tcp.local. host: homeassistant.local. port: 0 IP: {192.168.178.70} TXT properties: TxtProperties { properties: [] }

hrzlgnm (Contributor, Author) commented Mar 19, 2024

At first I was wondering why it didn't work: your branch was missing the fix from #181.

hrzlgnm (Contributor, Author) commented Mar 19, 2024

I tested my fix on Windows, and it also works there:

At 2.6834ms : ServiceFound("_workstation._tcp.local.", "homeassistant [07ec3e0c8c864037bbd53d1ef63a9d3c]._workstation._tcp.local.") 
At 510.0461ms: Resolved a new service: homeassistant [07ec3e0c8c864037bbd53d1ef63a9d3c]._workstation._tcp.local. host: homeassistant.local. port: 0 IP: {192.168.178.70} TXT properties: TxtProperties { properties: [] }

hrzlgnm changed the title from "Services published using python-zeroconf are not resolved" to "Services published using python-zeroconf or systemd-resolved are not resolved" on Mar 19, 2024
hrzlgnm (Contributor, Author) commented Mar 19, 2024

@keepsimple1 If you like, I can submit a PR containing only the actual fix that works for me, without your NSEC debugging changes.

keepsimple1 (Owner) commented

> @keepsimple1 If you like, I can submit a PR containing only the actual fix that works for me, without your NSEC debugging changes.

Yes, that would be great! Thanks!
