Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

strongswan: dhcp plugin not issuing addresses to clients after upgrading to 24.10-rc4 #25801

Open
lowjoel opened this issue Jan 22, 2025 · 31 comments · May be fixed by #25870
Open

strongswan: dhcp plugin not issuing addresses to clients after upgrading to 24.10-rc4 #25801

lowjoel opened this issue Jan 22, 2025 · 31 comments · May be fixed by #25870

Comments

@lowjoel
Copy link
Contributor

lowjoel commented Jan 22, 2025

After upgrading to OpenWrt 24.10-rc4, using the exact same swanctl configuration as 23.05 my clients are no longer getting a DHCP address. This is tested on both Android/Strongswan app as well as Windows IPsec VPN. IPv6 static addresses are still being assigned.

I can tell that the updown scripts in /etc/hotplug.d/ipsec are only being called with IPv6 events, i.e. PLUTO_VERB='up-client-v6' and PLUTO_VERB='down-client-v6', I'm not sure if there's some configuration I'd need to change or if there's something more nefarious going on here.

The logs show that the DISCOVER is sent, dnsmasq received it and returned the OFFER:

Fri Jan 24 21:59:24 2025 authpriv.info ipsec: 09[CFG] sending DHCP DISCOVER for mac1 to subnet-broadcast-addr
Fri Jan 24 21:59:25 2025 authpriv.info ipsec: 09[CFG] sending DHCP DISCOVER for mac1 to subnet-broadcast-addr
Fri Jan 24 21:59:26 2025 authpriv.info ipsec: 11[MGR] ignoring request with ID 12, already processing
Fri Jan 24 21:59:27 2025 authpriv.info ipsec: 09[CFG] sending DHCP DISCOVER for mac1 to subnet-broadcast-addr
Fri Jan 24 21:59:28 2025 daemon.info dnsmasq-dhcp[1]: DHCPDISCOVER(br-lan.xxxx) mac1
Fri Jan 24 21:59:28 2025 daemon.info dnsmasq-dhcp[1]: DHCPOFFER(br-lan.xxxx) dhcp1 mac1
Fri Jan 24 21:59:28 2025 daemon.info dnsmasq-dhcp[1]: DHCPDISCOVER(br-lan.xxxx) mac1
Fri Jan 24 21:59:28 2025 daemon.info dnsmasq-dhcp[1]: DHCPOFFER(br-lan.xxxx) dhcp1 mac1
Fri Jan 24 21:59:28 2025 daemon.info dnsmasq-dhcp[1]: DHCPDISCOVER(br-lan.xxxx) mac1
Fri Jan 24 21:59:28 2025 daemon.info dnsmasq-dhcp[1]: DHCPOFFER(br-lan.xxxx) dhcp1 mac1
Fri Jan 24 21:59:29 2025 authpriv.info ipsec: 08[MGR] ignoring request with ID 12, already processing
Fri Jan 24 21:59:30 2025 authpriv.info ipsec: 09[CFG] sending DHCP DISCOVER for mac1 to subnet-broadcast-addr
Fri Jan 24 21:59:30 2025 daemon.info dnsmasq-dhcp[1]: DHCPDISCOVER(br-lan.xxxx) mac1
Fri Jan 24 21:59:30 2025 daemon.info dnsmasq-dhcp[1]: DHCPOFFER(br-lan.xxxx) dhcp1 mac1
Fri Jan 24 21:59:34 2025 authpriv.info ipsec: 01[MGR] ignoring request with ID 12, already processing
Fri Jan 24 21:59:34 2025 authpriv.info ipsec: 09[CFG] sending DHCP DISCOVER for mac1 to subnet-broadcast-addr
Fri Jan 24 21:59:34 2025 daemon.info dnsmasq-dhcp[1]: DHCPDISCOVER(br-lan.xxxx) mac1
Fri Jan 24 21:59:34 2025 daemon.info dnsmasq-dhcp[1]: DHCPOFFER(br-lan.xxxx) dhcp1 mac1
Fri Jan 24 21:59:39 2025 authpriv.info ipsec: 09[CFG] DHCP DISCOVER timed out

I'm attaching my config here for now while I dig into this; I've still not isolated which part of the config or where I should be looking deeper into at the moment.

/etc/swanctl/conf.d/users.conf

connections {
  users {
    pools = dhcp, lan
    version = 2
    mobike = yes
    rekey_time = 4h
    over_time = 24m
    dpd_delay = 30s
    keyingtries = 3
    proposals = aes256gcm128-prfsha384-ecp384,aes256gcm128-prfsha384-modp2048

    local {
      auth = pubkey
      certs = server.pem
      id = server.fqdn
    }
    remote {
      auth = eap-radius
      eap_id = %any
    }
    children {
      user {
        local_ts = 0.0.0.0/0, ::/0
        esp_proposals = aes256gcm128-curve448,aes256gcm128-modp4096,aes256gcm128-ecp384,aes256gcm128-modp2048
        mode = tunnel
        if_id_in = 32769
        if_id_out = 32769
        rekey_time = 0
        dpd_action = clear

        updown = /usr/lib/ipsec/_updown
      }
    }
  }
}

pools {
  lan {
    addrs = <ipv6-addr>/80
  }
}

/etc/strongswan.d/charon/dhcp.conf

dhcp {

    # Always use the configured server address.
    force_server_address = yes

    # Derive user-defined MAC address from hash of IKE identity and send client
    # identity DHCP option.
    # identity_lease = no

    # Interface name the plugin uses for address allocation.
    # interface =

    # Whether to load the plugin. Can also be an integer to increase the
    # priority of this plugin.
    load = yes

    # DHCP server unicast or broadcast IP address.
    server = <br-lan.lan subnet broadcast, e.g. 192.168.1.255>

    # Use the DHCP server port (67) as source port when a unicast server address
    # is configured.
    # use_server_port = no

}

I did a tcpdump on the affected interface. I observe the DISCOVER but there's no response. I then changed charon/dhcp.conf from server = broadcast address for subnet to the default 255.255.255.255 and instead bound to the interface using the interface directive instead. This time I got an offer, but nothing beyond that.

The logs show that the DHCP conversation then looks like this:

Fri Jan 24 20:20:53 2025 authpriv.info ipsec: 08[CFG] sending DHCP DISCOVER for mac1 to 255.255.255.255
Fri Jan 24 20:20:54 2025 authpriv.info ipsec: 08[CFG] sending DHCP DISCOVER for mac1 to 255.255.255.255
Fri Jan 24 20:20:55 2025 authpriv.info ipsec: 11[MGR] ignoring request with ID 12, already processing
Fri Jan 24 20:20:56 2025 authpriv.info ipsec: 08[CFG] sending DHCP DISCOVER for mac1 to 255.255.255.255
Fri Jan 24 20:20:57 2025 daemon.info dnsmasq-dhcp[1]: DHCPDISCOVER(br-lan.xxxx) mac1
Fri Jan 24 20:20:57 2025 daemon.info dnsmasq-dhcp[1]: DHCPOFFER(br-lan.xxxx) dhcp1 mac1
Fri Jan 24 20:20:57 2025 daemon.info dnsmasq-dhcp[1]: DHCPDISCOVER(br-lan.xxxx) mac1
Fri Jan 24 20:20:57 2025 daemon.info dnsmasq-dhcp[1]: DHCPOFFER(br-lan.xxxx) dhcp1 mac1
Fri Jan 24 20:20:57 2025 daemon.info dnsmasq-dhcp[1]: DHCPDISCOVER(br-lan.xxxx) mac1
Fri Jan 24 20:20:57 2025 daemon.info dnsmasq-dhcp[1]: DHCPOFFER(br-lan.xxxx) dhcp1 mac1
Fri Jan 24 20:20:58 2025 authpriv.info ipsec: 13[MGR] ignoring request with ID 12, already processing
Fri Jan 24 20:20:59 2025 authpriv.info ipsec: 08[CFG] sending DHCP DISCOVER for mac1 to 255.255.255.255
Fri Jan 24 20:20:59 2025 daemon.info dnsmasq-dhcp[1]: DHCPDISCOVER(br-lan.1024) mac1
Fri Jan 24 20:20:59 2025 daemon.info dnsmasq-dhcp[1]: DHCPOFFER(br-lan.1024) dhcp1 mac1
Fri Jan 24 20:21:03 2025 authpriv.info ipsec: 11[MGR] ignoring request with ID 12, already processing
Fri Jan 24 20:21:03 2025 authpriv.info ipsec: 08[CFG] sending DHCP DISCOVER for mac1 to 255.255.255.255
Fri Jan 24 20:21:03 2025 daemon.info dnsmasq-dhcp[1]: DHCPDISCOVER(br-lan.1024) mac1
Fri Jan 24 20:21:03 2025 daemon.info dnsmasq-dhcp[1]: DHCPOFFER(br-lan.1024) dhcp1 mac1
Fri Jan 24 20:21:06 2025 authpriv.info ipsec: 13[IKE] retransmit 7 of request with message ID 1
Fri Jan 24 20:21:06 2025 authpriv.info ipsec: 13[NET] sending packet: from local[4500] to remote[4500] (57 bytes)
Fri Jan 24 20:21:08 2025 authpriv.info ipsec: 08[CFG] DHCP DISCOVER timed out

Edit 1: added the connection and dhcp plugin config.
Edit 2: clarified that this affects both Android and Windows clients
Edit 3: tcpdump shows DHCP Discover is being sent but no response
Edit 4: switching to global broadcast address and binding to interface instead gives DISCOVER and OFFER but transaction stops there.
Edit 5: added relevant logs

@lowjoel lowjoel changed the title strongswan: dhcp plugin not issuing addresses to clients strongswan: dhcp plugin not issuing addresses to clients after upgrading to 24.10-rc4 Jan 22, 2025
@tobiasbrunner
Copy link

No idea what the DHCP issue is about (you probably have to provide more details, config, logs, packet captures). But that the updown script is only called for IPv6 makes sense if no IPv4 address is assigned to the client.

@lowjoel
Copy link
Contributor Author

lowjoel commented Jan 23, 2025

Yeah, I'm trying to understand what's going wrong, I have some digging in to do myself. At this point I can't figure out if I'm the only one impacted (=I did something stupid) or if others are seeing it. I'm currently working to distill my setup to a minimal reproduceable case (I couldn't yet yesterday)

Just added the config that I'm looking at right now.

@lowjoel
Copy link
Contributor Author

lowjoel commented Jan 24, 2025

I notice that 23.05 used Strongswan 5.9.10, but 24.10 uses 5.9.14; 5.9.14 added an eBPF-based DHCP packet filter; let me see if compiling 5.9.13 works around this.

@tobiasbrunner
Copy link

The packet filtering as such hasn't changed on Linux, it was just refactored to a different file so it can be shared with the farp plugin and to support FreeBSD. You haven't shared your config, logs, or packet captures, so really no idea what's going on. But if you are using a unicast server address, it could be related to the change I pushed as response to your bug report here, which was released with 5.9.11.

@lowjoel
Copy link
Contributor Author

lowjoel commented Jan 24, 2025

Configs and logs are up in the original post -- changelog added to bottom of post.

Hmm, if the fix landed in 5.9.11 then 23.05 can't possibly be using 5.9.10 🤔
Edit: 23.05 is using 5.9.11.
Edit 2: and (of course) you're right, the BPF code was just moved around.

@tobiasbrunner
Copy link

Oh, sorry, haven't checked there. That log is from after you changed stuff in the config you posted, right? So there does seem to be an offer, but it does not hit the socket. Maybe some firewall filtering? Or maybe it hits a different interface (i.e. not the one bound to)?

@lowjoel
Copy link
Contributor Author

lowjoel commented Jan 24, 2025

That log is from after you changed stuff in the config you posted, right

Yes, to bind dhcp.interface = br-lan.blah, unsetting dhcp.server. I should go and edit the order of the description.

Maybe some firewall filtering?

tcpdump shows that it's on the right interface, my pcap (using tcpdump -i br-lan.blah) shows the right offer with the right transaction ID, unless there's something special about vlans on a bridge.

Trying to see if anything on the firewall side has changed but I'm not seeing anything atm.

@tobiasbrunner
Copy link

unless there's something special about vlans on a bridge.

Could be, was that used on 23.05 as well? Not sure how well e.g. SO_BINDTODEVICE works with such interfaces (or with packet sockets as there is actually a note in the man page for socket(7) stating: "It is not supported for packet sockets (use normal bind(2) there)", not sure what that means exactly as the call does not seem to fail), or whether there is something regarding packet sockets and VLANs that could be an issue.

@lowjoel
Copy link
Contributor Author

lowjoel commented Jan 24, 2025

Could be, was that used on 23.05 as well

Yes, no change on that front. The vlans on that bridge are for DSA. The firewall and network config between the two releases as far as I can tell are the same (and I'm testing this on two different devices of the same model, but unfortunately I upgraded them both to 24.10-rc4 before I realised that this was a problem.)

@tobiasbrunner
Copy link

tobiasbrunner commented Jan 24, 2025

Do you have a pcap file that contains the DHCPOFFER? Maybe our packet filter drops it for some reason.

Edit: Hm, I wonder if the packet is actually prefixed with the VLAN tag when we receive it on the socket.

@lowjoel
Copy link
Contributor Author

lowjoel commented Jan 24, 2025

Ah. I see a difference in the pcap between the two configs I used. The logs from dnsmasq/strongswan are the same, but the pcap is different.

In both captures, I use tcpdump -n -i br-lan.xxxx -w /tmp/dhcp.pcap --print '(udp port 66)' or '(udp port 67)' which should capture all DHCP traffic.

Initial config: dhcp.interface unset, dhcp.server = br-lan.xxxx broadcast (i.e. dhcp.server = 192.168.1.255)

21:59:24.697491 IP router-addr.68 > subnet-broadcast-addr.67: BOOTP/DHCP, Request from mac1, length 248
21:59:25.698394 IP router-addr.68 > subnet-broadcast-addr.67: BOOTP/DHCP, Request from mac1, length 248
21:59:27.698950 IP router-addr.68 > subnet-broadcast-addr.67: BOOTP/DHCP, Request from mac1, length 248
21:59:30.699510 IP router-addr.68 > subnet-broadcast-addr.67: BOOTP/DHCP, Request from mac1, length 248
21:59:34.700028 IP router-addr.68 > subnet-broadcast-addr.67: BOOTP/DHCP, Request from mac1, length 248

Modified config: dhcp.interface = br-lan.xxxx, dhcp.server unset

22:10:20.880723 IP router-addr.68 > 255.255.255.255.67: BOOTP/DHCP, Request from mac1, length 248
22:10:21.889733 IP router-addr.68 > 255.255.255.255.67: BOOTP/DHCP, Request from mac1, length 248
22:10:23.890633 IP router-addr.68 > 255.255.255.255.67: BOOTP/DHCP, Request from mac1, length 248
22:10:24.030407 IP router-addr.67 > 255.255.255.255.68: BOOTP/DHCP, Reply, length 334
22:10:24.032360 IP router-addr.67 > 255.255.255.255.68: BOOTP/DHCP, Reply, length 334
22:10:24.033676 IP router-addr.67 > 255.255.255.255.68: BOOTP/DHCP, Reply, length 334
22:10:26.891285 IP router-addr.68 > 255.255.255.255.67: BOOTP/DHCP, Request from mac1, length 248
22:10:26.892651 IP router-addr.67 > 255.255.255.255.68: BOOTP/DHCP, Reply, length 334
22:10:30.891928 IP router-addr.68 > 255.255.255.255.67: BOOTP/DHCP, Request from mac1, length 248
22:10:30.893305 IP router-addr.67 > 255.255.255.255.68: BOOTP/DHCP, Reply, length 334

@tobiasbrunner
Copy link

tobiasbrunner commented Jan 24, 2025

Modified config: dhcp.interface = br-lan.xxxx, dhcp.server unset

Can you bind to br-lan? If so, does that make a difference?

Edit: By the way, I found some information on how to bind packet sockets to an interface. Might be worth a shot in case the current approach isn't working anymore.

@lowjoel
Copy link
Contributor Author

lowjoel commented Jan 24, 2025

Modified config: dhcp.interface = br-lan, dhcp.server unset:

Fri Jan 24 22:32:15 2025 authpriv.info ipsec: 05[CFG] sending DHCP DISCOVER for mac1 to 255.255.255.255
Fri Jan 24 22:32:16 2025 authpriv.info ipsec: 05[CFG] sending DHCP DISCOVER for mac1 to 255.255.255.255
Fri Jan 24 22:32:18 2025 authpriv.info ipsec: 11[MGR] ignoring request with ID 12, already processing
Fri Jan 24 22:32:18 2025 authpriv.info ipsec: 14[NET] received packet: from remote[4500] to local[4500] (57 bytes)
Fri Jan 24 22:32:18 2025 authpriv.info ipsec: 14[ENC] parsed INFORMATIONAL request 1 [ ]
Fri Jan 24 22:32:18 2025 authpriv.info ipsec: 14[ENC] generating INFORMATIONAL response 1 [ ]
Fri Jan 24 22:32:18 2025 authpriv.info ipsec: 14[NET] sending packet: from local[4500] to remote[4500] (57 bytes)
Fri Jan 24 22:32:18 2025 authpriv.info ipsec: 05[CFG] sending DHCP DISCOVER for mac1 to 255.255.255.255
Fri Jan 24 22:32:20 2025 authpriv.info ipsec: 08[MGR] ignoring request with ID 12, already processing
Fri Jan 24 22:32:21 2025 authpriv.info ipsec: 05[CFG] sending DHCP DISCOVER for mac1 to 255.255.255.255
Fri Jan 24 22:32:25 2025 authpriv.info ipsec: 02[MGR] ignoring request with ID 12, already processing
Fri Jan 24 22:32:25 2025 authpriv.info ipsec: 05[CFG] sending DHCP DISCOVER for mac1 to 255.255.255.255
Fri Jan 24 22:32:30 2025 authpriv.info ipsec: 05[CFG] DHCP DISCOVER timed out

tcpdump shows nothing if I listen on the vlan that is for my LAN interface, instead the source IP is my public WAN address (!!) when dumping on br-lan.

22:35:22.261182 IP local-wan-ip.68 > 255.255.255.255.67: BOOTP/DHCP, Request from mac1, length 248
22:35:23.262209 IP local-wan-ip.68 > 255.255.255.255.67: BOOTP/DHCP, Request from mac1, length 248
22:35:25.262732 IP local-wan-ip.68 > 255.255.255.255.67: BOOTP/DHCP, Request from mac1, length 248
22:35:28.263218 IP local-wan-ip.68 > 255.255.255.255.67: BOOTP/DHCP, Request from mac1, length 248
22:35:32.263685 IP local-wan-ip.68 > 255.255.255.255.67: BOOTP/DHCP, Request from mac1, length 248

(times don't match because I captured the logs while tcpdump-ing br-lan.xxxx, ran a second time to capture br-lan directly.) It's late here, I'll continue tomorrow. Thanks for helping me out, @tobiasbrunner

@tobiasbrunner
Copy link

Interesting choice of source address. Why is the WAN interface even attached to the bridge interface?

Anyway, what you could try, if you are able to build a new version of strongSwan, is maybe only bind the send socket and let the packet socket listen on any interface. To do so, replace iface with NULL here.

Alternatively, the following patch uses a different approach to bind the socket to an interface:

diff --git a/src/libcharon/network/pf_handler.c b/src/libcharon/network/pf_handler.c
index 43ef432ba607..60fc7ff6b426 100644
--- a/src/libcharon/network/pf_handler.c
+++ b/src/libcharon/network/pf_handler.c
@@ -225,6 +225,30 @@ METHOD(pf_handler_t, destroy, void,
 	free(this);
 }
 
+/**
+ * Bind the given packet socket to the a named device
+ */
+static bool bind_packet_socket_to_device(int fd, char *iface)
+{
+	struct sockaddr_ll addr = {
+		.sll_family = AF_PACKET,
+		.sll_ifindex = if_nametoindex(iface),
+	};
+
+	if (!addr.sll_ifindex)
+	{
+		DBG1(DBG_CFG, "unable to bind socket to '%s': not found", iface);
+		return FALSE;
+	}
+	if (bind(fd, (struct sockaddr*)&addr, sizeof(addr)) == -1)
+	{
+		DBG1(DBG_CFG, "binding socket to '%s' failed: %s",
+			 iface, strerror(errno));
+		return FALSE;
+	}
+	return TRUE;
+}
+
 /**
  * Setup capturing via AF_PACKET socket
  */
@@ -247,7 +271,7 @@ static bool setup_internal(private_pf_handler_t *this, char *iface,
 			 this->name, strerror(errno));
 		return FALSE;
 	}
-	if (iface && !bind_to_device(this->receive, iface))
+	if (iface && !bind_packet_socket_to_device(this->receive, iface))
 	{
 		return FALSE;
 	}

@lowjoel
Copy link
Contributor Author

lowjoel commented Jan 25, 2025

Why is the WAN interface even attached to the bridge interface

I am not sure, but I believe this is how the hardware switch is wired up - wan and lan ports share the same underlying hardware. Or this could be a route table thing, not sure.

maybe only bind the send socket and let the packet socket listen on any interface

I tried this patch as you suggested:

diff --git "a/src/libcharon/plugins/dhcp/dhcp_socket.c" "b/src/libcharon/plugins/dhcp/dhcp_socket.c"
index d144e2795..192912142 100644
--- "a/src/libcharon/plugins/dhcp/dhcp_socket.c"
+++ "b/src/libcharon/plugins/dhcp/dhcp_socket.c"
@@ -873,7 +873,8 @@ dhcp_socket_t *dhcp_socket_create()
 		return NULL;
 	}
 
-	this->pf_handler = pf_handler_create("DHCP", iface, receive_dhcp, this,
+	DBG1(DBG_CFG, "creating handler without filter");
+	this->pf_handler = pf_handler_create("DHCP", NULL, receive_dhcp, this,
 										 &dhcp_filter);
 	if (!this->pf_handler)
 	{

And verified the debug message is printed at startup.

Modified config: dhcp.interface = br-lan.xxxx, dhcp.server unset:

Sat Jan 25 10:28:53 2025 authpriv.info ipsec: 14[CFG] sending DHCP DISCOVER for mac1 to 255.255.255.255
Sat Jan 25 10:28:54 2025 authpriv.info ipsec: 14[CFG] sending DHCP DISCOVER for mac1 to 255.255.255.255
Sat Jan 25 10:28:55 2025 authpriv.info ipsec: 16[MGR] ignoring request with ID 12, already processing
Sat Jan 25 10:28:56 2025 authpriv.info ipsec: 14[CFG] sending DHCP DISCOVER for mac1 to 255.255.255.255
Sat Jan 25 10:28:57 2025 daemon.info dnsmasq-dhcp[1]: DHCPDISCOVER(br-lan.xxxx) mac1
Sat Jan 25 10:28:57 2025 daemon.info dnsmasq-dhcp[1]: DHCPOFFER(br-lan.xxxx) dhcp1 mac1
Sat Jan 25 10:28:57 2025 daemon.info dnsmasq-dhcp[1]: DHCPDISCOVER(br-lan.xxxx) mac1
Sat Jan 25 10:28:57 2025 daemon.info dnsmasq-dhcp[1]: DHCPOFFER(br-lan.xxxx) dhcp1 mac1
Sat Jan 25 10:28:57 2025 daemon.info dnsmasq-dhcp[1]: DHCPDISCOVER(br-lan.xxxx) mac1
Sat Jan 25 10:28:57 2025 daemon.info dnsmasq-dhcp[1]: DHCPOFFER(br-lan.xxxx) dhcp1 mac1
Sat Jan 25 10:28:57 2025 authpriv.info ipsec: 07[MGR] ignoring request with ID 12, already processing
Sat Jan 25 10:28:59 2025 authpriv.info ipsec: 14[CFG] sending DHCP DISCOVER for mac1 to 255.255.255.255
Sat Jan 25 10:28:59 2025 daemon.info dnsmasq-dhcp[1]: DHCPDISCOVER(br-lan.xxxx) mac1
Sat Jan 25 10:28:59 2025 daemon.info dnsmasq-dhcp[1]: DHCPOFFER(br-lan.xxxx) dhcp1 mac1
Sat Jan 25 10:29:03 2025 authpriv.info ipsec: 07[MGR] ignoring request with ID 12, already processing
Sat Jan 25 10:29:03 2025 authpriv.info ipsec: 14[CFG] sending DHCP DISCOVER for mac1 to 255.255.255.255
Sat Jan 25 10:29:03 2025 daemon.info dnsmasq-dhcp[1]: DHCPDISCOVER(br-lan.xxxx) mac1
Sat Jan 25 10:29:03 2025 daemon.info dnsmasq-dhcp[1]: DHCPOFFER(br-lan.xxxx) dhcp1 mac1
Sat Jan 25 10:29:05 2025 authpriv.info ipsec: 12[NET] received packet: from remote[4500] to local[4500] (57 bytes)
Sat Jan 25 10:29:05 2025 authpriv.info ipsec: 12[ENC] parsed INFORMATIONAL request 2 [ ]
Sat Jan 25 10:29:05 2025 authpriv.info ipsec: 12[ENC] generating INFORMATIONAL response 2 [ ]
Sat Jan 25 10:29:05 2025 authpriv.info ipsec: 12[NET] sending packet: from local[4500] to remote[4500] (57 bytes)
Sat Jan 25 10:29:08 2025 authpriv.info ipsec: 14[CFG] DHCP DISCOVER timed out

Modified config: dhcp.interface unset, dhcp.server = subnet-broadcast-addr:


Sat Jan 25 10:35:53 2025 authpriv.info ipsec: 04[CFG] sending DHCP DISCOVER for mac1 to subnet-broadcast-addr
Sat Jan 25 10:35:54 2025 authpriv.info ipsec: 04[CFG] sending DHCP DISCOVER for mac1 to subnet-broadcast-addr
Sat Jan 25 10:35:55 2025 authpriv.info ipsec: 08[MGR] ignoring request with ID 12, already processing
Sat Jan 25 10:35:56 2025 authpriv.info ipsec: 04[CFG] sending DHCP DISCOVER for mac1 to subnet-broadcast-addr
Sat Jan 25 10:35:57 2025 daemon.info dnsmasq-dhcp[1]: DHCPDISCOVER(br-lan.xxxx) mac1
Sat Jan 25 10:35:57 2025 daemon.info dnsmasq-dhcp[1]: DHCPOFFER(br-lan.xxxx) dhcp1 mac1
Sat Jan 25 10:35:57 2025 daemon.info dnsmasq-dhcp[1]: DHCPDISCOVER(br-lan.xxxx) mac1
Sat Jan 25 10:35:57 2025 daemon.info dnsmasq-dhcp[1]: DHCPOFFER(br-lan.xxxx) dhcp1 mac1
Sat Jan 25 10:35:57 2025 daemon.info dnsmasq-dhcp[1]: DHCPDISCOVER(br-lan.xxxx) mac1
Sat Jan 25 10:35:57 2025 daemon.info dnsmasq-dhcp[1]: DHCPOFFER(br-lan.xxxx) dhcp1 mac1
Sat Jan 25 10:35:58 2025 authpriv.info ipsec: 09[MGR] ignoring request with ID 12, already processing
Sat Jan 25 10:35:59 2025 authpriv.info ipsec: 04[CFG] sending DHCP DISCOVER for mac1 to subnet-broadcast-addr
Sat Jan 25 10:35:59 2025 daemon.info dnsmasq-dhcp[1]: DHCPDISCOVER(br-lan.xxxx) mac1
Sat Jan 25 10:35:59 2025 daemon.info dnsmasq-dhcp[1]: DHCPOFFER(br-lan.xxxx) dhcp1 mac1
Sat Jan 25 10:36:03 2025 authpriv.info ipsec: 08[MGR] ignoring request with ID 12, already processing
Sat Jan 25 10:36:03 2025 authpriv.info ipsec: 04[CFG] sending DHCP DISCOVER for mac1 to subnet-broadcast-addr
Sat Jan 25 10:36:03 2025 daemon.info dnsmasq-dhcp[1]: DHCPDISCOVER(br-lan.xxxx) mac1
Sat Jan 25 10:36:03 2025 daemon.info dnsmasq-dhcp[1]: DHCPOFFER(br-lan.xxxx) dhcp1 mac1
Sat Jan 25 10:36:08 2025 authpriv.info ipsec: 04[CFG] DHCP DISCOVER timed out

If we were listening on the bridge (without the VLAN) then the eBPF would need to account for the 802.1q vlan ethertype and the offsets would be shifted? Listening on the vlan should not have the tags though (pcap shows the packet has the IPv4 ethertype). (edit: the Linux BPF seems to use the first byte of the packet to be the IP header, not the ethernet frame, according to your #ifdef.)

Next, I tried your attached patch to modify pf_handler.c to use bind() instead. I added another DBG1 to make sure I'm updating the right binary (in this case libcharon.so) after seeing the output.

Using dhcp.interface unset, dhcp.server = subnet-broadcast-addr:

Sat Jan 25 11:08:08 2025 authpriv.info ipsec: 08[CFG] sending DHCP DISCOVER for mac1 to subnet-broadcast-addr
Sat Jan 25 11:08:09 2025 authpriv.info ipsec: 08[CFG] sending DHCP DISCOVER for mac1 to subnet-broadcast-addr
Sat Jan 25 11:08:10 2025 authpriv.info ipsec: 10[MGR] ignoring request with ID 12, already processing
Sat Jan 25 11:08:11 2025 authpriv.info ipsec: 08[CFG] sending DHCP DISCOVER for mac1 to subnet-broadcast-addr
Sat Jan 25 11:08:12 2025 daemon.info dnsmasq-dhcp[1]: DHCPDISCOVER(br-lan.xxxx) mac1
Sat Jan 25 11:08:12 2025 daemon.info dnsmasq-dhcp[1]: DHCPOFFER(br-lan.xxxx) dhcp1 mac1
Sat Jan 25 11:08:12 2025 daemon.info dnsmasq-dhcp[1]: DHCPDISCOVER(br-lan.xxxx) mac1
Sat Jan 25 11:08:12 2025 daemon.info dnsmasq-dhcp[1]: DHCPOFFER(br-lan.xxxx) dhcp1 mac1
Sat Jan 25 11:08:12 2025 daemon.info dnsmasq-dhcp[1]: DHCPDISCOVER(br-lan.xxxx) mac1
Sat Jan 25 11:08:12 2025 daemon.info dnsmasq-dhcp[1]: DHCPOFFER(br-lan.xxxx) dhcp1 mac1
Sat Jan 25 11:08:13 2025 authpriv.info ipsec: 13[MGR] ignoring request with ID 12, already processing
Sat Jan 25 11:08:14 2025 authpriv.info ipsec: 08[CFG] sending DHCP DISCOVER for mac1 to subnet-broadcast-addr
Sat Jan 25 11:08:14 2025 daemon.info dnsmasq-dhcp[1]: DHCPDISCOVER(br-lan.xxxx) mac1
Sat Jan 25 11:08:14 2025 daemon.info dnsmasq-dhcp[1]: DHCPOFFER(br-lan.xxxx) dhcp1 mac1
Sat Jan 25 11:08:18 2025 authpriv.info ipsec: 13[MGR] ignoring request with ID 12, already processing
Sat Jan 25 11:08:18 2025 authpriv.info ipsec: 08[CFG] sending DHCP DISCOVER for mac1 to subnet-broadcast-addr
Sat Jan 25 11:08:18 2025 daemon.info dnsmasq-dhcp[1]: DHCPDISCOVER(br-lan.xxxx) mac1
Sat Jan 25 11:08:18 2025 daemon.info dnsmasq-dhcp[1]: DHCPOFFER(br-lan.xxxx) dhcp1 mac1
Sat Jan 25 11:08:23 2025 authpriv.info ipsec: 08[CFG] DHCP DISCOVER timed out

(I'm guessing I should stop testing this config...?)

Using dhcp.interface = br-lan.xxxx, dhcp.server unset:

Sat Jan 25 11:14:35 2025 authpriv.info ipsec: 10[IKE] peer requested virtual IP %any
Sat Jan 25 11:14:35 2025 authpriv.info ipsec: 10[CFG] sending DHCP DISCOVER for mac1 to 255.255.255.255
Sat Jan 25 11:14:36 2025 authpriv.info ipsec: 10[CFG] sending DHCP DISCOVER for mac1 to 255.255.255.255
Sat Jan 25 11:14:37 2025 authpriv.info ipsec: 03[MGR] ignoring request with ID 12, already processing
Sat Jan 25 11:14:38 2025 authpriv.info ipsec: 10[CFG] sending DHCP DISCOVER for mac1 to 255.255.255.255
Sat Jan 25 11:14:39 2025 daemon.info dnsmasq-dhcp[1]: DHCPDISCOVER(br-lan.xxxx) mac1
Sat Jan 25 11:14:39 2025 daemon.info dnsmasq-dhcp[1]: DHCPOFFER(br-lan.xxxx) dhcp1 mac1
Sat Jan 25 11:14:39 2025 daemon.info dnsmasq-dhcp[1]: DHCPDISCOVER(br-lan.xxxx) mac1
Sat Jan 25 11:14:39 2025 daemon.info dnsmasq-dhcp[1]: DHCPOFFER(br-lan.xxxx) dhcp1 mac1
Sat Jan 25 11:14:39 2025 daemon.info dnsmasq-dhcp[1]: DHCPDISCOVER(br-lan.xxxx) mac1
Sat Jan 25 11:14:39 2025 daemon.info dnsmasq-dhcp[1]: DHCPOFFER(br-lan.xxxx) dhcp1 mac1
Sat Jan 25 11:14:39 2025 authpriv.info ipsec: 09[MGR] ignoring request with ID 12, already processing
Sat Jan 25 11:14:41 2025 authpriv.info ipsec: 10[CFG] sending DHCP DISCOVER for mac1 to 255.255.255.255
Sat Jan 25 11:14:41 2025 daemon.info dnsmasq-dhcp[1]: DHCPDISCOVER(br-lan.xxxx) mac1
Sat Jan 25 11:14:41 2025 daemon.info dnsmasq-dhcp[1]: DHCPOFFER(br-lan.xxxx) dhcp1 mac1
Sat Jan 25 11:14:44 2025 authpriv.info ipsec: 08[MGR] ignoring request with ID 12, already processing
Sat Jan 25 11:14:45 2025 authpriv.info ipsec: 10[CFG] sending DHCP DISCOVER for mac1 to 255.255.255.255
Sat Jan 25 11:14:45 2025 daemon.info dnsmasq-dhcp[1]: DHCPDISCOVER(br-lan.xxxx) mac1
Sat Jan 25 11:14:45 2025 daemon.info dnsmasq-dhcp[1]: DHCPOFFER(br-lan.xxxx) dhcp1 mac1
Sat Jan 25 11:14:50 2025 authpriv.info ipsec: 10[CFG] DHCP DISCOVER timed out

No banana either 🙏

@lowjoel
Copy link
Contributor Author

lowjoel commented Jan 26, 2025

I've diagnosed (hur hur, that's an overstatement, BPF was a fun distraction and I was fumbling around a lot) this to the point that:

Using dhcp.interface unset, dhcp.server = subnet-broadcast-addr, none of your binding patches applied (this is my original configuration):

  1. the packets are being received after the BPF packet filter (which matches the observation that tcpdump shows the packets.) I do have a patch that you may consider applying to handle the case where the IPv4 header has a non-default IHL -- I can open a PR for you if you'd like see rfc/fix: tolerate non-default IPv4 IHLs strongswan/strongswan#2635 -- but that wasn't the cause of the problem.
  2. the pf_handler receive_packet is being called (which means that the polling/async jobs are working as expected).
  3. but when receive_packet is called, recvfrom succeeds. However at the next branch calling find_interface, it returns NULL.
  4. The socket that's being read from is lo. dnsmasq is on the localhost of the Strongswan server. req.ifr_hwaddr.sa_family == ARPHRD_LOOPBACK

Applying this patch gives me an IP address:

diff --git "a/src/libcharon/network/pf_handler.c" "b/src/libcharon/network/pf_handler.c"
index 43ef432ba..11245615f 100644
--- "a/src/libcharon/network/pf_handler.c"
+++ "b/src/libcharon/network/pf_handler.c"
@@ -175,14 +175,20 @@ static cached_iface_t *find_interface(private_pf_handler_t *this, int fd,
 	}
 
 	if (ioctl(fd, SIOCGIFNAME, &req) == 0 &&
-		ioctl(fd, SIOCGIFHWADDR, &req) == 0 &&
-		req.ifr_hwaddr.sa_family == ARPHRD_ETHER)
+		ioctl(fd, SIOCGIFHWADDR, &req) == 0)
 	{
 		idx = find_least_used_cache_entry(this);
 
 		this->ifaces[idx].if_index = ifindex;
 		memcpy(this->ifaces[idx].if_name, req.ifr_name, IFNAMSIZ);
-		memcpy(this->ifaces[idx].hwaddr, req.ifr_hwaddr.sa_data, ETHER_ADDR_LEN);
+		if (req.ifr_hwaddr.sa_family == ARPHRD_ETHER)
+		{
+			memcpy(this->ifaces[idx].hwaddr, req.ifr_hwaddr.sa_data, ETHER_ADDR_LEN);
+		}
+		else
+		{
+			memset(this->ifaces[idx].hwaddr, 0, ETHER_ADDR_LEN);
+		}
 		this->ifaces[idx].used = 1;
 		return &this->ifaces[idx];
 	}

Using dhcp.interface = br-lan.xxx, dhcp.server unset, none of your binding patches applied:

  1. can't find the packets being received, though it appears in tcpdump.
  2. I have a sample program to make debugging the BPF part easier; I just cannibalised pf_handler.c and instead of using async poll I just recvfrom while blocking. Same behaviour.

Using dhcp.interface = br-lan.xxx, dhcp.server unset, bind using bind:

  1. can't find the packets being received, though it appears in tcpdump.

@tobiasbrunner
Copy link

Thanks for investigating this further.

  1. I do have a patch that you may consider applying to handle the case where the IPv4 header has a non-default IHL

Not sure if that's really necessary because I doubt many DHCP servers are using IP options (they are rare for IPv4 to begin with).

3. but when receive_packet is called, recvfrom succeeds. However at the next branch calling find_interface, it returns NULL.

Good catch! This part of the code is actually "new", i.e. was not part of the dhcp plugin before the pf_handler refactoring. It's from and for the farp plugin, which needs to know the MAC address of the interface on which a packet was received to send back a fake ARP response. The dhcp plugin does not care for the address or the interface name and index.

I've pushed a slightly different fix to the dhcp-receive branch.

4. dnsmasq is on the localhost of the Strongswan server.

Is that the default or a special config? What happens if you set dhcp.interface = lo?

Using dhcp.interface = br-lan.xxx, dhcp.server unset, none of your binding patches applied:

I feel this should work (with the loopback patch) if you'd remove the interface from the pf_handler constructor.

Using dhcp.interface = br-lan.xxx, dhcp.server unset, bind using bind:

Makes sense as packets from lo will be ignored. I wonder if something like dhcp.interface_receive would be useful to specify a separate interface to bind the receive socket to.

@lowjoel
Copy link
Contributor Author

lowjoel commented Jan 27, 2025

I've pushed a slightly different fix to the dhcp-receive branch.

Let me check it out and compile it and give it a shot, thanks.

Is that the default or a special config? What happens if you set dhcp.interface = lo?

I'm sorry. I think I was a bit imprecise with that comment. I meant that dnsmasq is on the same host as strongswan; in that sense, dnsmasq is localhost relative to charon.

To be precise, my dnsmasq.conf currently has these network settings set:

bind-interfaces
local-service
listen-address=router-address-v4 (e.g 192.168.1.1)
listen-address=router-address-v6 (e.g. 2001:...::1)
listen-address=::1
interface=br-lan.xxxx
no-dhcp-interface=wan

among others (which I think are uninteresting wrt this conversation.) Why is it that when I set the dhcp.server to the router address subnet's broadcast address dnsmasq replies on lo, I do not know. I have a setup in which I have different instances of dnsmasq running on different interfaces because each interface gets a different DNS suffix list.

I feel this should work (with the loopback patch) if you'd remove the interface from the pf_handler constructor.

Let me try this on the weekend.

I wonder if something like dhcp.interface_receive would be useful to specify a separate interface to bind the receive socket to.

It depends on how dnsmasq sends the reply or if this is a bug somewhere else in the stack.

@tobiasbrunner
Copy link

Let me check it out and compile it and give it a shot, thanks.

Thanks.

I'm sorry. I think I was a bit imprecise with that comment. I meant that dnsmasq is on the same host as strongswan; in that sense, dnsmasq is localhost relative to charon.
...
Why is it that when I set the dhcp.server to the router address subnet's broadcast address dnsmasq replies on lo, I do not know.

I see. Even though dnsmasq probably binds its socket to the configured interface, it could just be the kernel delivering the response on lo because the destination address is local to the host (even if not actually on lo).

It depends on how dnsmasq sends the reply or if this is a bug somewhere else in the stack.

At least it would allow binding both sockets (or only one). dhcp.interface = lo won't work because the request isn't received by dnsmasq on the bridge interface, while binding the latter won't work if the response is delivered via lo.

@DocMAX
Copy link

DocMAX commented Jan 28, 2025

Same problem here. Getting a "DHCP DISCOVER timed out" although dnsmasq does offer an IP. I am on 24.10, too.

@lowjoel
Copy link
Contributor Author

lowjoel commented Jan 29, 2025

I've pushed a slightly different fix to the dhcp-receive branch.

It seems to work, thanks. I'm not sure what kind of MAC address is going to be copied, hence I did the more conservative 00:00:00:00:00:00.

At least it would allow binding both sockets (or only one)

Correct; but it feels messy to me. I'm not sure if this is a one-off thing or if Linux has been doing this since forever. I'm also not sure how this would play with scenarios in which we may use e.g. RADIUS to decide which DHCP server to get addresses from. For example, I may want to assign different groups of users different IP address ranges and they have to get their addresses from different DHCP servers (not sure if that's possible; I'm pulling examples out of thin air.) In this specific case, allowing addresses from either ethernet/loopback seems to work here. @DocMAX do you want to try the patch for yourself too?

@tobiasbrunner
Copy link

I've pushed a slightly different fix to the dhcp-receive branch.

It seems to work, thanks.

Thanks for testing. I'll wait for some additional feedback from @DocMAX (see here) and then will line it up for the next release.

I'm not sure what kind of MAC address is going to be copied, hence I did the more conservative 00:00:00:00:00:00.

That's already the default address if nothing else is configured (see e.g. ip link). But you can freely set a different address on lo (ip link set dev lo address ..., no idea what purpose or side-effects that could have), so passing the actual address along can't hurt.

I'm not sure if this is a one-off thing or if Linux has been doing this since forever.

I'm pretty sure it has been like that forever. This is, of course, only a problem when running the DHCP server on the same host as the DHCP client, which is kind of a special setup. For instance, the force_server_address was also added because of that. Without it, the client would send the DHCP ACK message to the (local) unicast address from which the DHCP OFFER was sent, which would also go via lo, on which the DHCP server might not be listening. So by enabling that option and configuring the subnet's broadcast address, the DHCP ACK is again sent via the interface on which the server is listening (see the first footnote in the docs).

I guess binding to an interface is more useful when the server is not running on the same host. But having the option to configure the interface for the receive socket separately might allow more generic configs (e.g. without having to know the subnet's local broadcast address) even when running on the same host.

I've pushed a small change that implements this to the branch. Let me know what you think.

I'm also not sure how this would play with scenarios in which we may use e.g. RADIUS to decide which DHCP server to get addresses from. For example, I may want to assign different groups of users different IP address ranges and they have to get their addresses from different DHCP servers (not sure if that's possible; I'm pulling examples out of thin air.)

That's currently not possible. But you can let the RADIUS server assign the virtual IP directly.

@DocMAX
Copy link

DocMAX commented Jan 29, 2025

dhcp {
    identity_lease = yes
    load = yes
    force_server_address = yes
    server = 192.168.1.255
}

Patch from strongswan/strongswan@a7341cf doesn't work with this config.

@tobiasbrunner
Copy link

Do you actually have the patch applied correctly? And please post the log.

@DocMAX
Copy link

DocMAX commented Jan 29, 2025

Applying /home/docmax/openwrt/sources/snapshot/feeds/packages/net/strongswan/patches/0901-uci-verbatim-patch-from-openwrt-package-sources.patch using plaintext:
patching file src/libcharon/plugins/uci/uci_parser.c

Applying /home/docmax/openwrt/sources/snapshot/feeds/packages/net/strongswan/patches/0903-updown-Call-sbin-hotplug-call-ipsec-1-in-updown-scri.patch using plaintext:
patching file src/_updown/_updown.in

Applying /home/docmax/openwrt/sources/snapshot/feeds/packages/net/strongswan/patches/0904-gmpdh-Plugin-that-implements-gmp-DH-functions-in-an-.patch using plaintext:
patching file configure.ac
patching file src/libstrongswan/Makefile.am
patching file src/libstrongswan/plugins/gmpdh/Makefile.am
patching file src/libstrongswan/plugins/gmpdh/gmpdh_plugin.c
patching file src/libstrongswan/plugins/gmpdh/gmpdh_plugin.h

Applying /home/docmax/openwrt/sources/snapshot/feeds/packages/net/strongswan/patches/0905-dhcp.patch using plaintext:
patching file src/libcharon/network/pf_handler.c
touch /home/docmax/openwrt/sources/snapshot/build_dir/target-mips_24kc_musl/strongswan-5.9.14/.prepared_e9647582c88dfa58f8fddbb0c5df408e_6664517399ebbbc92a37c5bb081b5c53
rm -f /home/docmax/openwrt/sources/snapshot/build_dir/target-mips_24kc_musl/strongswan-5.9.14/.configured_*
rm -f /home/docmax/openwrt/sources/snapshot/staging_dir/target-mips_24kc_musl/stamp/.strongswan_installed
(cd /home/docmax/openwrt/sources/snapshot/build_dir/target-mips_24kc_musl/strongswan-5.9.14; rm -f aclocal.m4; if [ -f ./configure.ac ] || [ -f ./configure.in ]; then [ -d ./autom4te.cache ] && rm -rf ./a
utom4te.cache; [ -e ./config.rpath ] || ln -s /home/docmax/openwrt/sources/snapshot/scripts/config.rpath ./config.rpath; touch NEWS AUTHORS COPYING ABOUT-NLS ChangeLog; AUTOM4TE=/home/docmax/openwrt/sourc
es/snapshot/staging_dir/host/bin/autom4te AUTOCONF=/home/docmax/openwrt/sources/snapshot/staging_dir/host/bin/autoconf AUTOMAKE=/home/docmax/openwrt/sources/snapshot/staging_dir/host/bin/automake ACLOCAL=
/home/docmax/openwrt/sources/snapshot/staging_dir/host/bin/aclocal AUTOHEADER=/home/docmax/openwrt/sources/snapshot/staging_dir/host/bin/autoheader LIBTOOLIZE=/home/docmax/openwrt/sources/snapshot/staging
_dir/host/bin/libtoolize LIBTOOL=/home/docmax/openwrt/sources/snapshot/staging_dir/host/bin/libtool M4=/home/docmax/openwrt/sources/snapshot/staging_dir/host/bin/m4 AUTOPOINT=true GTKDOCIZE=true LIBTOOLIZ
E='/home/docmax/openwrt/sources/snapshot/staging_dir/host/bin/libtoolize --install' /home/docmax/openwrt/sources/snapshot/staging_dir/host/bin/autoreconf -v -f -i -B /home/docmax/openwrt/sources/snapshot/
staging_dir/host/share/aclocal -I /home/docmax/openwrt/sources/snapshot/staging_dir/target-mips_24kc_musl/host/share/aclocal -I /home/docmax/openwrt/sources/snapshot/staging_dir/hostpkg/share/aclocal -I /
home/docmax/openwrt/sources/snapshot/staging_dir/target-mips_24kc_musl/usr/share/aclocal -I m4 -I . . || true; fi; );
OpenWrt-libtoolize: putting auxiliary files in '.'.
OpenWrt-libtoolize: copying file './config.guess'
OpenWrt-libtoolize: copying file './config.sub'
OpenWrt-libtoolize: copying file './install-sh'

This is the building process for OpenWrt. "patching file src/libcharon/network/pf_handler.c" returns no error so i think its applied correctly. patch file is https://github.com/strongswan/strongswan/commit/a7341cf23bc2fd01af7eff729894ff2a7014263b.patch

And the ipsec log is:

Jan 29 19:34:02 router ipsec: 09[NET] received packet: from 109.42.112.171[11000] to 95.119.147.36[500] (948 bytes)
Jan 29 19:34:02 router ipsec: 09[ENC] parsed IKE_SA_INIT request 0 [ SA KE No N(NATD_S_IP) N(NATD_D_IP) N(FRAG_SUP) N(HASH_ALG) N(REDIR_SUP) ]
Jan 29 19:34:02 router ipsec: 09[IKE] 109.42.112.171 is initiating an IKE_SA
Jan 29 19:34:02 router ipsec: 09[CFG] selected proposal: IKE:AES_CBC_256/HMAC_SHA2_384_192/PRF_HMAC_SHA2_384/MODP_3072
Jan 29 19:34:02 router ipsec: 09[IKE] remote host is behind NAT
Jan 29 19:34:02 router ipsec: 09[IKE] DH group ECP_256 unacceptable, requesting MODP_3072
Jan 29 19:34:02 router ipsec: 09[ENC] generating IKE_SA_INIT response 0 [ N(INVAL_KE) ]
Jan 29 19:34:02 router ipsec: 09[NET] sending packet: from 95.119.147.36[500] to 109.42.112.171[11000] (38 bytes)
Jan 29 19:34:02 router ipsec: 01[NET] received packet: from 109.42.112.171[11000] to 95.119.147.36[500] (1268 bytes)
Jan 29 19:34:02 router ipsec: 01[ENC] parsed IKE_SA_INIT request 0 [ SA KE No N(NATD_S_IP) N(NATD_D_IP) N(FRAG_SUP) N(HASH_ALG) N(REDIR_SUP) ]
Jan 29 19:34:02 router ipsec: 01[IKE] 109.42.112.171 is initiating an IKE_SA
Jan 29 19:34:02 router ipsec: 01[CFG] selected proposal: IKE:AES_CBC_256/HMAC_SHA2_384_192/PRF_HMAC_SHA2_384/MODP_3072
Jan 29 19:34:02 router ipsec: 01[IKE] remote host is behind NAT
Jan 29 19:34:02 router ipsec: 01[ENC] generating IKE_SA_INIT response 0 [ SA KE No N(NATD_S_IP) N(NATD_D_IP) N(FRAG_SUP) N(HASH_ALG) N(CHDLESS_SUP) N(MULT_AUTH) ]
Jan 29 19:34:02 router ipsec: 01[NET] sending packet: from 95.119.147.36[500] to 109.42.112.171[11000] (600 bytes)
Jan 29 19:34:03 router ipsec: 15[NET] received packet: from 109.42.112.171[3326] to 95.119.147.36[4500] (1356 bytes)
Jan 29 19:34:03 router ipsec: 15[ENC] parsed IKE_AUTH request 1 [ EF(1/3) ]
Jan 29 19:34:03 router ipsec: 15[ENC] received fragment #1 of 3, waiting for complete IKE message
Jan 29 19:34:03 router ipsec: 07[NET] received packet: from 109.42.112.171[3326] to 95.119.147.36[4500] (1356 bytes)
Jan 29 19:34:03 router ipsec: 07[ENC] parsed IKE_AUTH request 1 [ EF(2/3) ]
Jan 29 19:34:03 router ipsec: 07[ENC] received fragment #2 of 3, waiting for complete IKE message
Jan 29 19:34:03 router ipsec: 10[NET] received packet: from 109.42.112.171[3326] to 95.119.147.36[4500] (588 bytes)
Jan 29 19:34:03 router ipsec: 10[ENC] parsed IKE_AUTH request 1 [ EF(3/3) ]
Jan 29 19:34:03 router ipsec: 10[ENC] received fragment #3 of 3, reassembled fragmented IKE message (3144 bytes)
Jan 29 19:34:03 router ipsec: 10[ENC] parsed IKE_AUTH request 1 [ IDi N(INIT_CONTACT) CERTREQ CPRQ(ADDR ADDR6 DNS DNS6) SA TSi TSr N(MOBIKE_SUP) N(NO_ADD_ADDR) N(MULT_AUTH) N(EAP_ONLY) N(MSG_ID_SYN_SUP) ]
Jan 29 19:34:03 router ipsec: 10[IKE] received 135 cert requests for an unknown ca
Jan 29 19:34:03 router ipsec: 10[CFG] looking for peer configs matching 95.119.147.36[%any]...109.42.112.171[jellystar]
Jan 29 19:34:03 router ipsec: 10[CFG] selected peer config 'ikev2-eap-mschapv2'
Jan 29 19:34:03 router ipsec: 10[IKE] initiating EAP_IDENTITY method (id 0x00)
Jan 29 19:34:03 router ipsec: 10[IKE] peer supports MOBIKE
Jan 29 19:34:03 router ipsec: 10[IKE] authentication of 'mydomain.com' (myself) with ECDSA_WITH_SHA256_DER successful
Jan 29 19:34:03 router ipsec: 10[IKE] sending end entity cert "CN=mydomain.com"
Jan 29 19:34:03 router ipsec: 10[IKE] sending issuer cert "C=US, O=Let's Encrypt, CN=E6"
Jan 29 19:34:03 router ipsec: 10[ENC] generating IKE_AUTH response 1 [ IDr CERT CERT AUTH EAP/REQ/ID ]
Jan 29 19:34:03 router ipsec: 10[ENC] splitting IKE message (2232 bytes) into 2 fragments
Jan 29 19:34:03 router ipsec: 10[ENC] generating IKE_AUTH response 1 [ EF(1/2) ]
Jan 29 19:34:03 router ipsec: 10[ENC] generating IKE_AUTH response 1 [ EF(2/2) ]
Jan 29 19:34:03 router ipsec: 10[NET] sending packet: from 95.119.147.36[4500] to 109.42.112.171[3326] (1244 bytes)
Jan 29 19:34:03 router ipsec: 10[NET] sending packet: from 95.119.147.36[4500] to 109.42.112.171[3326] (1084 bytes)
Jan 29 19:34:03 router ipsec: 07[NET] received packet: from 109.42.112.171[3326] to 95.119.147.36[4500] (104 bytes)
Jan 29 19:34:03 router ipsec: 07[ENC] parsed IKE_AUTH request 2 [ EAP/RES/ID ]
Jan 29 19:34:03 router ipsec: 07[IKE] received EAP identity 'jellystar'
Jan 29 19:34:03 router ipsec: 07[IKE] initiating EAP_MSCHAPV2 method (id 0xED)
Jan 29 19:34:03 router ipsec: 07[ENC] generating IKE_AUTH response 2 [ EAP/REQ/MSCHAPV2 ]
Jan 29 19:34:03 router ipsec: 07[NET] sending packet: from 95.119.147.36[4500] to 109.42.112.171[3326] (120 bytes)
Jan 29 19:34:04 router ipsec: 09[NET] received packet: from 109.42.112.171[3326] to 95.119.147.36[4500] (152 bytes)
Jan 29 19:34:04 router ipsec: 09[ENC] parsed IKE_AUTH request 3 [ EAP/RES/MSCHAPV2 ]
Jan 29 19:34:04 router ipsec: 09[ENC] generating IKE_AUTH response 3 [ EAP/REQ/MSCHAPV2 ]
Jan 29 19:34:04 router ipsec: 09[NET] sending packet: from 95.119.147.36[4500] to 109.42.112.171[3326] (152 bytes)
Jan 29 19:34:04 router ipsec: 07[NET] received packet: from 109.42.112.171[3326] to 95.119.147.36[4500] (88 bytes)
Jan 29 19:34:04 router ipsec: 07[ENC] parsed IKE_AUTH request 4 [ EAP/RES/MSCHAPV2 ]
Jan 29 19:34:04 router ipsec: 07[IKE] EAP method EAP_MSCHAPV2 succeeded, MSK established
Jan 29 19:34:04 router ipsec: 07[ENC] generating IKE_AUTH response 4 [ EAP/SUCC ]
Jan 29 19:34:04 router ipsec: 07[NET] sending packet: from 95.119.147.36[4500] to 109.42.112.171[3326] (88 bytes)
Jan 29 19:34:04 router ipsec: 06[NET] received packet: from 109.42.112.171[3326] to 95.119.147.36[4500] (136 bytes)
Jan 29 19:34:04 router ipsec: 06[ENC] parsed IKE_AUTH request 5 [ AUTH ]
Jan 29 19:34:04 router ipsec: 06[IKE] authentication of 'jellystar' with EAP successful
Jan 29 19:34:04 router ipsec: 06[IKE] authentication of 'mydomain.com' (myself) with EAP
Jan 29 19:34:04 router ipsec: 06[IKE] peer requested virtual IP %any
Jan 29 19:34:04 router ipsec: 06[CFG] sending DHCP DISCOVER for 7a:a7:0c:c0:aa:64 to 192.168.1.255
Jan 29 19:34:05 router ipsec: 06[CFG] sending DHCP DISCOVER for 7a:a7:0c:c0:aa:64 to 192.168.1.255
Jan 29 19:34:06 router ipsec: 16[MGR] ignoring request with ID 5, already processing
Jan 29 19:34:07 router ipsec: 06[CFG] sending DHCP DISCOVER for 7a:a7:0c:c0:aa:64 to 192.168.1.255
Jan 29 19:34:08 router ipsec: 16[MGR] ignoring request with ID 5, already processing
Jan 29 19:34:10 router ipsec: 06[CFG] sending DHCP DISCOVER for 7a:a7:0c:c0:aa:64 to 192.168.1.255
Jan 29 19:34:13 router ipsec: 02[MGR] ignoring request with ID 5, already processing
Jan 29 19:34:14 router ipsec: 06[CFG] sending DHCP DISCOVER for 7a:a7:0c:c0:aa:64 to 192.168.1.255
Jan 29 19:34:19 router ipsec: 06[CFG] DHCP DISCOVER timed out
Jan 29 19:34:19 router ipsec: 06[IKE] no virtual IP found for %any requested by 'jellystar'
Jan 29 19:34:19 router ipsec: 06[IKE] peer requested virtual IP %any6
Jan 29 19:34:19 router ipsec: 06[IKE] no virtual IP found for %any6 requested by 'jellystar'
Jan 29 19:34:19 router ipsec: 06[IKE] no virtual IP found, sending INTERNAL_ADDRESS_FAILURE
Jan 29 19:34:19 router ipsec: 06[IKE] IKE_SA ikev2-eap-mschapv2[4] established between 95.119.147.36[mydomain.com]...109.42.112.171[jellystar]
Jan 29 19:34:19 router ipsec: 06[IKE] scheduling rekeying in 13674s
Jan 29 19:34:19 router ipsec: 06[IKE] maximum IKE_SA lifetime 15114s
Jan 29 19:34:19 router ipsec: 06[IKE] configuration payload negotiation failed, no CHILD_SA built
Jan 29 19:34:19 router ipsec: 06[IKE] failed to establish CHILD_SA, keeping IKE_SA
Jan 29 19:34:19 router ipsec: 06[ENC] generating IKE_AUTH response 5 [ AUTH N(MOBIKE_SUP) N(ADD_4_ADDR) N(INT_ADDR_FAIL) ]
Jan 29 19:34:19 router ipsec: 06[NET] sending packet: from 95.119.147.36[4500] to 109.42.112.171[3326] (168 bytes)
Jan 29 19:34:24 router ipsec: 06[NET] received packet: from 109.42.112.171[26376] to 95.119.147.36[500] (948 bytes)
Jan 29 19:34:24 router ipsec: 06[ENC] parsed IKE_SA_INIT request 0 [ SA KE No N(NATD_S_IP) N(NATD_D_IP) N(FRAG_SUP) N(HASH_ALG) N(REDIR_SUP) ]
Jan 29 19:34:24 router ipsec: 06[IKE] 109.42.112.171 is initiating an IKE_SA
Jan 29 19:34:24 router ipsec: 06[CFG] selected proposal: IKE:AES_CBC_256/HMAC_SHA2_384_192/PRF_HMAC_SHA2_384/MODP_3072
Jan 29 19:34:24 router ipsec: 06[IKE] remote host is behind NAT
Jan 29 19:34:24 router ipsec: 06[IKE] DH group ECP_256 unacceptable, requesting MODP_3072
Jan 29 19:34:24 router ipsec: 06[ENC] generating IKE_SA_INIT response 0 [ N(INVAL_KE) ]
Jan 29 19:34:24 router ipsec: 06[NET] sending packet: from 95.119.147.36[500] to 109.42.112.171[26376] (38 bytes)
Jan 29 19:34:24 router ipsec: 02[NET] received packet: from 109.42.112.171[26376] to 95.119.147.36[500] (1268 bytes)
Jan 29 19:34:24 router ipsec: 02[ENC] parsed IKE_SA_INIT request 0 [ SA KE No N(NATD_S_IP) N(NATD_D_IP) N(FRAG_SUP) N(HASH_ALG) N(REDIR_SUP) ]
Jan 29 19:34:24 router ipsec: 02[IKE] 109.42.112.171 is initiating an IKE_SA
Jan 29 19:34:24 router ipsec: 02[CFG] selected proposal: IKE:AES_CBC_256/HMAC_SHA2_384_192/PRF_HMAC_SHA2_384/MODP_3072

And dnsmasq log is:

Jan 29 20:34:07 router dnsmasq-dhcp[9377]: DHCPDISCOVER(br-lan) 7a:a7:0c:c0:aa:64
Jan 29 20:34:07 router dnsmasq-dhcp[9377]: DHCPOFFER(br-lan) 192.168.1.201 7a:a7:0c:c0:aa:64
Jan 29 20:34:07 router dnsmasq-dhcp[9377]: DHCPDISCOVER(br-lan) 7a:a7:0c:c0:aa:64
Jan 29 20:34:07 router dnsmasq-dhcp[9377]: DHCPOFFER(br-lan) 192.168.1.201 7a:a7:0c:c0:aa:64
Jan 29 20:34:07 router dnsmasq-dhcp[9377]: DHCPDISCOVER(br-lan) 7a:a7:0c:c0:aa:64
Jan 29 20:34:07 router dnsmasq-dhcp[9377]: DHCPOFFER(br-lan) 192.168.1.201 7a:a7:0c:c0:aa:64
Jan 29 20:34:10 router dnsmasq-dhcp[9377]: DHCPDISCOVER(br-lan) 7a:a7:0c:c0:aa:64
Jan 29 20:34:10 router dnsmasq-dhcp[9377]: DHCPOFFER(br-lan) 192.168.1.201 7a:a7:0c:c0:aa:64
Jan 29 20:34:14 router dnsmasq-dhcp[9377]: DHCPDISCOVER(br-lan) 7a:a7:0c:c0:aa:64
Jan 29 20:34:14 router dnsmasq-dhcp[9377]: DHCPOFFER(br-lan) 192.168.1.201 7a:a7:0c:c0:aa:64

@tobiasbrunner
Copy link

Thanks for the details. As far as I can tell, this looks fine. But just to clarify, the settings you posted above are the only place where the dhcp plugin is configured? For instance, there is nothing in strongswan.conf itself or any of the other files that are included by it? If so, the question is what difference there is to @lowjoel's setup that could affect this. Maybe some firewall rules? Or some sysctl settings?

@DocMAX
Copy link

DocMAX commented Jan 30, 2025

Like @lowjoel my config used to work on v23.05. Nothing changed here. I try to tinker a bit more but right now no idea what to do now.

strongswan.conf looks like this:

charon {
    load_modular = yes
    dns1 = 10.0.0.1
    nbns1 = 10.0.0.1
    plugins {
        include strongswan.d/charon/*.conf
    }
}

include strongswan.d/*.conf
include /var/ipsec/strongswan.conf

@lowjoel
Copy link
Contributor Author

lowjoel commented Jan 30, 2025

Another thing to check is to bump the logging for NET because that's what the pf_handler messages will appear as AFAIK. I also added logging in the patches I made to make sure the right binary is loaded. To be clear, pf_handler changes need an updated libcharon.so.

@DocMAX
Copy link

DocMAX commented Jan 30, 2025

OK, it works. My fault. I didn't flash the correct rom file. Applied the last 2 patches of the "dhcp-receive" strongswan branch.
So not sure which one is responsible for the fix.
So please add those patches officially to the openwrt strongswan package.
Thank you.

@lowjoel
Copy link
Contributor Author

lowjoel commented Jan 30, 2025

@tobiasbrunner, since @DocMAX has tested it and the second patch looks reasonable to me (I haven't had a chance to test), perhaps you can merge it first and then I can open PRs to queue these for OpenWrt 24.10. Or else @DocMAX can open the PRs too, that's fine.

@tobiasbrunner
Copy link

OK, it works.

Great, thanks for testing.

Applied the last 2 patches of the "dhcp-receive" strongswan branch.
So not sure which one is responsible for the fix.

Only the first one is necessary to fix the issue (i.e. to make the existing configuration work again).

The other did actually not have an effect at all. Because some testing showed that binding packet sockets via SO_BINDTODEVICE really doesn't work at all. So the patch I posted here before, which uses bind(), is necessary to enforce any binding on the receive socket. I've now merged that and the other commits (strongswan/strongswan@c2e5c00, or just the fix strongswan/strongswan@abbf9d2).

lowjoel added a commit to lowjoel/openwrt-packages that referenced this issue Jan 31, 2025
Fixes openwrt#25801. Adds the following commits to fix DHCP behaviour on
Strongswan 5.9.14:

 - strongswan/strongswan@abbf9d2
 - strongswan/strongswan@00d8c36
 - strongswan/strongswan@a50ed30

Signed-off-by: Joel Low <joel@joelsplace.sg>
@lowjoel lowjoel linked a pull request Jan 31, 2025 that will close this issue
lowjoel added a commit to lowjoel/openwrt-packages that referenced this issue Feb 1, 2025
Fixes openwrt#25801. Adds the following commits to fix DHCP behaviour on
Strongswan 5.9.14:

 - strongswan/strongswan@abbf9d2
 - strongswan/strongswan@00d8c36
 - strongswan/strongswan@a50ed30

Signed-off-by: Joel Low <joel@joelsplace.sg>
lowjoel added a commit to lowjoel/openwrt-packages that referenced this issue Feb 1, 2025
Fixes openwrt#25801. Adds the following commits to fix DHCP behaviour on
Strongswan 5.9.14:

 - strongswan/strongswan@abbf9d2
 - strongswan/strongswan@00d8c36
 - strongswan/strongswan@a50ed30

Signed-off-by: Joel Low <joel@joelsplace.sg>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants