[SARA-R4] MQTT publisher not working - Impossible to connect to broker #21764

Closed
gpaquet85 opened this issue Jan 8, 2020 · 26 comments · Fixed by #22149
Labels
bug The issue is a bug, or the PR is fixing a bug priority: low Low impact/importance bug

Comments

@gpaquet85
Contributor

Describe the bug
I see an issue when I try to use the MQTT publisher sample through the SARA-R4 modem driver: it is impossible to connect to the MQTT server/broker.
I checked that the network connection was OK.

To Reproduce
Compiling https://github.com/zephyrproject-rtos/zephyr/tree/master/samples/net/mqtt_publisher:

  1. mkdir build; cd build
  2. cmake -DBOARD=boards\nrf52840_pca10059 -DSHIELD=sparkfun_sara_r4 -DCONF_FILE="prj.conf overlay-wncm14a2a.conf" (I am not sure of this configuration because I did this with our own board)
  3. make
  4. See error

Expected behavior
When trying to connect to the MQTT broker, I always get the following error:
MQTT connect failed -111
So it is impossible to communicate.

Impact
No MQTT communication on my board, and MQTT is the protocol our customer chose for our needs.

Screenshots or console output
image

Environment (please complete the following information):

  • OS: Linux
  • Toolchain: ARM-GCC
  • Tag: v2.0

Additional context
I see the following traces when I sniff the UART bus:
AT+USOCR=6
+USOCR: 0

OK
AT+USOCO=0,"83.166.154.87",1883
OK
AT+USOWR=0,30
@
+USOWR: 0,30

OK
AT+USOWR=0,30
@
+USOWR: 0,30

OK
AT+USOWR=0,30
@
+USOWR: 0,30

OK

+UUSORD: 0,4
AT+USOWR=0,30
@
+CME ERROR: 3
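
A note on reading the trace above: as far as I understand the u-blox AT command set, "+UUSORD: 0,4" is an unsolicited notification that 4 bytes are waiting to be read on socket 0. Below is a minimal sketch of parsing such a line. parse_uusord is a hypothetical helper written for illustration, not a function from the Zephyr driver.

```c
#include <stdio.h>

/*
 * Hypothetical helper (not from the Zephyr SARA-R4 driver): parse a
 * "+UUSORD: <socket>,<length>" notification line.
 * Returns 0 on success, -1 if the line has a different format.
 */
static int parse_uusord(const char *line, int *sock_id, int *pending)
{
	if (sscanf(line, "+UUSORD: %d,%d", sock_id, pending) != 2) {
		return -1;
	}

	return 0;
}
```

With the line from the trace, this yields socket 0 and 4 pending bytes, while a line such as "+CME ERROR: 3" is rejected.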

@gpaquet85 gpaquet85 added the bug The issue is a bug, or the PR is fixing a bug label Jan 8, 2020
@jukkar jukkar added the priority: low Low impact/importance bug label Jan 8, 2020
@jukkar
Member

jukkar commented Jan 8, 2020

If I understood the report on the mailing list correctly, the system worked OK in 1.14.
See https://lists.zephyrproject.org/g/devel/message/6622 for details.

@WilliamGFish
Collaborator

Have a look at the error "NET_SOCKET_OFFLOAD must be configured for this driver"
Looks like you've not reconfigured your board after the changes implemented in v2.1.

Billy..

static const struct socket_offload modem_socket_offload = {
	.socket = offload_socket,
	.close = offload_close,
	.bind = offload_bind,
	.connect = offload_connect,
	.poll = offload_poll,
	.recv = offload_recv,
	.recvfrom = offload_recvfrom,
	.send = offload_send,
	.sendto = offload_sendto,
};

static int net_offload_dummy_get(sa_family_t family,
				 enum net_sock_type type,
				 enum net_ip_protocol ip_proto,
				 struct net_context **context)
{

	LOG_ERR("NET_SOCKET_OFFLOAD must be configured for this driver");

	return -ENOTSUP;
}

/* placeholders, until Zephyr IP stack updated to handle a NULL net_offload */
static struct net_offload modem_net_offload = {
	.get = net_offload_dummy_get,
};

@gpaquet85
Contributor Author

gpaquet85 commented Jan 8, 2020

If I understood the report on the mailing list correctly, the system worked OK in 1.14.
See https://lists.zephyrproject.org/g/devel/message/6622 for details.

You can understand it that way, because I previously worked on Zephyr 1.14.1 and cherry-picked commits to get the UBLOX-SARA-R4 driver. This worked pretty well, and I did MQTT communication with every MQTT broker I tried (Orange-liveobject, our own mosquitto broker on an infomaniak server, mosquitto.org, ...).

@gpaquet85
Contributor Author

Have a look at the error "NET_SOCKET_OFFLOAD must be configured for this driver"
Looks like you've not reconfigured your board after the changes implemented in v2.1.

Billy..

From my point of view, LOG_ERR("NET_SOCKET_OFFLOAD must be configured for this driver"); is not really important, because this should only create a UDP socket, which we do not need.
I am trying to revert the latest commits on the SARA-R4 driver to see if it works on Zephyr 2.0.
And as I mentioned above, I am pretty sure that +UUSORD management is not the same on Zephyr 2.0. Maybe I missed something ;)

@gpaquet85
Contributor Author

@WilliamGFish , @jukkar , @mike-scott I just tried to revert the 2 following commits:

It is now working, so I am pretty sure that the regression comes from this one (ebf6520#diff-cb3c7d77935fa5df6ccb74a1c5e050ba)
Or maybe I missed something?

Thanks again for your help on this case.

@WilliamGFish
Collaborator

Oh no; now I'm looking at this with the U201 version of a Boron board and have found a few odd errors.

The change was the move to MODEM_CONTEXT, which will affect all AT-command-driven devices.

Billy..

@WilliamGFish
Collaborator

At a quick glance, it looks like what you have reverted was the change to SOCKET_OFFLOAD back to NET_OFFLOAD, which had been working previously.

@gpaquet85
Contributor Author

At a quick glance, it looks like what you have reverted was the change to SOCKET_OFFLOAD back to NET_OFFLOAD, which had been working previously.

I also think so.

@WilliamGFish
Collaborator

I put some DBG hex-dump calls in the modem driver and am concerned with the number of rapid calls to the MQTT server, as seen in your message board post.

This appears to be the potential issue, or it relates to the polling function that the MQTT client relies on to understand the server connection messages.

More digging tomorrow.

@WilliamGFish
Collaborator

@mike-scott
Big question: Why did we move away from net offload and using the module IP stack to using socket offloading? Net offloading is the basis for the modem/wifi drivers that are currently used in Zephyr, barring the SimpleLink one.

Would it not make sense to revert to using net offloading (MODEM_RECEIVER) rather than socket offloading (MODEM_CONTEXT) for future portability? The creation of a generic AT modem context was meant to minimise duplicate code and future maintenance, but we appear to be having issues with basic connect tests.

@ghost

ghost commented Jan 20, 2020

DNS isn't working either on the v2.0 branch, maybe the same reason?

[00:00:44.886,840] <err> modem_ublox_sara_r4: NET_SOCKET_OFFLOAD must be configured for this driver
[00:00:44.886,840] <dbg> net_dns_resolve.dns_resolve_init: (0x20004e98): Cannot get net_context (-35)
[00:00:44.886,840] <wrn> net_dns_resolve: Cannot initialize DNS resolver (-35)

An alternative to offloading for these modems would be +CMUX multiplexing and PPP. The current solution works well with UDP but not for TCP, because the modem handles the TCP acks and checksums and it doesn't know about bytes lost on the modem UART.

@ghost

ghost commented Jan 20, 2020

Sorry for the noise; the DNS calls are just missing in modem_socket_offload. Will try to fix.

@gpaquet85
Contributor Author

Sorry for the noise; the DNS calls are just missing in modem_socket_offload. Will try to fix.

Ok thanks for this

@mike-scott
Contributor

@gpaquet85 @weinholtendian @WilliamGFish
Hello all,
I had some time to finally sit down and look at this issue. There are several problems that I found. I've submitted an initial PR here:
#22149

@mike-scott
Contributor

@WilliamGFish I don't think we need to revert back to modem receiver. I think there's a few bugs that need to be ironed out.

@WilliamGFish
Collaborator

@WilliamGFish I don't think we need to revert back to modem receiver. I think there's a few bugs that need to be ironed out.

@mike-scott I'm confused as to the move to SOCKET_OFFLOAD from NET_OFFLOAD. To me it seems better to standardise on the NET_OFFLOAD model, as we are handing off the networking to be managed by the 'modem'.

@mike-scott
Contributor

@mike-scott I'm confused as to the move to SOCKET_OFFLOAD from NET_OFFLOAD. To me it seems better to standardise on the NET_OFFLOAD model, as we are handing off the networking to be managed by the 'modem'.

The SOCKET_OFFLOAD offers a much richer API for offloaded device functionality in my opinion. NET_OFFLOAD only supports the most basic network functions:

  • get (socket), bind, listen, connect, accept, send, recv, put (socket)

Whereas SOCKET_OFFLOAD offers all of these in a POSIX-like API:

  • with the addition of poll, setsockopt / getsockopt (think TLS setup), getaddrinfo and freeaddrinfo for offloaded DNS support, etc.

In addition to the extra APIs, the memory model for the socket layer is easier to use and sometimes less memory intensive because you can copy buffers when the user calls recv.
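
To illustrate the memory-model point, here is a minimal sketch of a socket-style ops table whose recv copies straight into the buffer the caller passes in. Everything in it is hypothetical: struct demo_socket_ops, demo_connect and demo_recv are simplified stand-ins invented for this example and do not match Zephyr's real struct socket_offload or its signatures.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical, simplified ops table. The field names and signatures are
 * illustrative only and do NOT match Zephyr's struct socket_offload.
 */
struct demo_socket_ops {
	int (*connect)(int sock, const char *host, int port);
	int (*recv)(int sock, void *buf, size_t max_len);
};

/* Stand-in for bytes the driver has already read from the modem UART. */
static const char demo_rx[] = "hello";
static size_t demo_rx_off;

static int demo_connect(int sock, const char *host, int port)
{
	(void)sock;
	(void)host;
	(void)port;

	return 0; /* pretend the modem accepted the connection */
}

/* Copy into the caller's buffer at recv() time; no extra packet object. */
static int demo_recv(int sock, void *buf, size_t max_len)
{
	size_t left = sizeof(demo_rx) - 1 - demo_rx_off;
	size_t n = left < max_len ? left : max_len;

	(void)sock;
	memcpy(buf, demo_rx + demo_rx_off, n);
	demo_rx_off += n;

	return (int)n;
}

static const struct demo_socket_ops demo_ops = {
	.connect = demo_connect,
	.recv = demo_recv,
};
```

A caller would go through the table, for example demo_ops.recv(0, buf, sizeof(buf)), and gets back the number of bytes copied, with 0 once the stand-in data is exhausted.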

@ghost

ghost commented Jan 24, 2020

I have a branch with various fixes for the SARA driver, including a basic getaddrinfo implementation that I have checked works with the LwM2M stack and an R412M modem (though offloaded DNS is much less reliable on NB-IoT AFAICT): https://github.com/endiantechnologies/zephyr/commits/v2.0-endian/sara

The changes should be verified with SARA R410M and SARA U201.

@mike-scott Does SOCKET_OFFLOAD mean that we can't use Zephyr's mbedTLS anymore, that we should ask the modem to configure TLS for us? If that is the case then I think we have a problem, because AFAICT the modem doesn't support DTLS, which means no security for LwM2M.

@mike-scott
Contributor

@mike-scott Does SOCKET_OFFLOAD mean that we can't use Zephyr's mbedTLS anymore, that we should ask the modem to configure TLS for us? If that is the case then I think we have a problem, because AFAICT the modem doesn't support DTLS, which means no security for LwM2M.

Worst case would be using Zephyr's TLS implementation via the setsockopt layers.

The other idea I had was to let the modem driver implementation choose the better of NET_OFFLOAD or NET_SOCKETS_OFFLOAD.

@mike-scott
Contributor

@weinholtendian I also have a DNS patch for SARA-R4 which I never pushed upstream due to the need for newer firmware on the R410M :/
mike-scott@bfccf42

That's the old version. I've been play testing it with non-LwM2M use cases and realized that it needs a bit of patching up.

@mike-scott
Contributor

@weinholtendian If you don't mind I'd like to pull in some of your patches and add them to this: #22149

@ghost

ghost commented Jan 27, 2020

@mike-scott Go right ahead.

BTW, do we deal anywhere with the 1024 byte (512 hex) payload limit per AT command? MQTT is TCP and is more sensitive to this (although it is probably not what happened here). Suppose the modem says "+UUSORD: 0,1500". Then we can use "AT+USORD=0,1024" followed by "AT+USORD=0,476" when not using hex mode. I had a quick look and didn't see anything to handle this.
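
A rough sketch of the read splitting I have in mind, assuming the 1024 byte limit for non-hex mode. next_read_len and build_usord_cmds are hypothetical names for illustration, not existing driver functions:

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

#define USORD_MAX_CHUNK 1024 /* assumed per-command limit, non-hex mode */
#define USORD_CMD_LEN   32   /* room for "AT+USORD=<socket>,<length>" */

/* Hypothetical helper: size of the next read, given the bytes pending. */
static size_t next_read_len(size_t pending)
{
	return pending > USORD_MAX_CHUNK ? USORD_MAX_CHUNK : pending;
}

/*
 * Hypothetical helper: fill cmds[] with the AT+USORD commands needed to
 * drain `pending` bytes from socket `sock`. Returns the number of commands
 * written (at most max_cmds).
 */
static int build_usord_cmds(int sock, size_t pending,
			    char cmds[][USORD_CMD_LEN], int max_cmds)
{
	int count = 0;

	while (pending > 0 && count < max_cmds) {
		size_t len = next_read_len(pending);

		snprintf(cmds[count], USORD_CMD_LEN, "AT+USORD=%d,%zu",
			 sock, len);
		pending -= len;
		count++;
	}

	return count;
}
```

For 1500 pending bytes on socket 0 this produces the two commands from the example above. A real driver would issue one command at a time and wait for each response rather than building the list up front.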

@hwilmers
Contributor

Hi, I issued a PR that deals with the 1024 byte (512 hex) limit when sending, and that reads the number of bytes sent as reported by the modem:
#22232

It shouldn't be difficult to do something similar when receiving.
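
The general idea, as a simplified sketch rather than the code from the PR: cap each write at the limit and advance by the byte count the modem reports back, not by the count requested. send_all, usowr_fn and fake_modem_write are hypothetical names used only for this illustration.

```c
#include <stddef.h>

#define USOWR_MAX_CHUNK 1024 /* assumed per-command limit, non-hex mode */

/*
 * Hypothetical transport callback: issues one AT+USOWR for up to `len`
 * bytes and returns the length the modem reported in its
 * "+USOWR: <socket>,<length>" response, or a negative value on error.
 */
typedef int (*usowr_fn)(int sock, const unsigned char *data, size_t len);

/* Send `len` bytes in chunks, advancing by what the modem accepted. */
static int send_all(int sock, const unsigned char *data, size_t len,
		    usowr_fn write_chunk)
{
	size_t sent = 0;

	while (sent < len) {
		size_t chunk = len - sent;
		int ret;

		if (chunk > USOWR_MAX_CHUNK) {
			chunk = USOWR_MAX_CHUNK;
		}

		ret = write_chunk(sock, data + sent, chunk);
		if (ret < 0) {
			return ret; /* propagate the modem error */
		}
		if (ret == 0) {
			break; /* nothing accepted; avoid spinning forever */
		}

		sent += (size_t)ret;
	}

	return (int)sent;
}

/* Stand-in modem that accepts at most 600 bytes per command. */
static int fake_modem_write(int sock, const unsigned char *data, size_t len)
{
	(void)sock;
	(void)data;

	return (int)(len > 600 ? 600 : len);
}
```

With the stand-in modem, a 1500 byte buffer goes out as 600 + 600 + 300 bytes, even though the first request asks for 1024.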

@mike-scott
Contributor

mike-scott commented Jan 27, 2020

Hi, I issued a PR that deals with the 1024 byte (512 hex) limit when sending, and that reads the number of bytes sent as reported by the modem:
#22232

It shouldn't be difficult to do something similar when receiving.

@hwilmers Thank you for the patches! I'm testing a recv() patch locally.

May I cherry-pick your patches into my "fixes" PR? #22149

@hwilmers
Contributor

@mike-scott of course - just go ahead and cherry pick.

@mike-scott
Contributor

@gpaquet85 @WilliamGFish @weinholtendian @hwilmers
I've finished the full set of patches with fixes for SARA-R4:
#22149

This includes a few patches from @weinholtendian @hwilmers.

And a few resulting patches for MQTT:
#22249

Let me know how it goes.
