Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Shutdown sequence advice (possible issue?) #147

Open
shiretu opened this issue May 22, 2017 · 17 comments
Open

Shutdown sequence advice (possible issue?) #147

shiretu opened this issue May 22, 2017 · 17 comments

Comments

@shiretu
Copy link
Contributor

shiretu commented May 22, 2017

Hi,

I need to tunnel the SCTP over DTLS. For that, I am associating _pDTLS pointer to the socket like this:

	//store our DTLS, and register the socket
	usrsctp_register_address(_pDTLS);

	//initialize the SCTP connecting address
	_connectAddress.sconn_family = AF_CONN;
	_connectAddress.sconn_port = EHTONS(_port);
	_connectAddress.sconn_addr = _pDTLS;

	//try to bind the socket
	if (usrsctp_bind(_pSCTPSocket, (struct sockaddr *) &_connectAddress,
			sizeof (struct sockaddr_conn)) != 0) {
		int err = errno;
		FATAL("Unable to connect the SCTP socket: (%d) %s", err, strerror(err));
		return false;
	}

This way, when the low level SCTP conn_output static callback is called, I get the first argument to be a void * which is in fact my DTLS context. Which is perfect.

When SCTP/DTLS dies or gets closed explicitly, I do this:

	//unregister the ULP
	usrsctp_deregister_address(_pDTLS);

	//close the socket
	if (_pSCTPSocket != NULL) {
		usrsctp_set_ulpinfo(_pSCTPSocket, NULL);
		usrsctp_shutdown(_pSCTPSocket, SHUT_RDWR);
		usrsctp_close(_pSCTPSocket);
		_pSCTPSocket = NULL;
	}

Here is the issue now:
When the conn_output callback is called, I still get the address as being a corpse pointer to _pDTLS, despite the fact I have called usrsctp_deregister_address and usrsctp_set_ulpinfo with NULL. That generates a crash sometimes, because I will try to call stuff from _pDTLS in the destructor part, which is bad (calling pure virtual functions for example). I can protect against this situation, but the code is quite ugly.

Ideally, when usrsctp_deregister_address(_pDTLS); is called, all the internal structures holding a void * equal with the provided argument should be set to NULL. Basically, all pending stuff in internal queues. After usrsctp_deregister_address/usrsctp_set_ulpinfo is called, there should be absolutely no call made to the conn_output callback for that particular address, or at least the provided pointer should be NULL, so a simple if statement can protect against bad things.

@shiretu
Copy link
Contributor Author

shiretu commented May 22, 2017

More details: usrsctp_shutdown(_pSCTPSocket, SHUT_RDWR); will generate a call (on the same stack/thread) to my conn_output callback. Which is consequently trying to use _pDTLS which I have reset for good reasons (it died or was closed) by using usrsctp_deregister_address(_pDTLS); and usrsctp_set_ulpinfo(_pSCTPSocket, NULL); calls

@sancane
Copy link

sancane commented Aug 4, 2017

Hi there,
How is this issue going on? I'm experiencing the same problem described here.

sancane added a commit to sancane/openwebrtc-gst-plugins that referenced this issue Aug 4, 2017
…n cast to 'GstSctpAssociation'

This patch prevent the issue posted here regarding usrsctp library:
sctplab/usrsctp#147
@tuexen
Copy link
Member

tuexen commented Aug 4, 2017

The problem is that you are assuming that the stack does not send a packet anymore after calling usrsctp_shutdown(_pSCTPSocket, SHUT_RDWR), which is wrong. This call only triggers the graceful shutdown procedure which means that there will be at least a 3-way packet exchange.

You can trigger the association teardown procedure with this call using usrsctp_shutdown(_pSCTPSocket, SHUT_WR), wait for a notification that the association teardown is complete and then remove your DTLS stack. The SCTP association has to be turned down before you can turn down DTLS.

I hope that helps.

@tuexen
Copy link
Member

tuexen commented Aug 4, 2017

Another comment: Why are you calling usrsctp_close(_pSCTPSocket); after usrsctp_shutdown(_pSCTPSocket, SHUT_RDWR);. Only calling close() has the same effect. However, you won't be able to receive any notifications, since you have closed the socket. This is normal socket API behaviour.

@shiretu
Copy link
Contributor Author

shiretu commented Aug 4, 2017

That is not always under our control. DTLS can die (and, subsequently, be deallocated) before we even call shutdown. In other words, they have independent life cycles. What was suggested above is to artificially maintain that pointer alive, essentially tightly couple them, which is not good. It is a lot more difficult than one may think.

Also, usrsctp_deregister_address(_pDTLS); and/or usrsctp_set_ulpinfo(_pSCTPSocket, NULL); should reset all user defined pointer(s) associated with the address (or socket) for all internal structures that the library uses. This way, as explained above, when the static send is called next time, it will be called with NULL, which is perfect to know that there is no more transport, and efficiently drop the packets.

@shiretu
Copy link
Contributor Author

shiretu commented Aug 4, 2017

Want to point out that this is the same issue that kqueue is having. To put it in more generic form, all APIs offering a way to register an user pointer, should have a way to also reset that. Otherwise, complicated code is needed to maintain that association alive and kicking

@shiretu
Copy link
Contributor Author

shiretu commented Aug 4, 2017

@sancane: In my case i have solved this issue by registering an ID instead the pointer. Is fine-ish, but far from ideal, since every send mandates a binary tree search.

@sancane
Copy link

sancane commented Aug 4, 2017

@tuexen IMHO I think that leaving the SCTP library the management of the pointer life cicle is not such a good idea as long as it does not give you the option to set a destroy function to be called when SCTP does not need it anymore, it sounds me like a clumsy asynchronous API implementation, what if the shutdown procedure can not conclude successfully for any reason?, as @shiretu says, I think that the minimum one could expect of a function called usrsctp_deregister_address is that once you call it the library does not use that memory anymore.

@tuexen
Copy link
Member

tuexen commented Aug 4, 2017

@sancane usrsctp_register_address () and usrsctp_deregister_address() is being called by the lower layer to let the SCTP stack know, which addresses are available. This includes IPv4 addresses, IPv6 Addresses and addresses of the AF_CONN family to deal with stacks like SCTP/DTLS. These addresses can come and go at any time and the SCTP stack has, for examples, to keep track to which set of end-points an end-point has been bound. Therefore we keep our own list of structures and use refcounting. If the lower layer registers an address, increment it, if a lower layer deregisters it we decrement it, if an endpoint uses it, we increment it and so on. This allows the SCTP stack to deal with the lower layer registering and deregistering addresses at anytime. For the SCTP stack it is just an address (like IPv4, IPv6 or one of the AF_CONN family). The SCTP stack uses the address even if it was deregistered, which is fine in a lot of cases.

I can look into a way to make sure the SCTP does not use addresses once they are deregistered.
@sancane Would that help?

@shiretu I understand that a lower layer can go away at any time. What I'm not understanding is that you want to reliably teardown the association in that case? That requires a message exchange, which you already know is not possible anymore. I would suggest to un-gracefully shutdown the association in that case. This would still trigger the sending of an ABORT-chunk, but it does not takes several retransmissions when you already know they won't make it to the peer.

@tuexen
Copy link
Member

tuexen commented Aug 4, 2017

@shiretu When are you calling

	//unregister the ULP
	usrsctp_deregister_address(_pDTLS);

	//close the socket
	if (_pSCTPSocket != NULL) {
		usrsctp_set_ulpinfo(_pSCTPSocket, NULL);
		usrsctp_shutdown(_pSCTPSocket, SHUT_RDWR);
		usrsctp_close(_pSCTPSocket);
		_pSCTPSocket = NULL;
	}

Are calls using _pDTLS still allowed? Or is it already too late?
Calling usrsctp_set_ulpinfo() just changes the ulp_info provided to the receive_cb. So it is related to the interface to the upper layer. usrsctp_deregister_address() deals with the interface to the lower layer, which is what you want here.

@shiretu
Copy link
Contributor Author

shiretu commented Aug 4, 2017

@tuexen: no, not really. When that code is executed, the graceful shutdown is no longer an option, as the DTLS is long gone. That is just me trying (in my failed understanding) to stop SCTP to send more packets. If there is a guaranteed way of stopping any kind of outbound traffic occurring, that would be super, because that will solve my issue: stop SCTP lib calling the send callback on that dismantling-in-progress stack of SCTP/DTLS, or when DTLS is already gone.

i would also make use of the ABORT sending, I could just intercept the few bytes from traffic, see that is an ABORT, and suppress the real sending.

Overall, we need a way to put a hard stop of any and all sends from a point forward in time, and that point, ideally, would be defined by calling a function in the SCTP stack with necessary info to let it know which sock should be discontinued. But again, if that call itself generates more sends by itself, that is not that great

@shiretu
Copy link
Contributor Author

shiretu commented Aug 4, 2017

An explanation from a different perspective:

int SCTP::CallbackSendData(void *pAddress, void *pBuffer, size_t length, uint8_t tos,
		uint8_t set_df) {
...
}

That is the callback body with the corresponding parameters. At that point in the call stack, without any other information, just having the provided parameters, one should be able to determine the transport layer, and if is there or not.

@lgrahl
Copy link
Contributor

lgrahl commented Aug 4, 2017

If there is a guaranteed way of stopping any kind of outbound traffic occurring, that would be super, because that will solve my issue: stop SCTP lib calling the send callback on that dismantling-in-progress stack of SCTP/DTLS, or when DTLS is already gone.

Yes, this is possible but it results in an ABORT being sent out when an association is being teared down. Not sure this is the best of defences but every data channel implementation I've seen does it atm, too. 😉

I can look into a way to make sure the SCTP does not use addresses once they are deregistered.

@tuexen: I'd upvote that as this is a common issue hindering all data channel implementations I've seen so far (which includes Google's and Mozilla's implementation) from an easy graceful shutdown implementation. Edit: However, while this would help for already deallocated DTLS transports and such, it doesn't answer the question when the closing sequence is completed, all pending data has been sent/received and we can safely deallocate our memory. This event needs to be crystal clear towards all implementations that supply a conn_output callback to usrsctp_init. I think it would be sufficient if the readme states which SCTP event needs to be registered and watched out for this purpose.

@shiretu
Copy link
Contributor Author

shiretu commented Aug 4, 2017

@lgrahl: I'm aware of that, but that does not qualify for stopping any kind of outbound traffic occurring. that is the whole point

@lgrahl
Copy link
Contributor

lgrahl commented Aug 4, 2017

@shiretu Can't you call usrsctp_close just before the DTLS instance is deallocated? Because with these settings you will get one last call to conn_output triggered by usrsctp_close but that's the last one (maybe this code example is more precise).

@shiretu
Copy link
Contributor Author

shiretu commented Aug 4, 2017

In theory, of course I can. In practice, also can, almost clean. But the reason for opening this ticket is to make it dead simple and without having to know that it will trigger just one pkt more. To be honest, up until today, did not even knew that ABORT was the only packet sent after. Nor that it is sent from same thread (same call stack) or different thread. The user of the library should not be concerned with these kind of details IMHO.

pererikb pushed a commit to EricssonResearch/openwebrtc-gst-plugins that referenced this issue Aug 7, 2017
…n cast to 'GstSctpAssociation'

This patch prevent the issue posted here regarding usrsctp library:
sctplab/usrsctp#147

Closes #59.
@alexander-vi
Copy link

Found nice workaround for this issue. The main idea to use private API of the library.

// Here we forward declare internal functions of usrsctp which use before shutdown to reset callbacks.
// All memory access is synchronized internally.
extern "C" {
int register_recv_cb (struct socket *,
                      int (*)(struct socket *, union sctp_sockstore, void *, size_t,
                              struct sctp_rcvinfo, int, void *));
int register_send_cb (struct socket *, uint32_t, int (*)(struct socket *, uint32_t));
int register_ulp_info (struct socket *, void *);
} // extern "C"`

// ...
// and some time later
register_ulp_info(m_socket, nullptr);
register_recv_cb(m_socket, nullptr);
register_send_cb(m_socket, 0, nullptr);

if (usrsctp_shutdown(m_socket, SHUT_WR) < 0) {
    WARNING(sctp, " Error while shutdown socket" << ": " << errno << ": " << strerror(errno), obj(m_log_tag_collector));
}

chehefen added a commit to awslabs/amazon-kinesis-video-streams-webrtc-sdk-c that referenced this issue Apr 3, 2020
chehefen added a commit to awslabs/amazon-kinesis-video-streams-webrtc-sdk-c that referenced this issue Apr 3, 2020
chehefen added a commit to awslabs/amazon-kinesis-video-streams-webrtc-sdk-c that referenced this issue Apr 8, 2020
chehefen added a commit to awslabs/amazon-kinesis-video-streams-webrtc-sdk-c that referenced this issue Apr 8, 2020
chehefen added a commit to awslabs/amazon-kinesis-video-streams-webrtc-sdk-c that referenced this issue Apr 8, 2020
sancane added a commit to sancane/usrsctp that referenced this issue Oct 11, 2020
This patch tries to fix the problem of usrsctp calling callbacks even
after the usrsctp_close function is invoked.
Usrsctp seems to keep the asociation around until the teardown procedure
is completed. The problem is that sometimes it is not possible to
complete the shutdown procedure if the lower transport is gone like SCTP
running on top of DTLS. In such case, callbacks triggered from usrsctp
are providing an pointer address to an applications which could not be
valid or dealocated.
To let application know when they can safetely deallocate memory
registered with an association, we are storing a destroy function which
will be used for usrsctp to notify application when the association is
gone.
Related issues:
sctplab#405
sctplab#147
sancane added a commit to sancane/usrsctp that referenced this issue Oct 11, 2020
This patch tries to fix the problem of usrsctp calling callbacks even
after the usrsctp_close function is invoked.
Usrsctp seems to keep the asociation around until the teardown procedure
is completed. The problem is that sometimes it is not possible to
complete the shutdown procedure if the lower transport is gone like SCTP
running on top of DTLS. In such case, callbacks triggered from usrsctp
are providing an pointer address to an applications which could have
been deallocated.
To let applications know when they can safetely deallocate memory
registered with an association, we are storing a destroy function which
will be used for usrsctp to notify applications when the association is
gone.
Related issues:
sctplab#405
sctplab#147
sancane added a commit to sancane/usrsctp that referenced this issue Oct 11, 2020
This patch tries to fix the problem of usrsctp calling callbacks even
after the usrsctp_close function is invoked.
Usrsctp seems to keep the asociation around until the teardown procedure
is completed. The problem is that sometimes it is not possible to
complete the shutdown procedure if the lower transport is gone like SCTP
running on top of DTLS. In such case, callbacks triggered from usrsctp
are providing an pointer address to an applications which could have
been deallocated.
To let applications know when they can safely deallocate memory
registered with an association, we are storing a destroy function which
will be used for usrsctp to notify applications when the association is
gone.
Related issues:
sctplab#405
sctplab#147
niyatim23 pushed a commit to awslabs/amazon-kinesis-video-streams-webrtc-sdk-c that referenced this issue Aug 19, 2021
niyatim23 pushed a commit to awslabs/amazon-kinesis-video-streams-webrtc-sdk-c that referenced this issue Aug 19, 2021
tbeloqui added a commit to pexip/gstreamer that referenced this issue Aug 4, 2023
This patch tries to fix the problem of usrsctp calling callbacks even
after the usrsctp_close function is invoked.
Usrsctp seems to keep the asociation around until the teardown procedure
is completed. The problem is that sometimes it is not possible to
complete the shutdown procedure if the lower transport is gone like SCTP
running on top of DTLS. In such case, callbacks triggered from usrsctp
are providing an pointer address to an applications which could have
been deallocated.
To let applications know when they can safely deallocate memory
registered with an association, we are storing a destroy function which
will be used for usrsctp to notify applications when the association is
gone.
Related issues:
sctplab/usrsctp#405
sctplab/usrsctp#147

From: sctplab/usrsctp#535
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

5 participants