Unexpected timeouts on canceled connecting call #41

Open
Koshub opened this issue May 20, 2021 · 6 comments

Comments

@Koshub
Contributor

Koshub commented May 20, 2021

Hello. I am using baresip v1.0.0 as a library (on iOS, so with kqueue). My use case requires stopping the baresip main loop and restarting it when needed. I have found an edge case: when I hang up a call that is still trying to connect and then shut baresip down completely, there is a delay of up to about 20 seconds before the main loop exits. The next call can only start after the main loop has finished and I have started a new one. The sequence is something like this:

  • start baresip main loop
  • ua_connect
  • the call is trying to connect
  • ua_hangup
  • ua_stop_all(0) inside mqueue callback
  • here I get a long delay until re_main returns

I do not get a delay when the call was connected and then finished using ua_hangup. So the problem probably occurs only when a call cannot be connected and something is left waiting for an event.

Are there any timeout settings that control this?

@alfredh
Collaborator

alfredh commented May 22, 2021

the re_main loop will process N file descriptors, and the timeout is the shortest time until expiry among all timers.
if the number of file descriptors is zero, the re_main will block forever.

it is important that the FD set is updated from the same thread as the re_main loop,
i.e. via an fd_listen handler or a timer handler. otherwise the FD set will not be updated.

in the past I have used mqueue to execute code in the context of the re_main loop thread.
you can send a dummy event via mqueue to "wake up" the re_main thread.

PS: the general advice is to have one thread running the re_main loop all the time.
it will sleep when there is nothing to do.
repeatedly starting and stopping the re_main thread is not advised.
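
A minimal sketch of the two ideas above, written against plain POSIX poll() and a pipe rather than libre itself: the loop sleeps for the time left until the earliest timer, and another thread can wake it by writing one byte to a pipe the loop is watching. The helper names (next_timeout_ms, loop_wait_once, loop_wakeup) are hypothetical and are not part of the libre API; re_main and mqueue may be implemented differently.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <limits.h>
#include <poll.h>
#include <stddef.h>
#include <stdint.h>
#include <unistd.h>

/* Hypothetical helper: the poll timeout is the time left until the
 * earliest timer. Returns -1 (block indefinitely) if there are no timers. */
static int next_timeout_ms(const uint64_t *expiry_ms, size_t n, uint64_t now_ms)
{
	int timeout = -1;

	for (size_t i = 0; i < n; i++) {
		/* An already expired timer means "do not sleep at all" */
		uint64_t left = expiry_ms[i] > now_ms ? expiry_ms[i] - now_ms : 0;

		if (left > (uint64_t)INT_MAX)
			left = INT_MAX;

		if (timeout < 0 || left < (uint64_t)timeout)
			timeout = (int)left;
	}

	return timeout;
}

/* One loop iteration: sleep until the wakeup pipe is readable or the
 * timeout expires. Returns 1 if woken, 0 on timeout, -1 on error. */
static int loop_wait_once(int wake_rfd, int timeout_ms)
{
	struct pollfd pfd = { .fd = wake_rfd, .events = POLLIN };
	char buf[16];
	int n = poll(&pfd, 1, timeout_ms);

	if (n < 0)
		return -1;
	if (n == 0)
		return 0;

	/* Drain the dummy event so the next poll() can sleep again */
	if (read(wake_rfd, buf, sizeof(buf)) < 0)
		return -1;

	return 1;
}

/* Called from any thread: send a dummy event to wake the loop */
static int loop_wakeup(int wake_wfd)
{
	const char c = 'w';

	return write(wake_wfd, &c, 1) == 1 ? 0 : -1;
}
```

With no timers and no readable descriptor, loop_wait_once(fd, -1) would sleep until loop_wakeup() is called, which is the situation the dummy mqueue event is meant to resolve.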

alfredh transferred this issue from baresip/baresip May 22, 2021
@Koshub
Contributor Author

Koshub commented May 22, 2021

@alfredh thank you for your reply. I use one thread to run re_main, and I use mqueue to run ua_stop_all(0) inside that thread, so it should work. It does work as expected when the call proceeds normally (connected, answered, closed). But when the call cannot connect, it waits, and after I call ua_stop_all(0), re_main does nothing until some event arrives. That event is only emitted after a very long time. So I am trying to find out what exactly that event is and how to reduce its timeout. What I know about the failure to connect is that the other party is behind a firewall. So this is probably an edge case with a delay that does not appear in the usual scenarios.

Do you have any idea what could cause this behavior? Or would it be enough to send a dummy event after ua_stop_all(0)?

@Koshub
Contributor Author

Koshub commented Jun 14, 2021

@alfredh BTW, this issue was transferred to baresip-ios, but it is not specific to iOS; it is about integration with the OS. I suppose you would find the same issue on any system that uses kqueue, because re_main just gets stuck in fd_poll waiting for kevent in the use case I described above.

@sreimers
Member

kqueue because re_main just stucks in fd_pull

at first glance this is the normal behavior: fd_poll waits for an event on a file descriptor or until the next timer expires. Is there a difference if you force the stop with ua_stop_all(true)?

@Koshub
Contributor Author

Koshub commented Jun 15, 2021

@sreimers Sorry for the delayed answer, and thanks again for the support. It is really hard to test this and be sure of the result. It seems that ua_stop_all(1) makes the delay even longer, but I am not sure.

this is at first the normal behavior, fd_poll waits for a event by a file descriptor or until the next timer expires

I understand that. I have just found some configuration here:

enum {
	SIP_T1 =  500,
	SIP_T2 = 4000,
	SIP_T4 = 5000,
};

Is it possible that timers using these values take too long to fire the relevant events (say, here)? Could I shorten the re_main exit timeout by changing these values, or is it better to leave them as they are?
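
For reference, a rough sketch of the arithmetic behind these constants. The values match the RFC 3261 defaults (T1 = 500 ms, T2 = 4 s, T4 = 5 s), and RFC 3261 defines the transaction timeout (Timer B for INVITE, Timer F for non-INVITE) as 64*T1. Whether libre derives its own timeouts in exactly this way is an assumption that should be checked against the source; the function names below are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Same values as quoted above, in milliseconds */
enum {
	SIP_T1 =  500,
	SIP_T2 = 4000,
	SIP_T4 = 5000,
};

/* RFC 3261 Timer B / Timer F: give up on an unanswered request
 * after 64*T1 */
static uint32_t sip_transaction_timeout_ms(void)
{
	return 64u * (uint32_t)SIP_T1;
}

/* RFC 3261 Timer E (non-INVITE request over UDP): the retransmit
 * interval starts at T1 and doubles each time, capped at T2.
 * attempt 0 is the interval before the first retransmission. */
static uint32_t sip_retransmit_interval_ms(unsigned attempt)
{
	uint32_t ival = SIP_T1;

	while (attempt-- > 0 && ival < (uint32_t)SIP_T2)
		ival *= 2;

	return ival < (uint32_t)SIP_T2 ? ival : (uint32_t)SIP_T2;
}
```

Under these assumptions a request that never receives a response would keep its transaction timer alive for 64 * 500 ms = 32 s. Lowering SIP_T1 would shorten that, but it would also make retransmissions more aggressive and move away from the RFC defaults.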

@alfredh
Collaborator

alfredh commented Jun 19, 2021

can you try to debug the timers and see which timers are active when this happens?

please note that when the SIP register client is unregistering, it will wait for a 200 OK
from the server. if the response does not arrive, it will time out after N seconds.

my advice is to add lots of debug printf calls and try to see where the delay comes from :)
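
One possible way to follow that advice is a small timestamped trace helper that prints how many milliseconds passed since the previous trace point. This is only a sketch using POSIX clock_gettime(); mono_ms and trace_step are hypothetical names, not libre or baresip functions.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <time.h>

/* Monotonic clock in milliseconds (unaffected by wall-clock changes) */
static uint64_t mono_ms(void)
{
	struct timespec ts;

	clock_gettime(CLOCK_MONOTONIC, &ts);

	return (uint64_t)ts.tv_sec * 1000u + (uint64_t)ts.tv_nsec / 1000000u;
}

/* Print the time elapsed since the previous trace point and return it.
 * *last_ms is updated so that consecutive calls measure each step. */
static uint64_t trace_step(const char *label, uint64_t *last_ms)
{
	uint64_t now = mono_ms();
	uint64_t delta = now - *last_ms;

	fprintf(stderr, "[trace] %-24s +%llu ms\n",
		label, (unsigned long long)delta);

	*last_ms = now;

	return delta;
}
```

Calling trace_step() right after ua_hangup, inside the mqueue handler before and after ua_stop_all(0), and again once re_main returns would show between which two points the long delay accumulates.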

3 participants