
Client: SIGABRT: Resource deadlock avoided #374

Closed
1 task done
anonimal opened this issue Sep 20, 2016 · 6 comments

Comments

@anonimal
Collaborator

By submitting this issue, I confirm the following:

  • I have read and understood the contributor guide.
  • I have checked that the issue I am reporting can be replicated or that the feature I am suggesting is not present.
  • I have checked opened or recently closed pull requests for existing solutions/implementations to my issue/suggestion.

Place an X inside the bracket to confirm

  • I confirm.

Occurred in branch ssu before merging #372, but this would also affect d296c49 regardless (I just got back, so I'm only now opening this ticket).

The deadlock has only occurred once, and I've never seen kovri deadlock before (it may be because of all the new SSU activity that's now required?).

Backtrace attached:
0_bt-frame-inspect.txt
1_thread-apply-all-bt.txt
2_thread-apply-all-bt-full.txt

If this is more than a rare event, I'll paste better output. Regardless, we should review thread handling and (if possible) implement deadlock prevention (or at least catch things better).
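
As a starting point for that review, here is a rough sketch of a defensive join. This is not existing kovri code, and SafeJoin is a hypothetical helper name; it only illustrates guarding against the self-join case described in the exception notes below and catching the std::system_error instead of letting it reach std::terminate():

#include <system_error>
#include <thread>

// Hypothetical helper: join a worker thread defensively instead of
// calling join() directly on it.
void SafeJoin(std::thread& worker) {
  if (!worker.joinable())
    return;  // already joined/detached or never started
  if (worker.get_id() == std::this_thread::get_id())
    return;  // a self-join would throw resource_deadlock_would_occur
  try {
    worker.join();
  } catch (const std::system_error&) {
    // Log and recover here; the SIGABRT in this report comes from an
    // uncaught std::system_error reaching std::terminate().
  }
}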

Notes:

  1. Take note of ClientDestination threads in addition to thread 6
  2. tunnels.conf client IRC tunnel and server local tunnel were employed
  3. There was no active end-user input at the time of deadlock

Exception notes (from the std::thread::join documentation):

Exceptions
  std::system_error if an error occurs.
Error Conditions
  resource_deadlock_would_occur if this->get_id() == std::this_thread::get_id() (deadlock detected)
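
For reference, that condition can be reproduced outside of kovri with a minimal self-join sketch (the promises are only there to order the two join calls); on Linux/glibc the what() string is typically the same "Resource deadlock avoided" seen in this backtrace:

#include <future>
#include <iostream>
#include <system_error>
#include <thread>

std::thread t;
std::promise<void> thread_assigned;  // main -> worker: t now holds our handle
std::promise<void> join_attempted;   // worker -> main: self-join attempt done

void Worker() {
  thread_assigned.get_future().wait();
  try {
    t.join();  // self-join: this->get_id() == std::this_thread::get_id()
  } catch (const std::system_error& ex) {
    std::cout << ex.what() << '\n';  // e.g. "Resource deadlock avoided"
  }
  join_attempted.set_value();
}

int main() {
  t = std::thread(Worker);
  thread_assigned.set_value();
  join_attempted.get_future().wait();  // don't join concurrently with the worker
  t.join();  // the worker's failed join leaves t joinable, so main cleans up
}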

@anonimal
Collaborator Author

anonimal commented Sep 21, 2016

Good news? This is reproducible. After 24 hours and 20 minutes online on Ubuntu 16.04:

terminate called after throwing an instance of 'std::system_error'
  what():  Resource deadlock avoided

Thread 6 "kovri" received signal SIGABRT, Aborted.

I'll be mostly away for this week but will leave gdb idling for when I return. If anyone else can reproduce, please comment!

Edit: JFTR: same issue, same area, same thread, same (similar) output.

@anonimal
Collaborator Author

i2p-relay is built against 99666cf and has been stable for nearly 2 weeks now (and apparently hasn't segfaulted). @fluffypony also says that the relay is running on 16.04.

It's probably safe to assume the deadlock is related to all the new tunnel activity brought in with the SSU merge.

Adding a milestone now since this issue is reproducible.

@anonimal anonimal added this to the 0.1.0-alpha milestone Sep 21, 2016
@anonimal
Collaborator Author

anonimal commented Oct 1, 2016

Note: after a few days of running only NTCP (--enable-ssu 0), I cannot reproduce the deadlock.

@anonimal
Collaborator Author

anonimal commented Oct 5, 2016

7 days of NTCP only, still no deadlock. On a separate instance, 3 days of running default simultaneous NTCP/SSU, still no deadlock.

@anonimal
Collaborator Author

Over 1 month and 10 days of both NTCP/SSU on Ubuntu 16.04.1: no deadlock. Almost two weeks of FreeBSD + OSX 10/11/12 + Ubuntu 32/64 instances: no deadlock.

As such, this is clearly not consistently reproducible and may only be triggered by certain peers (though I have yet to look further into the code and backtrace).

For an alpha release, I believe it's safe to remove the blocker label and replace it with major. Replacing as major.

@anonimal anonimal added major and removed blocker labels Nov 13, 2016
@anonimal anonimal self-assigned this Nov 27, 2016
@anonimal
Collaborator Author

Looking at the backtrace, this looks resolved by #466/#342. I also haven't been able to reproduce this issue on any platform running 24x7 instances.

Closing, though we can reopen this issue if needed.
