Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix UTXO rpc clients dead lock on balance/unspent requests. #887

Merged
merged 14 commits into from
Apr 7, 2021

Conversation

artemii235
Copy link
Member

#846

The problem was in concurrent requests handling. Electrum/Daemon could be easily spammed with a lot of balance/unspents requests when many swaps are started or many orders are placed at once. To avoid this I added the feature to send request only once so all other concurrent callers receive the same response once the request is finished. There was a race condition bug in this implementation - it could be possible that there is no active request at the moment, but the list_unspent_in_progress flag was true, which locked the call forever.

@sergeyboyko0791
There are a couple of unwraps in ConcurrentRequestMap::wrap_request that should not panic, but I don't like them anyway 🙂 Maybe you will have an idea on how to get rid of them?

@cipig @Milerius @tonymorony Could you please give this a test in both Electrum and Native modes? The easiest way to reproduce the issue was to make a lot of balance requests concurrently in the loop and then disconnect the network for a while.

sergeyboyko0791
sergeyboyko0791 previously approved these changes Apr 2, 2021
Copy link

@sergeyboyko0791 sergeyboyko0791 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For me, it looks leaner, cleaner, and better than the previous concurrency map implementation.
I've left a few comments, but you can ignore them, cause they don't affect code execution.

Perhaps, these unwraps can be replaced by expect("description") :D

mm2src/coins/utxo.rs Outdated Show resolved Hide resolved
mm2src/coins/utxo/rpc_clients.rs Outdated Show resolved Hide resolved
mm2src/coins/utxo/rpc_clients.rs Outdated Show resolved Hide resolved
We can have a race condition if 2 concurrent setprice calls are made with cancel_previous: true.
So both orders will be created while it should create only 1 of them.
@artemii235
Copy link
Member Author

@sergeyboyko0791 Thanks for your review! I've made the changes, could you recheck, please?

tonymorony
tonymorony previously approved these changes Apr 2, 2021
Copy link

@tonymorony tonymorony left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

  1. I've reproduced the problem on the old version (manipulations with the connection not needed on my side)

if run this script in a single thread mm2 returns answer fine, if run this script in two threads (e.g. second terminal tab) mm2 stops to response until the restart

balances_spam.sh 
#!/bin/bash
while :
do
  curl --url "http://127.0.0.1:7783" --data "{\"userpass\":\"$userpass\",\"method\":\"my_balance\",\"coin\":\"RICK\"}"
  curl --url "http://127.0.0.1:7783" --data "{\"userpass\":\"$userpass\",\"method\":\"my_balance\",\"coin\":\"MORTY\"}"
done
  1. on mm2.1-fix-api-deadlock such spam script works fine in multiple threads

sergeyboyko0791
sergeyboyko0791 previously approved these changes Apr 2, 2021
Copy link

@sergeyboyko0791 sergeyboyko0791 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, nice!

cipig
cipig previously approved these changes Apr 2, 2021
Copy link
Member

@cipig cipig left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

mm2 nodes with hundreds of orders running fine since 24h, previous problems are gone

@cipig
Copy link
Member

cipig commented Apr 2, 2021

i fear that 731e2d2 breaks it, i get lots of timeouts after updating to it
5b20ad1 works fine, i was using that for 24h... going back to it solves the timeouts

@artemii235 artemii235 dismissed stale reviews from cipig, tonymorony, and sergeyboyko0791 via 600cbb3 April 3, 2021 02:50
@artemii235
Copy link
Member Author

i fear that 731e2d2 breaks it, i get lots of timeouts after updating to it

I've done it because Slyris reported a race condition: when he called setprice with cancel_previous:true for the same pair concurrently 2 orders were created instead of 1. We should avoid holding the my_maker_orders mutex during network requests so I moved the balance check code upper. However, it cancels the fix made for #794 because cancellation will happen after the balance check. What should we do about this? 🙁

@artemii235
Copy link
Member Author

@cipig Could you test the latest commit, please? I've added explicit orders cancellation in case of errors.

Copy link
Member

@cipig cipig left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it works now, no more timeouts, thanks for the fix

Copy link

@Milerius Milerius left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tested on OSX with my bot, previous order are correctly cancelled now.

@artemii235
Copy link
Member Author

Great, thanks to everyone for the reviews!

@artemii235 artemii235 merged commit 6c23530 into mm2.1 Apr 7, 2021
@artemii235 artemii235 deleted the mm2.1-fix-api-deadlock branch April 7, 2021 07:08
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants