-
Notifications
You must be signed in to change notification settings - Fork 43
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Propagate timeout in LNURL withdraw flow #1004
Conversation
Sorry for the slow review @andrei-21, can you resolve the conflicts? |
I am fine with the solution. We may also increase the timeout to 1 minute in this case which will also make the case work in most of the times. |
aeb4301
to
f23e480
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
400a53b
to
254b3bc
Compare
254b3bc
to
9386edb
Compare
data: LnUrlWithdrawSuccessData { invoice }, | ||
}, | ||
LnUrlCallbackStatus::ErrorStatus { data } => LnUrlWithdrawResult::ErrorStatus { data }, | ||
Ok(LnUrlCallbackStatus::ErrorStatus { data }) => LnUrlWithdrawResult::ErrorStatus { data }, | ||
Err(e) if e.to_string().contains("operation timed out") => LnUrlWithdrawResult::Timeout { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't know if this is the best solution. It makes this conditional on the exact error string from lnbits, which can change anytime even with a simple HTTP server upgrade.
Perhaps a more robust approach is to make the REST client timeout configurable in the SDK config? Then clients that expect to interact with slower endpoints can bump it to 60s?
Or maybe we can just bump the default from 30 to 60s?
What do you guys think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not against merging this, just wondering if there's a better way.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@andrei-21 since it is lnbit specific (which is also against the spec) I think @ok300 has a point.
What do you think? bumping to 60 seconds will solve this issue in a reasonable way?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To be clear, this message comes from reqwest
, not from lnbits (nothing comes from lnbits, that is the point ;).
I see the concern with relying on the specific text, we can do better, by actually checking if it was timeout by using is_timeout().
Regarding increasing the timeout. The problem is not that specific to lnbits, timeout can happen with any server (e.g. due to a network issue). Increasing the timeout will improve the situation, but not solve it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The timeout idea was a practical suggestion for what appears to be a temporary problem, namely that lnbits is not (yet) spec-conform in handling LNURL-withdraws.
The is_timeout
idea makes sense.
A few thoughts on the details: The right place to handle it is probably get_and_log_response
, which is where the reqwest error originates. This means the error it typically throws, ServiceConnectivityError
, would need two variants, a Generic { err }
and a Timeout
. This would propagate all the way to get_parse_and_log_response
and validate_lnurl_withdraw
, where it can result in the LnUrlWithdrawResult::Timeout
you added.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@andrei-21 I have a general question about this.
AFAIK the timeout would not distinguish between
- a connection issue (network, etc)
- a server issue (down for maintenance, etc)
- the synchronous payment issue (lnbits trying to pay the invoice, which takes >30s, so LNURL timeout)
In the PR its written
... a proposal to propagate this timeout back to the client and let him decide if the request needs to be retried, tolerated, or failed.
Isn't it a bad idea for the client to retry a timed-out Withdraw, not knowing the actual root cause of the timeout? Couldn't the 2nd, 3rd Withdraw call lead to a 2nd and 3rd actual withdrawal, and not just a "retry"?
Or am I missing something and there's a way to tell the difference?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The right place to handle it is probably
get_and_log_response
I agree.
AFAIK the timeout would not distinguish between
Maybe we can distinguish between connection timeout and (communication) timeout, but in general you are right. (I just noticed that the title of the PR is not correct.)
Isn't it a bad idea for the client to retry a timed-out Withdraw, not knowing the actual root cause of the timeout?
LNURL-w does not support any form of a safe retry. So in general it is dangerous to retry. In general the only safe thing to do is to show the problem to the user, but it results in a bad UX. Ignoring the error also is not bad since anyway the wallet must handle such situation (incoming payment can always fail).
Retry maybe also an option if the wallet knows more about a specific LNURL-w that it is a one-time withdraw.
Anyway as you can see the SDK cannot do better than just returning this error to the client.
I will move checking of specific network issue into get_and_log_response
.
04725ce
to
b0e3e1f
Compare
@ok300 please check out. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good, thanks for the PR @andrei-21
b0e3e1f
to
9c40a62
Compare
We have an issue with withdrawing from lnbits, sometimes we get
operation timed out
, but after some time the payment usually arrives. The reason for that that the SDK has 30 seconds timeout for http requests, when the wallet submits an invoice, lnbits tries to pay it synchronously. If the payment takes more than 30 seconds the SDK fails the request and returns the error, but lnbits is still paying the invoice and likely will pay.In that case the issue is on lnbits side (it should not wait for successful payment before responding to the http request), I reported them lnbits/withdraw#37. But I afraid the are not going to fix it soon and not every instance is going to upgrade quick.
Here is a proposal to propagate this timeout back to the client and let him decide if the request needs to be retried, tolerated, or failed.