False positive successes #141
Comments
If a bot pops a task and finds that it is already running the same task, it may be worth either unclaiming it or failing it. I don't think it should cancel the previously running instance. cc @gammazero
I don't think this is sufficient. Let me show why. With Example 1 above, I did a
In Dealbot, an infinite hang like this will eventually be caught by one of the timeouts. I think when that happens, we should try to cancel the retrieval (with API equivalent of Wdyt @willscott @gammazero?
This should be re-tested since we now have timeouts. Fixing #156 would also fix this; @willscott's comment is noted, though.
I believe we've closed this issue and have canceling in place at this point.
I'm seeing some false successes reported by the dealbot. Pretty sure I see why this is happening: if the retrieval fails instantly, during DealStatusNew, Dealbot reports it as Status 3 (success).
Example 1
Here is a Status==3 reported success by dealbot:
Here is what happens when you run the same retrieval on the Lotus CLI:
Example 2
Miner f0157535 with CID bafykbzaceahe7dt6szdh23mekkuxhangmjd3sr26rqmpyxiupte6utocyc7m6 produces the exact same result: dealbot reports Status==3, but the Lotus CLI shows the same error as above.
Short-Term Fix
We should be erroring out on ClientEventOpen errors, producing Status==4 and ErrorMessage==ERROR: retrieval failed: Retrieve failed: there is an active retrieval deal with peer 12D3KooWAypLydzLVAWD9aURXSvgxGdfAEuq7cUGVLkbmxK3cMLC for payload CID bafykbzacecnq5elkwi4xstanm4fttydiexkojqkhjwz3ptevqkzjj2ssljjgc (retrieval deal ID 1621630583961700674, state DealStatusOngoing) - existing deal must be cancelled before starting a new retrieval deal
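A minimal sketch of that short-term behavior, assuming the retrieval surfaces a sequence of events that may each carry an error string. The `retrievalEvent` type, the `classify` function, and the status constants are hypothetical stand-ins for illustration, not Dealbot's or Lotus's actual types; only the Status values 3 and 4 and the message prefix come from this issue.

```go
package main

import "fmt"

// retrievalEvent is a hypothetical stand-in for whatever event type the
// retrieval API delivers; it is NOT the real Lotus or Dealbot type.
type retrievalEvent struct {
	Name string // event name, e.g. "ClientEventOpen"
	Err  string // non-empty when the event carries an error
}

// Status values as quoted in this issue.
const (
	statusSuccess = 3
	statusFailed  = 4
)

// classify reports a failure as soon as any event carries an error,
// including an error on the very first (open) event. Success is reported
// only if no event carried an error.
func classify(events []retrievalEvent) (status int, errorMessage string) {
	for _, ev := range events {
		if ev.Err != "" {
			return statusFailed, "ERROR: retrieval failed: " + ev.Err
		}
	}
	return statusSuccess, ""
}

func main() {
	events := []retrievalEvent{
		{Name: "ClientEventOpen", Err: "Retrieve failed: there is an active retrieval deal with peer"},
	}
	status, msg := classify(events)
	fmt.Println(status, msg)
}
```

The point of the sketch is only that an error on the first event must short-circuit to Status==4 rather than fall through to the success path.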
Long-Term Fix
We need to detect errors of this type (there is an active retrieval deal with peer...), extract the deal ID (the number 1621630583961700674 in the above case), and run the equivalent of lotus client cancel-retrieval --deal-id 1621630583961700674.
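The detection and extraction steps could be sketched as follows, assuming the error text keeps the exact wording quoted in this issue. The `extractRetrievalDealID` helper and the `activeDealRe` pattern are hypothetical names introduced here. The sketch only prints the CLI command from this issue rather than calling any Lotus API, since the API equivalent is not specified in this thread.

```go
package main

import (
	"fmt"
	"regexp"
	"strconv"
)

// activeDealRe matches the "active retrieval deal" error quoted in this issue
// and captures the numeric retrieval deal ID. The wording is an assumption
// based on that single example; a change in the message would break the match.
var activeDealRe = regexp.MustCompile(
	`there is an active retrieval deal with peer .*\(retrieval deal ID (\d+),`)

// extractRetrievalDealID returns the deal ID embedded in errMsg, or false if
// errMsg is not an "active retrieval deal" error.
func extractRetrievalDealID(errMsg string) (uint64, bool) {
	m := activeDealRe.FindStringSubmatch(errMsg)
	if m == nil {
		return 0, false
	}
	id, err := strconv.ParseUint(m[1], 10, 64)
	if err != nil {
		return 0, false
	}
	return id, true
}

func main() {
	msg := "ERROR: retrieval failed: Retrieve failed: there is an active retrieval deal with peer " +
		"12D3KooWAypLydzLVAWD9aURXSvgxGdfAEuq7cUGVLkbmxK3cMLC for payload CID " +
		"bafykbzacecnq5elkwi4xstanm4fttydiexkojqkhjwz3ptevqkzjj2ssljjgc " +
		"(retrieval deal ID 1621630583961700674, state DealStatusOngoing) - " +
		"existing deal must be cancelled before starting a new retrieval deal"

	if id, ok := extractRetrievalDealID(msg); ok {
		// Print the CLI command from the issue; a real fix would call the
		// corresponding API instead.
		fmt.Printf("lotus client cancel-retrieval --deal-id %d\n", id)
	}
}
```

Matching on error text is fragile. If the retrieval error is available as a typed value that exposes the deal ID, that would be preferable to parsing the string.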