Skip to content
This repository has been archived by the owner on Nov 6, 2020. It is now read-only.

fix(light/response) : handle bad responses #9756

Closed
wants to merge 3 commits into from

Conversation

niklasad1
Copy link
Collaborator

@niklasad1 niklasad1 commented Oct 15, 2018

This PR adds the following:

This change the assumptions slightly i.e, before we assumed that all requests are valid and only dropped on a timeout or too many request attempts to each peer.

Might be overlapping a bit with the timeouts and I plan the clean it up that code a bit if/after this gets merged! Should be sufficient with this and timeouts only

Issues:

  • How to handle when peers disconnect, i.e should the response be removed from the bad response set? A malicious node could potentially connect and disconnect repeatedly to get a new node_id and get us to eventually drop the
  • Is the formula bad_responses > peers/2 sufficient?

/cc @Tbaut @amaurymartiny
Would be good if you could test it, seems to work well when I tested it with
https://gist.github.com/niklasad1/43690f39a07e59f92fd4a8ea9709385b

.cycle()
.skip(offset)
.take(num_peers)
{
Copy link
Collaborator Author

@niklasad1 niklasad1 Oct 15, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think cycle is clearer in this context and formatted the code to be easier to read

}
}
}

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Verbosity, I think it helps to know which errors can occur

match err {
// This can be a malformed request or bad response but we can't determine which at the moment!
// Thus, register the response as bad and wait for more responses in order to take a decision
ValidityError::BadProof | ValidityError::Empty => {
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can get a BadProof for an execution on an account with an insufficient balance so better to be sure when we punish peers!

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure I understand this. Why would it happen now that we have #9591 (unless someone is running node, older than 2.0.7-stable/2.1.2-beta)? And how can we distinguish this from invalid proof?

@@ -18,6 +18,7 @@

use cache::Cache;
use ethcore::header::Header;
use ethcore::encoded::Header as EncodedHeader;
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not used remove!!

@@ -28,7 +29,7 @@ use ::request::{self as basic_request, Response};

use std::sync::Arc;

use super::{request, OnDemand, Peer, HeaderRef};
use super::{request, OnDemand, Peer, HeaderRef, ResponseError, ValidityError};
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not used remove!

@5chdn
Copy link
Contributor

5chdn commented Oct 15, 2018

Please label your PR :)

@5chdn 5chdn added this to the 2.2 milestone Oct 15, 2018
@5chdn 5chdn added A0-pleasereview 🤓 Pull request needs code review. M4-core ⛓ Core client code / Rust. labels Oct 15, 2018
@niklasad1 niklasad1 force-pushed the light/response-handling branch from 61b9c76 to a3ddcb0 Compare October 15, 2018 18:28
@niklasad1 niklasad1 added B1-patch-beta 🕷🕷 B7-releasenotes 📜 Changes should be mentioned in the release notes of the next minor version release. B0-patch-stable 🕷 Pull request should also be back-ported to the stable branch. labels Oct 15, 2018
Copy link
Collaborator

@ordian ordian left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have mixed feelings about this one:

  1. I thought Light client does not respond to eth_getTransactionReceipt request #9133 is about not replying to a valid request? For invalid ones we now may return "Timeout for On-demand query". Or is this about proper error message?
  2. Not sure about this formula |bad_responses| > peers / 2, since number of peers is dynamic and if we disconnect from peers this will change whether a request is considered bad or not. I don't have a better suggestion though.
  3. This policy could potentially be implemented via failsafe (Refactor on-demand queries to use failsafe-rs #9536).

{
-> Result<(), basic_request::ResponseError<self::request::Error>> {

println!("{:?}", response);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use trace! or debug! here.

ethcore/light/src/on_demand/mod.rs Show resolved Hide resolved
let mut harness = Harness::create();

// peer[0] - requester ("light")
// peer[1..9] - responders ("providers"
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: unmatched (

fn time_out(self) {
trace!(target: "on_demand", "Dropping a pending query (no new peer time out) at query #{}", self.query_id_history.len());
let err = self::error::ErrorKind::TimeoutOnNewPeers(self.requests.num_answered(), self.query_id_history.len());
if self.sender.send(Err(err.into())).is_err() {
debug!(target: "on_demand", "Dropped oneshot channel receiver on time out");
}
}

// The given request is determined as faulty and drop it accordingly
fn set_as_faulty_request(self, total_peers: usize, req_id: ReqId) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Consider renaming this to faulty_request or bad_request to follow the same naming convention as time_out and no_response (or rename those).

// The given request is determined as faulty and drop it accordingly
fn set_as_faulty_request(self, total_peers: usize, req_id: ReqId) {
let bad_peers = self.bad_responses.len();
warn!(target: "on_demand", "The request: {} was determined as faulty, {}/{} peer(s) gave bad response", req_id, bad_peers, total_peers);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need a warn! here? A user can see a similar message in the response.

This PR adds the following:
* Registers empty responses as a `bad`
* Register bad responses and only takes a decision when the majority of the
peers have come to `agreement`.
@niklasad1 niklasad1 force-pushed the light/response-handling branch from a3ddcb0 to 56779ff Compare October 16, 2018 12:14
@niklasad1
Copy link
Collaborator Author

niklasad1 commented Oct 16, 2018

@ordian

Great, comments

  1. Basically, timeout/retries are based on user configuration assumptions. This should improve throughput as long the peers and don't spam the network. We can get more accurate error message also but those are unforgently not propagated currently, I will change that but it requires to actually store the reason and select the majority of the bad responses...
  2. ...
  3. Yes, that is awesome but I think it's even more beneficial for timeouts because of the exponential backoff i.e, to limit the number of attempts on a request by exponential decaying. Currently, we just wake up every 5 seconds and retry the request (I haven't double checked the actual interval but something like that)

@niklasad1 niklasad1 added the A5-grumble 🔥 Pull request has minor issues that must be addressed before merging. label Oct 17, 2018
@niklasad1
Copy link
Collaborator Author

Closing for now, I will migrate to failsafe first then revisit this

@niklasad1 niklasad1 closed this Oct 19, 2018
@niklasad1 niklasad1 deleted the light/response-handling branch January 17, 2019 09:41
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
A0-pleasereview 🤓 Pull request needs code review. A5-grumble 🔥 Pull request has minor issues that must be addressed before merging. B0-patch-stable 🕷 Pull request should also be back-ported to the stable branch. B7-releasenotes 📜 Changes should be mentioned in the release notes of the next minor version release. M4-core ⛓ Core client code / Rust.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants