-
Notifications
You must be signed in to change notification settings - Fork 981
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
firehose connection loadbalance #4083
Conversation
1710c0a
to
cce4da9
Compare
cce4da9
to
fef1e62
Compare
docker/docker-compose.yml
Outdated
volumes: | ||
- ./data/postgres:/var/lib/postgresql/data | ||
# volumes: | ||
# - ./data/postgres:/var/lib/postgresql/data |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why this change to docker-compose
?
graph/src/firehose/endpoints.rs
Outdated
@@ -75,8 +73,7 @@ impl FirehoseEndpoint { | |||
// Timeout on each request, so the timeout to estabilish each 'Blocks' stream. | |||
.timeout(Duration::from_secs(120)); | |||
|
|||
// Load balancing on a same endpoint is useful because it creates a connection pool. | |||
let channel = Channel::balance_list(iter::repeat(endpoint).take(conn_pool_size as usize)); | |||
let channel = Channel::balance_list(vec![endpoint].into_iter()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can make this let channel = endpoint.connect_lazy();
as it used to be.
graph/src/firehose/endpoints.rs
Outdated
@@ -267,10 +264,12 @@ impl FirehoseEndpoints { | |||
self.0.len() | |||
} | |||
|
|||
// selects the FirehoseEndpoint with the lest amount of references, which will help with spliting |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
// selects the FirehoseEndpoint with the lest amount of references, which will help with spliting | |
// selects the FirehoseEndpoint with the least amount of references, which will help with splitting |
graph/src/firehose/endpoints.rs
Outdated
self.0.iter().choose(&mut rng) | ||
self.0 | ||
.iter() | ||
.min_by(|x, y| Arc::strong_count(x).cmp(&Arc::strong_count(y))) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
min_by_key
is nicer.
graph/src/firehose/endpoints.rs
Outdated
@@ -267,10 +264,12 @@ impl FirehoseEndpoints { | |||
self.0.len() | |||
} | |||
|
|||
// selects the FirehoseEndpoint with the lest amount of references, which will help with spliting | |||
// the load naively across the entire list. | |||
pub fn random(&self) -> Option<&Arc<FirehoseEndpoint>> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One aspect that concerns me is a silent hang if we ever hit the 100 stream limit per connection limit. Perhaps a simple way to make the error explicit would be to add a
const SUBGRAPHS_PER_CONN: usize = 100;
And return an error if the ref count reaches this number. Or even better auto-scale based on this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
will address this on a separate PR as this could benefit from some metrics in order to detect a stall
fef1e62
to
b45207f
Compare
.min_by_key(|x| Arc::strong_count(x)) | ||
.ok_or(anyhow!("no available firehose endpoints"))?; | ||
if Arc::strong_count(endpoint) > SUBGRAPHS_PER_CONN { | ||
return Err(anyhow!("all connections saturated with {} connections, increase the firehose conn_pool_size", SUBGRAPHS_PER_CONN)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice work with the actionable error message!
Oh and lets bump the default to 20. |
37ce27d
to
03c2911
Compare
abb00a5
to
dc91e76
Compare
Fixes #3879