Describe the bug
If the query-frontend and querier are restarted at the same time, or the query-frontend is restarted while queriers are running, then `-querier.max-concurrent` can no longer be achieved.
To Reproduce
- Restart just queriers by doing a rollout restart, do not restart query-frontend
- Make sure your system is in a steady state and you can achieve `-querier.max-concurrent`
- Restart query-frontend
- Hammer all your query-frontends with expensive queries and observe that `-querier.max-concurrent` is no longer achievable.
Expected behavior
Should still be able to achieve `-querier.max-concurrent`.
Environment:
We are running on k8s.
Storage Engine
- Blocks
- Chunks
Additional Context
My suspicion is that in worker.go, `AddressRemoved` does not call `resetConcurrency()`.
Imagine the following case:
- You have 1 querier and 3 query-frontends (fe1, fe2, and fe3).
- Your `-querier.max-concurrent` is set to 8.
- So each query-frontend has at least 2 connections to the querier. Because 8 is not divisible by 3, and 8 modulo 3 is 2, fe1 and fe2 each get one extra connection to the querier.
- So fe1 has 3 connections to the querier, fe2 has 3, and fe3 has 2 (a small sketch of this arithmetic follows the list).
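For illustration, here is a minimal Go sketch of that distribution arithmetic, assuming the simple divide-plus-remainder scheme described above; the function name and shape are mine, not the actual worker.go code:

```go
package main

import "fmt"

// distribute splits maxConcurrent connections across the given frontends,
// handing one extra connection to the first (maxConcurrent % n) of them
// when the division is not even. Purely illustrative, not worker.go itself.
func distribute(maxConcurrent int, frontends []string) map[string]int {
	out := make(map[string]int, len(frontends))
	for i, addr := range frontends {
		n := maxConcurrent / len(frontends)
		if i < maxConcurrent%len(frontends) {
			n++ // fe1 and fe2 pick up the remainder in this example
		}
		out[addr] = n
	}
	return out
}

func main() {
	// 8 concurrent connections over 3 frontends -> fe1:3, fe2:3, fe3:2
	fmt.Println(distribute(8, []string{"fe1", "fe2", "fe3"}))
}
```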
Now we restart the query-frontend, and the DNS watch on the querier (worker.go) gets to work and starts adding and removing addresses.
- During the deployment we will have 6 query-frontends, fe1 to fe6, because the new pods are spun up first.
- So you get into a state where fe1 has 2 connections to the querier, fe2 has 2, fe3 has 1, fe4 has 1, fe5 has 1, and fe6 has 1.
- Then we spin down the old pods, fe1 to fe3.
- Because the `AddressRemoved` method does not call `resetConcurrency()` to recalculate the load distribution, we end up with fe4 having 1 connection to the querier, fe5 having 1, and fe6 having 1, which is just 3 instead of 8 (a rough sketch of a possible fix follows the list).
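To make the suspicion concrete, here is a rough, self-contained Go sketch of what calling `resetConcurrency()` from `AddressRemoved` might look like; the `querierWorker` and `manager` types, their fields, and the distribution logic are simplified stand-ins for illustration, not the real worker.go code:

```go
package main

import (
	"fmt"
	"sync"
)

// Simplified stand-ins for the querier worker and its per-frontend managers.
type manager struct{ concurrency int }

func (m *manager) stop() {}

type querierWorker struct {
	mu            sync.Mutex
	maxConcurrent int
	managers      map[string]*manager
}

// resetConcurrency re-divides maxConcurrent across the remaining managers.
func (w *querierWorker) resetConcurrency() {
	i := 0
	for _, m := range w.managers {
		c := w.maxConcurrent / len(w.managers)
		if i < w.maxConcurrent%len(w.managers) {
			c++
		}
		m.concurrency = c
		i++
	}
}

// AddressRemoved sketches the suspected fix: after stopping and dropping the
// manager for the removed frontend, recalculate the distribution so the
// remaining frontends absorb the lost connections.
func (w *querierWorker) AddressRemoved(address string) {
	w.mu.Lock()
	defer w.mu.Unlock()

	if m, ok := w.managers[address]; ok {
		m.stop()
		delete(w.managers, address)
		w.resetConcurrency() // without this call, the removed frontend's share is simply lost
	}
}

func main() {
	w := &querierWorker{
		maxConcurrent: 8,
		managers:      map[string]*manager{"fe4": {}, "fe5": {}, "fe6": {}},
	}
	w.resetConcurrency()
	w.AddressRemoved("fe5")
	total := 0
	for _, m := range w.managers {
		total += m.concurrency
	}
	fmt.Println("total concurrency after removal:", total) // still 8 with the fix
}
```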
Below is a graph showing achievement of `-querier.max-concurrent=8` during different phases.