Router sends requests to dead instances #480

Closed
Serpentian opened this issue Aug 2, 2024 · 1 comment

Serpentian commented Aug 2, 2024

This is a critical issue and must be fixed as soon as possible. When an instance is dead for some reason (e.g. its tx thread is 100% loaded by running "while true do end" on the instance), vshard continues to send requests to it, and they fail with a TimeOut error. This happens because the connection does not die while the tx thread is busy (wait_connected returns true), so the router still treats the instance as available. Manually disabling the instance always fixes the problem: the erroneous requests disappear.

while r do
    if r:is_connected() and (not prefer_replica or r ~= master) and
       replica_check_backoff(r, now) then
        return r
    end
    r = r.next_by_priority
end
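
The loop above accepts any replica that is connected and not in backoff, so a replica whose tx thread is stuck still passes the check. One possible direction, shown here only as a minimal sketch in plain Lua, is to count consecutive failed requests and put a replica into backoff once a limit is reached. The names FAIL_LIMIT, BACKOFF_INTERVAL, new_replica, on_request_result and pick_replica are hypothetical, and replica_check_backoff is re-implemented here for illustration; none of this is vshard's actual code.

```lua
-- Hypothetical sketch, not vshard's implementation.
local FAIL_LIMIT = 3        -- assumed: consecutive failures before backoff
local BACKOFF_INTERVAL = 5  -- assumed: seconds to skip a suspect replica

-- Minimal stand-in for a replica object with the fields the loop uses.
local function new_replica(name, connected)
    return {
        name = name,
        connected = connected,
        fail_count = 0,
        backoff_until = 0,
        next_by_priority = nil,
        is_connected = function(self) return self.connected end,
    }
end

-- Record the outcome of a request sent to replica r at time now.
local function on_request_result(r, ok, now)
    if ok then
        r.fail_count = 0
        return
    end
    r.fail_count = r.fail_count + 1
    if r.fail_count >= FAIL_LIMIT then
        -- Too many failures in a row: skip this replica for a while.
        r.backoff_until = now + BACKOFF_INTERVAL
        r.fail_count = 0
    end
end

-- Illustrative version: true when the replica is not in backoff.
local function replica_check_backoff(r, now)
    return now >= r.backoff_until
end

-- Same selection loop as in the issue, wrapped in a function.
local function pick_replica(first, master, prefer_replica, now)
    local r = first
    while r do
        if r:is_connected() and (not prefer_replica or r ~= master) and
           replica_check_backoff(r, now) then
            return r
        end
        r = r.next_by_priority
    end
    return nil
end

-- Usage: replica "a" stays connected but times out three times in a row.
local a = new_replica('a', true)
local b = new_replica('b', true)
a.next_by_priority = b
for _ = 1, FAIL_LIMIT do
    on_request_result(a, false, 100)
end
local chosen_during_backoff = pick_replica(a, nil, false, 101)
local chosen_after_backoff = pick_replica(a, nil, false, 106)
```

With this bookkeeping the connected-but-stuck replica is skipped while its backoff lasts and is tried again afterwards, so a recovered instance returns to rotation without manual action.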

Currently, router requests fail far too often, which affects users in mission-critical projects. vshard must try to minimize the number of timeout errors (and probably any other errors). We should consider the following approaches:

As a follow-up we may introduce the following ticket, but only if somebody asks for it, since it requires user intervention and writing triggers:

Serpentian added the bug, router, and complicated labels Aug 2, 2024
Serpentian self-assigned this Aug 2, 2024
Serpentian changed the title from "vshard sends requests to dead instances" to "Router sends requests to dead instances" Aug 2, 2024

sergepetrenko commented

There's also a ticket to make net.box track connection state: tarantool/tarantool#9563

But if we choose to fix that ticket, the fix will only reach new Tarantool releases, leaving vshard installations on older Tarantool versions still affected by this bug.

So I vote for fixing it separately in vshard.
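
A vshard-side fix that does not depend on net.box tracking connection state could, for instance, probe each replica actively. The following is a rough sketch in plain Lua under several assumptions: make_health_checker and stub_probe are hypothetical names, and the probe function is injected so the example runs without Tarantool. Inside Tarantool the probe might wrap net.box's conn:ping() with a timeout option, assuming a ping goes unanswered while the tx thread is blocked.

```lua
-- Hypothetical sketch: an active health check with an injected probe.
-- probe(conn, timeout) must return true when the instance answered in time.
local function make_health_checker(probe, timeout)
    local healthy = {}  -- replica name -> result of the last probe
    return {
        -- Probe one replica and remember the result.
        check = function(name, conn)
            -- pcall guards against a probe that raises an error.
            local ok, res = pcall(probe, conn, timeout)
            healthy[name] = ok and res == true
            return healthy[name]
        end,
        -- Replicas that were never probed count as unhealthy.
        is_healthy = function(name)
            return healthy[name] == true
        end,
    }
end

-- Stub probe for illustration: a "connection" is a table holding its
-- simulated response time in seconds.
local function stub_probe(conn, timeout)
    return conn.response_time <= timeout
end

local checker = make_health_checker(stub_probe, 0.5)
checker.check('alive', {response_time = 0.01})
checker.check('stuck', {response_time = math.huge})
```

The router's selection loop could then consult is_healthy in addition to is_connected. Because the check lives in the router, it would also work on Tarantool versions that never receive the net.box change.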
