VDiff: vttablet panics for vdiffs with multiple tables under heavy load #14346

Closed
rohit-nayak-ps opened this issue Oct 24, 2023 · 0 comments · Fixed by #14345

rohit-nayak-ps commented Oct 24, 2023

Overview of the Issue

There is a race condition in VDiff when a workflow has multiple tables. Consider a workflow with two tables.

  1. The diff of the first table is deemed done, for example because too many rows have errors. The corresponding shard streamers wait on cancellation of the context passed into vdiff. That context is cancelled either when action_timeout (1 hour) expires or when the controller's context is cancelled, which happens once the entire vdiff (for all tables) is done.
  2. The second table's diff starts. This creates a new channel, source.result = make(chan *sqltypes.Result, 1), for the shard streamers to send their rows on.
  3. The second table's diff ends. The corresponding shard streamers for the second table are also waiting on the same conditions as the first table.
  4. Since all tables have completed vdiff, the controller's context is done and both sets of shard streamers try to close(source.result), which points to the same channel for both because of the race.
  5. This results in a vttablet panic for the second streamer, because closing an already-closed channel panics in Go.
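
The sequence above can be reduced to a small, self-contained Go sketch. This is not the vttablet code: the `source` and `guardedSource` types, the function names, and the `chan int` element type (standing in for `chan *sqltypes.Result`) are hypothetical, and the steps run sequentially so the outcome is deterministic. The `sync.Once` guard is one common way to make a close idempotent; it is shown only for illustration and is not necessarily the approach taken in #14345.

```go
package main

import (
	"fmt"
	"sync"
)

// source is a hypothetical stand-in for the shared state; only the result
// field mirrors source.result from the issue.
type source struct {
	result chan int
}

// unguardedDoubleClose replays steps 1-4 sequentially and reports whether
// the second close panicked.
func unguardedDoubleClose() (panicked bool) {
	defer func() {
		if r := recover(); r != nil {
			panicked = true // "close of closed channel"
		}
	}()
	s := &source{}
	s.result = make(chan int, 1) // table 1's channel
	s.result = make(chan int, 1) // table 2 replaces it; table 1's streamers still hold s
	close(s.result)              // table 1's streamers close whatever s.result is now
	close(s.result)              // table 2's streamers close the same channel
	return false
}

// guardedSource pairs the channel with a sync.Once so that closing is idempotent.
type guardedSource struct {
	result    chan int
	closeOnce sync.Once
}

// closeResult closes the channel at most once, however many callers race to it.
func (g *guardedSource) closeResult() {
	g.closeOnce.Do(func() { close(g.result) })
}

// guardedDoubleClose performs the same double close through the guard.
func guardedDoubleClose() (panicked bool) {
	defer func() {
		if r := recover(); r != nil {
			panicked = true
		}
	}()
	g := &guardedSource{result: make(chan int, 1)}
	g.closeResult()
	g.closeResult() // no-op: the Once has already fired
	return false
}

func main() {
	fmt.Println("unguarded double close panicked:", unguardedDoubleClose())
	fmt.Println("guarded double close panicked:", guardedDoubleClose())
}
```

In a real process the unguarded panic is not recovered, so it takes the whole process down, which is consistent with the vttablet panic described here.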

I hit this race under load, with a load simulator running DMLs at ~1K QPS.

Reproduction Steps

No standalone reproduction yet; this was observed while working on #14345.

Binary Version

v19

rohit-nayak-ps self-assigned this Oct 24, 2023
rohit-nayak-ps changed the title from "VDiff: panic for vdiffs with multiple tables under heavy load" to "VDiff: vttablet panics for vdiffs with multiple tables under heavy load" Oct 24, 2023