fix(DeadlockTest): Handle draining of closed channel, speed up test. #1957

billprovince · 2023-05-30T17:43:42Z

Fixes DGRAPHCORE-153

Problem

We have had frequent problems with the TestPublisherDeadlockTest. Every once in a while, we would witness timeouts on the test itself. I witnessed that in at least one case, the problem could be attributed to a channel being "drained" but having incorrect handling in the case where the channel is closed. While this was not a majority root-cause for this issue, it is nevertheless something we should correctly handle. Additionally, we find that these tests can take quite a while -- most of the time simply waiting for go-routines to finish their time.Sleep(...) invocations. The use of time.Sleep to handle timing issues is typically a sign that the messaging has not been fully thought through.

Solution

For the issue with the "draining" not handling closed channels, the process is simple: we just add a check for the channel already being closed during the drain process. For the issue of the overuse of time.Sleep, and long-running invocations, I replaced the time.Sleep invocations with WaitGroup instances. After these changes, I am unable to prove that the full issue has been resolved, but I find that I have been unable to reproduce the original failures after these changes. Invoking go test -race -v --count=1000 -run TestPublisherDeadlock now completes in roughly 80 seconds (for the 1000 tests) with each test taking roughly 0.08 seconds as opposed to roughly 20 seconds (and multiple hours required to run 1000 tests). Moreover, on the 10 or so trials I have made to reproduced the original problem after this change, I get 100% test success.

Handle draining of closed channel.

3a991ec

billprovince requested review from akon-dey, joshua-goldstein and skrdgraph as code owners May 30, 2023 17:43

mangalaman93 approved these changes May 30, 2023

View reviewed changes

billprovince merged commit f9b9e4d into main May 30, 2023

billprovince deleted the HandleDrainingClosedChannel branch May 30, 2023 18:20

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix(DeadlockTest): Handle draining of closed channel, speed up test. #1957

fix(DeadlockTest): Handle draining of closed channel, speed up test. #1957

billprovince commented May 30, 2023 •

edited by mangalaman93

Loading

fix(DeadlockTest): Handle draining of closed channel, speed up test. #1957

fix(DeadlockTest): Handle draining of closed channel, speed up test. #1957

Conversation

billprovince commented May 30, 2023 • edited by mangalaman93 Loading

Problem

Solution

billprovince commented May 30, 2023 •

edited by mangalaman93

Loading