Distributed test suite: if `Threads.nthreads() > 1`, skip certain tests #42764

DilumAluthge · 2021-10-22T17:31:49Z

If Threads.nthreads() == 1, then we will run all tests.
If Threads.nthreads() > 1:
- If ENV["JULIA_TEST_INCLUDE_THREAD_UNSAFE"] == "true", then we will run all tests.
- Else, we will skip a few tests that are known to fail non-deterministically when there are multiple threads.
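
The rule above can be sketched as a small pure function. This is only an illustration: the name `should_run_thread_unsafe` and its two arguments are hypothetical, not the helper this PR adds.

```julia
# Hypothetical helper (not the PR's actual code): decide whether the
# thread-unsafe tests should run, given a thread count and an environment.
function should_run_thread_unsafe(nthreads::Integer, env::AbstractDict)
    # Single-threaded: every test runs.
    nthreads == 1 && return true
    # Multithreaded: run the thread-unsafe tests only on explicit opt-in.
    return get(env, "JULIA_TEST_INCLUDE_THREAD_UNSAFE", "false") == "true"
end

# Usage with the real process state:
should_run_thread_unsafe(Threads.nthreads(), ENV)
```

Passing the thread count and environment as arguments keeps the rule easy to check without restarting Julia under different settings.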

tkf

Yes, I think it's nice to mark thread-incompatible tests like this.

stdlib/Distributed/test/distributed_exec.jl

DilumAluthge · 2021-10-22T18:23:25Z

@tkf Should we rename some of the other functions to reflect the new name of the environment variable?

tkf · 2021-10-22T18:31:04Z

Yeah, maybe that's a good idea since it'd make it easy to move this to testhelper later if we want to use it elsewhere.

DilumAluthge · 2021-10-22T19:01:23Z

Any suggestions on a good naming scheme?

tkf · 2021-10-22T19:17:26Z

I think reflecting the environment variable, like you initially did, is a good approach. So run_distributed_multithreaded -> skip_if_multithreaded.

DilumAluthge · 2021-10-22T21:12:31Z

What about the naming for skipping tests?

Right now we have:

if run_distributed_multithreaded()
    @test f()
end

Maybe something like:

if !skip_this_test_if_multithreaded()
    @test f()
end

I don't love the double negative, though.

tkf · 2021-10-22T21:38:05Z

I like "skip" here since it is similar to @test_skip, so I think that outweighs the downside of the double negative. That said, I think using something like include_thread_unsafe is good too.

DilumAluthge · 2021-10-25T00:48:27Z

Marking this as a draft until I finish the renaming.

DilumAluthge · 2021-10-25T03:50:30Z

I'd like to avoid using double negatives, so I chose to go with include_thread_unsafe().

The environment variable is now named JULIA_TEST_INCLUDE_THREAD_UNSAFE.

And the function is now named include_thread_unsafe().
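
A minimal sketch of what a call site might look like with the final name. The body of `include_thread_unsafe` here is an assumption based on the PR description, `f` is a placeholder, and the `@test_skip` branch is just one possible way to keep skipped tests visible in the summary; it is not necessarily what this PR does.

```julia
using Test

# Assumed implementation, reconstructed from the PR description.
function include_thread_unsafe()
    Threads.nthreads() == 1 && return true
    return get(ENV, "JULIA_TEST_INCLUDE_THREAD_UNSAFE", "false") == "true"
end

f() = true  # placeholder for a check that is unsafe with multiple threads

@testset "thread-unsafe example" begin
    if include_thread_unsafe()
        @test f()
    else
        @test_skip f()  # recorded as Broken; f() is not evaluated
    end
end
```

With this naming the guard reads positively (`if include_thread_unsafe()`), which avoids the double negative discussed above.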

tkf

LGTM!

DilumAluthge · 2021-10-25T04:06:25Z

I'm going to wait for CI to finish, and then I'll check the CI logs to make sure the warning correctly shows up in the logs of the multi-threaded CI jobs.

DilumAluthge · 2021-10-25T06:03:20Z

The Buildkite logs look correct.

This PR is good to merge once CI is green.


KristofferC · 2021-11-08T14:57:19Z

If Distributed needs to be tested both threaded and unthreaded it should just do so, like:

julia/test/threads.jl

Lines 5 to 11 in d39b2c0

    
let cmd = `$(Base.julia_cmd()) --depwarn=error --rr-detach --startup-file=no threads_exec.jl`
    for test_nthreads in (1, 2, 4, 4) # run once to try single-threaded mode, then try a couple times to trigger bad races
        new_env = copy(ENV)
        new_env["JULIA_NUM_THREADS"] = string(test_nthreads)
        run(pipeline(setenv(cmd, new_env), stdout = stdout, stderr = stderr))
    end
end
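
Applied to a single test script, the pattern quoted above could be wrapped in a helper along these lines. This is a sketch only: `threaded_test_cmds` and its `thread_counts` keyword are hypothetical names, and the script path is whatever the caller passes in.

```julia
# Hypothetical helper: build one command per thread count, each with
# JULIA_NUM_THREADS set in a copy of the current environment.
function threaded_test_cmds(script::AbstractString; thread_counts = (1, 2, 4))
    base = `$(Base.julia_cmd()) --startup-file=no $script`
    return map(collect(thread_counts)) do n
        env = copy(ENV)
        env["JULIA_NUM_THREADS"] = string(n)
        setenv(base, env)
    end
end

# A caller would then run each command in turn, for example:
# foreach(run, threaded_test_cmds("distributed_exec.jl"))
```

Building the commands separately from running them makes it possible to inspect the per-run environment before launching any subprocess.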

It should not be an external "setting", where fully testing Julia requires running the full test suite multiple times with the different settings when 99%+ of the tests execute exactly the same no matter that setting.

DilumAluthge · 2021-11-08T15:13:30Z

This is incorrect. The entire Julia test suite needs to be run both singlethreaded and multithreaded, at least on a single platform. There are bugs that only show up in one case versus the other.

Just off the top of my head, there was a test for one of the invalid sysimage code paths that passed when singlethreaded but failed when multithreaded. More specifically, we were testing that there would not be a segfault when exercising a certain code path. However, when we ran the tests with multiple threads enabled, the test segfaulted. This indicated that there was a bug in this particular code path.

That test was not part of the Distributed test suite.

DilumAluthge · 2021-11-08T15:18:14Z

> when 99%+ of the tests execute exactly the same no matter that setting.

This is just not true. There are all kinds of subtle bugs in Julia that are absent when threading is disabled and are present when threading is enabled. The only way to test for these bugs is to run the test suite both with and without threading.

As I said on Slack, in order to conserve CI resources, we only do this on linux64. On linux64, we run the test suite both with and without threading. On all other architectures and operating systems, we only run the test suite once.


DilumAluthge added the parallelism, test, backport 1.6, and backport 1.7 labels Oct 22, 2021

DilumAluthge requested review from tkf, vchuravy and vtjnash October 22, 2021 17:31

DilumAluthge mentioned this pull request Oct 22, 2021

Distributed test suite: add an additional use of the poll_while function #42758

Closed

tkf reviewed Oct 22, 2021


KristofferC mentioned this pull request Oct 22, 2021

release-1.7: Backports for 1.7.0/1.7.0-rc3 #42765

Merged

66 tasks

vchuravy approved these changes Oct 24, 2021


DilumAluthge marked this pull request as draft October 25, 2021 00:48

DilumAluthge force-pushed the dpa/distributed-skip-when-multithreaded branch from 889f823 to e93654d Compare October 25, 2021 03:49

DilumAluthge marked this pull request as ready for review October 25, 2021 03:49

DilumAluthge requested a review from tkf October 25, 2021 03:49

tkf approved these changes Oct 25, 2021


DilumAluthge force-pushed the dpa/distributed-skip-when-multithreaded branch from e93654d to f5a4da3 Compare October 25, 2021 04:06

Distributed test suite: if Threads.nthreads() > 1, skip certain tests

2cb21db

DilumAluthge force-pushed the dpa/distributed-skip-when-multithreaded branch from f5a4da3 to 2cb21db Compare October 25, 2021 05:24

DilumAluthge added the merge me label Oct 25, 2021

DilumAluthge merged commit 0682132 into master Oct 25, 2021

DilumAluthge deleted the dpa/distributed-skip-when-multithreaded branch October 25, 2021 08:49

DilumAluthge removed the merge me label Oct 25, 2021

KristofferC pushed a commit that referenced this pull request Oct 28, 2021

Distributed test suite: if Threads.nthreads() > 1, skip certain tes…

532b3f8

…ts (#42764) (cherry picked from commit 0682132)

KristofferC pushed a commit that referenced this pull request Oct 29, 2021

Distributed test suite: if Threads.nthreads() > 1, skip certain tes…

bb44895

…ts (#42764) (cherry picked from commit 0682132)

KristofferC mentioned this pull request Oct 29, 2021

release-1.6: Backports for julia-1.6.4 #42147

Merged

95 tasks

DilumAluthge mentioned this pull request Nov 4, 2021

Distributed test suite: mark another test as thread-unsafe #42941

Merged

KristofferC pushed a commit that referenced this pull request Nov 11, 2021

Distributed test suite: if Threads.nthreads() > 1, skip certain tes…

93c6ea8

…ts (#42764) (cherry picked from commit 0682132)

KristofferC removed the backport 1.6 and backport 1.7 labels Nov 13, 2021

LilithHafner pushed a commit to LilithHafner/julia that referenced this pull request Feb 22, 2022

Distributed test suite: if Threads.nthreads() > 1, skip certain tes…

d3ce3fd

…ts (JuliaLang#42764)

LilithHafner pushed a commit to LilithHafner/julia that referenced this pull request Mar 8, 2022

Distributed test suite: if Threads.nthreads() > 1, skip certain tes…

5ab8b17

…ts (JuliaLang#42764)

staticfloat pushed a commit that referenced this pull request Dec 23, 2022

Distributed test suite: if Threads.nthreads() > 1, skip certain tes…

8b890e1

…ts (#42764) (cherry picked from commit 0682132)

vchuravy pushed a commit to JuliaLang/Distributed.jl that referenced this pull request Oct 6, 2023

Distributed test suite: if Threads.nthreads() > 1, skip certain tes…

e1771e4

…ts (JuliaLang/julia#42764) (cherry picked from commit 2cfdbec)

Keno pushed a commit that referenced this pull request Jun 5, 2024

Distributed test suite: if Threads.nthreads() > 1, skip certain tes…

2cfdbec

…ts (#42764)
