Use Polars ThreadPool #1927

ibENPC · 2021-11-29T14:05:59Z

As mentioned here: #1925 (comment)

ritchie46 · 2021-11-29T15:07:30Z

🙌

alippai · 2021-11-29T15:14:24Z

Is there a flag / option to prevent implicit (accidental) pool creation in Rayon?

ritchie46 · 2021-11-29T15:22:47Z

Don't know. It would indeed be beneficial.

alippai · 2021-11-29T16:11:19Z

Maybe calling https://docs.rs/rayon/1.0.0/rayon/struct.ThreadPoolBuilder.html#method.build_global at the tests teardown would help this (it fails if global threadpool exists)
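To make the teardown idea concrete, here is a minimal std-only sketch of the "initialize once, fail on the second attempt" behavior that the suggestion relies on. It is a model, not Rayon's code: `GlobalPool`, `build_global` and `global_pool_was_untouched` are hypothetical names defined here, and `OnceLock` stands in for whatever Rayon uses internally.

```rust
use std::sync::OnceLock;

/// Hypothetical stand-in for a process-wide thread pool configuration.
#[derive(Debug, PartialEq)]
pub struct GlobalPool {
    pub num_threads: usize,
}

/// The single global slot; it can be filled at most once per process.
static GLOBAL: OnceLock<GlobalPool> = OnceLock::new();

/// Tries to install the global pool. Fails if one was already installed,
/// which is the property the teardown check depends on.
pub fn build_global(num_threads: usize) -> Result<(), String> {
    GLOBAL
        .set(GlobalPool { num_threads })
        .map_err(|_| "global pool already initialized".to_string())
}

/// Teardown check: returns true only if nothing had initialized the
/// global pool before this call. Note that the check itself fills the slot.
pub fn global_pool_was_untouched() -> bool {
    build_global(1).is_ok()
}
```

One consequence visible in the sketch: the check consumes the slot, so it can give a meaningful answer only once per process, at the very end of the test run.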

ibENPC · 2021-11-29T16:24:21Z

> Maybe calling https://docs.rs/rayon/1.0.0/rayon/struct.ThreadPoolBuilder.html#method.build_global at the tests teardown would help this (it fails if global threadpool exists)

It's a smart idea.

However, it requires running tests that shouldn't be run for the moment (in PRs, for example), because cargo miri would complain about the remaining Rayon threads.
We could comment those tests out so that they are available when needed but not triggered in general. I don't know.

That's a little bit tricky 🤪

alippai · 2021-11-29T16:48:48Z

This might be even better. Drop the polars threadpool in teardown and if miri complains after the test, the global threadpool was initialized. I'm not sure what issues miri has right now, but if it's about lingering threads after test run, then this might work.

ibENPC · 2021-11-29T16:52:39Z

How can we drop the Polars ThreadPool?

alippai · 2021-11-29T16:54:20Z

I didn't use it before, but maybe https://docs.rs/rayon/1.5.1/rayon/struct.ThreadPool.html#impl-Drop?

> When the ThreadPool is dropped, that’s a signal for the threads it manages to terminate, they will complete executing any remaining work that you have spawned, and automatically terminate.
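The drop behavior quoted above can be illustrated with a small std-only pool. This is a sketch under stated assumptions, not Rayon's implementation: `MiniPool` and `run_jobs_then_drop` are hypothetical names, and joining the workers inside `drop` is a choice of this sketch (the quoted documentation only promises a termination signal).

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::mpsc::{channel, Sender};
use std::sync::{Arc, Mutex};
use std::thread::{self, JoinHandle};

type Job = Box<dyn FnOnce() + Send + 'static>;

/// Hypothetical miniature pool used only to illustrate drop semantics.
pub struct MiniPool {
    sender: Option<Sender<Job>>,
    workers: Vec<JoinHandle<()>>,
}

impl MiniPool {
    pub fn new(num_threads: usize) -> Self {
        let (sender, receiver) = channel::<Job>();
        let receiver = Arc::new(Mutex::new(receiver));
        let workers = (0..num_threads)
            .map(|_| {
                let receiver = Arc::clone(&receiver);
                thread::spawn(move || loop {
                    // The lock is held only while receiving, not while running the job.
                    let job = receiver.lock().unwrap().recv();
                    match job {
                        Ok(job) => job(),
                        // The channel is closed and drained: terminate.
                        Err(_) => break,
                    }
                })
            })
            .collect();
        MiniPool { sender: Some(sender), workers }
    }

    pub fn spawn<F: FnOnce() + Send + 'static>(&self, f: F) {
        let sender = self.sender.as_ref().expect("pool is alive");
        sender.send(Box::new(f)).expect("workers are alive");
    }
}

impl Drop for MiniPool {
    fn drop(&mut self) {
        // Closing the channel is the termination signal; queued jobs still run.
        drop(self.sender.take());
        // Sketch-specific choice: block until every worker has exited.
        for worker in self.workers.drain(..) {
            let _ = worker.join();
        }
    }
}

/// Spawns `jobs` increments, drops the pool, and returns the final count.
pub fn run_jobs_then_drop(num_threads: usize, jobs: usize) -> usize {
    let counter = Arc::new(AtomicUsize::new(0));
    let pool = MiniPool::new(num_threads);
    for _ in 0..jobs {
        let counter = Arc::clone(&counter);
        pool.spawn(move || {
            counter.fetch_add(1, Ordering::SeqCst);
        });
    }
    drop(pool);
    counter.load(Ordering::SeqCst)
}
```

Because this `drop` joins the workers, no thread is left running afterwards, which is the condition a leak check on lingering threads cares about.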

ibENPC · 2021-11-29T17:09:36Z

From what I understand, the ThreadPool is declared static as it can be seen here:

polars/polars/polars-core/src/lib.rs

Line 28 in fafde8d

pub static ref POOL: ThreadPool = ThreadPoolBuilder::new()

Therefore it can't be dropped.

I don't know if it is possible to change that in Polars and be able to manipulate the ThreadPool for the tests with those tricks while keeping an optimized behavior for the usual crate usage.
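One way to picture the trade-off raised here is a static slot that holds the pool behind a `Mutex<Option<...>>`, so that a test teardown can take the value out and drop it. This is only a sketch of the idea, not a proposal for how Polars should declare `POOL`: `FakePool`, `POOL_SLOT`, `with_pool` and `teardown_pool` are hypothetical names, and the default of 4 threads is arbitrary.

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Mutex;

/// Records whether the stand-in pool's destructor has run.
pub static POOL_DROPPED: AtomicBool = AtomicBool::new(false);

/// Hypothetical stand-in for a thread pool.
pub struct FakePool {
    pub num_threads: usize,
}

impl Drop for FakePool {
    fn drop(&mut self) {
        POOL_DROPPED.store(true, Ordering::SeqCst);
    }
}

/// A static slot whose contents, unlike a plain static value, can be dropped.
pub static POOL_SLOT: Mutex<Option<FakePool>> = Mutex::new(None);

/// Runs `f` with the pool, creating it lazily on first use.
/// The lock is held for the whole call, so callers are serialized.
pub fn with_pool<R>(f: impl FnOnce(&FakePool) -> R) -> R {
    let mut slot = POOL_SLOT.lock().unwrap();
    let pool = slot.get_or_insert_with(|| FakePool { num_threads: 4 });
    f(pool)
}

/// Takes the pool out of the slot and drops it.
/// Returns true if there was a pool to drop.
pub fn teardown_pool() -> bool {
    POOL_SLOT.lock().unwrap().take().is_some()
}
```

The cost is visible in `with_pool`: every access goes through a lock, and a nested call would deadlock, which is exactly the kind of overhead the "optimized behavior for the usual crate usage" concern is about.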

alippai · 2021-11-29T18:20:59Z

Would it help if instead of using the .build() we'd create the registry separately like:

lazy_static! {
    pub static ref REG: Arc<Registry> = Arc::new(Registry::new(ThreadPoolBuilder::new()
        .num_threads(
            std::env::var("POLARS_MAX_THREADS")
                .map(|s| s.parse::<usize>().expect("integer"))
                .unwrap_or_else(|_| num_cpus::get())
        ))?);
}

and then we'd be able to create the POOL as ThreadPool = ThreadPool { REG }? That way REG.terminate() would be accessible. I'm just guessing here as I'm not a Rust programmer. What do you think?

ibENPC · 2021-11-29T19:36:01Z

It is difficult for me to understand how or why it would work. I don't think so but I might be wrong.

Do not hesitate to test the code yourself to achieve a complete implementation of your idea, the Rust compiler will help you.

alippai · 2021-11-29T21:58:36Z

@ibENPC you are right, Registry is private, so we can't access it. I've opened rayon-rs/rayon#903; if they find it useful, we may use it in the future. Running POOL.terminate() at the end of the test makes miri happy.
I didn't dig deeper; the tests using POOL might need to run sequentially (similar to this issue: rust-lang/rust#47506).

ibENPC · 2021-11-30T13:34:40Z

@alippai thank you for the work, the problem is clearer.

If we run the tests sequentially we might find a solution by using an unsafe modification of POOL or another unsafe method. Otherwise it seems to imply a random game with data races 🤪

@ritchie46 Is it a problem to pass -Zmiri-ignore-leaks to Miri for the checks? There are many functions where we can use Rayon, and the miri test as it stands leaves only 2 choices:

- Avoid parallelizing --> best potential performance not achieved.
- Avoid running the tests --> best potential robustness not achieved.

I can open an issue if you think that the discussion is worth it.

ritchie46 · 2021-11-30T16:01:46Z

IMO it doesn't matter that miri doesn't run for all tests. If it can do that one day, great. But I don't want to reduce perf, so we can run a test.
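For reference, skipping a single test under Miri while keeping it in the normal test run can be done with the `#[cfg_attr(miri, ignore)]` attribute. The sketch below is an assumption about how that could look, not the change made in this PR: `parallel_sum` is a hypothetical function that uses std scoped threads in place of the Polars pool.

```rust
use std::thread;

/// Hypothetical parallel function: sums a slice on up to `num_threads` scoped threads.
pub fn parallel_sum(data: &[u64], num_threads: usize) -> u64 {
    let threads = num_threads.max(1);
    // Ceiling division, and never a zero chunk length (`chunks` would panic).
    let chunk_len = ((data.len() + threads - 1) / threads).max(1);
    thread::scope(|scope| {
        let handles: Vec<_> = data
            .chunks(chunk_len)
            .map(|chunk| scope.spawn(move || chunk.iter().sum::<u64>()))
            .collect();
        handles
            .into_iter()
            .map(|handle| handle.join().unwrap())
            .sum::<u64>()
    })
}

// Still run by a plain `cargo test`, but reported as ignored under `cargo miri test`.
#[test]
#[cfg_attr(miri, ignore)]
fn sums_in_parallel() {
    let data: Vec<u64> = (1..=100).collect();
    assert_eq!(parallel_sum(&data, 4), 5050);
}
```

This keeps the parallel code path untouched and only narrows what Miri checks, which matches the preference stated above for not reducing performance.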

Use Polars ThreadPool

3c75eab

ritchie46 merged commit 6dd9ded into pola-rs:master Nov 29, 2021

alippai mentioned this pull request Nov 29, 2021

Ability to terminate thread-pool upon shutdown rayon-rs/rayon#903

Open

ibENPC mentioned this pull request Dec 1, 2021

Ignore Miri leaks #1946

Closed
