Restructure multithreading #230

Merged: 11 commits into master on Mar 18, 2022
Conversation

HeroicKatora (Member)

Redefine the Worker trait, which allows the chosen worker to create a
new scope. This is relevant for a newly created (and installed) rayon
worker.

The rayon thread pool is now local to the decoding step. This fixes an
issue where improper task scheduling would deadlock decoding. It's not
clear how the intended task scheduling can be reliably achieved without
the guarantee of at least one additional, free worker thread.

Closes: #227

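For illustration, here is a minimal sketch of the "pool local to the decoding step" idea using rayon's public API; the function names and the pool size are hypothetical stand-ins, not the crate's actual code.

```rust
use rayon::ThreadPoolBuilder;

// Hypothetical decoding entry point: the pool lives only for the duration
// of this decode instead of relying on the global (or an ambient) pool.
fn decode_with_local_pool() -> Result<Vec<u8>, rayon::ThreadPoolBuildError> {
    let pool = ThreadPoolBuilder::new()
        // Ensures at least one worker besides the calling thread.
        .num_threads(2)
        .build()?;

    // Work spawned inside `install` is scheduled on `pool`, so a caller
    // that is itself running inside another rayon pool cannot starve or
    // deadlock the decoder's tasks.
    Ok(pool.install(|| decode_scans()))
}

// Stand-in for the actual decoding loop.
fn decode_scans() -> Vec<u8> {
    Vec::new()
}
```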
@HeroicKatora (Member Author) commented Feb 25, 2022

Added more tests, now based off the global pool, to demonstrate that this does not make a quantifiable difference, as suggested. @awused Did I miss any, or do you have more suggestions?

@HeroicKatora (Member Author) commented Feb 25, 2022

Okay, scratch that; this is why it was a draft. How about this: in the rayon case, we always wrap the main decoder loop in in_place_scope (the architecture should already be there after the rework). Then, instead of creating separate, parallel worker tasks, append_row runs the Immediate worker as a new scoped task. This is already guaranteed non-blocking and still single threaded, but I'll get to that. Both start and get_result will run directly on the immediate worker, without any of the dubious communication structure of the multithreaded module.

Now the critical part: we modify Worker::append_row such that multiple rows can be spawned at the same time. The rayon implementation can then create multiple tasks in its scope, which lets other threads work-steal from us. I'm not sure about the runtime implications of this scheme, but this way, making the rayon worker entirely non-blocking might be easier than imagined.
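Roughly, the shape described above might look like the following sketch (assuming rayon 1.5+ for in_place_scope); decode_rows and process_row are placeholders, not the crate's real API.

```rust
use rayon::in_place_scope;

// Placeholder for the per-row work the Immediate worker would perform.
fn process_row(row: Vec<u8>) {
    let _ = row;
}

// The main decoder loop runs inside an in-place scope; each appended row
// becomes a scoped task that idle threads in the pool can steal.
fn decode_rows(rows: Vec<Vec<u8>>) {
    in_place_scope(|scope| {
        for row in rows {
            // Analogous to Worker::append_row spawning work rather than
            // sending it to a dedicated worker thread over a channel.
            scope.spawn(move |_| process_row(row));
        }
        // The current thread keeps driving the loop and only waits for
        // outstanding tasks when the scope ends.
    });
}
```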

Moves it to a separate module. Rayon now spawns in-place closures for
workers, with no explicit communication. This can parallelize by
spawning multiple rows of data at the same time.
Sharing of the structure is, at the moment, purely incidental. This
has a small performance hit on the implementation of append_row
(singular) because it now also goes through the mutex phases.
@HeroicKatora (Member Author)

I've given the rayon usage a complete overhaul. There is no more blocking at all, only a Mutex to modify the shared result sets.
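A minimal sketch of what "a Mutex to modify the shared result sets" could look like; the types and indexing are illustrative assumptions, not the PR's actual implementation.

```rust
use std::sync::Mutex;

// Each component's output is appended into a shared, Mutex-protected
// buffer instead of being sent back over channels.
fn decode_components(rows: Vec<(usize, Vec<u8>)>, components: usize) -> Vec<Vec<u8>> {
    let results: Vec<Mutex<Vec<u8>>> =
        (0..components).map(|_| Mutex::new(Vec::new())).collect();

    rayon::in_place_scope(|scope| {
        for (component, row) in rows {
            let slot = &results[component];
            scope.spawn(move |_| {
                let processed = row; // stand-in for the real per-row work
                // The lock is held only long enough to append the output.
                slot.lock().unwrap().extend_from_slice(&processed);
            });
        }
    });

    results.into_iter().map(|m| m.into_inner().unwrap()).collect()
}
```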

Comment on lines +39 to +46
match prefer {
#[cfg(not(any(target_arch = "asmjs", target_arch = "wasm32")))]
#[cfg(feature = "rayon")]
PreferWorkerKind::Multithreaded => self::rayon::with_rayon(f),
#[cfg(not(any(target_arch = "asmjs", target_arch = "wasm32")))]
PreferWorkerKind::Multithreaded => self::multithreaded::with_multithreading(f),
_ => self::immediate::with_immediate(f),
}
@HeroicKatora (Member Author)

Note: this is not final, but set up to always prefer rayon over multithreading over immediate. Note that with this style we could also choose any available implementation at runtime.

src/worker/rayon.rs (outdated review thread, resolved)
@HeroicKatora marked this pull request as ready for review February 26, 2022 00:36
Through very careful use of borrows, where we take the right operations
in the right sequence, we can use the borrow checker to assert that all
tasks write to disjoint result slices. This avoids the need for runtime
borrow checking, which required synchronization.
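As a sketch of this borrow-based scheme, with illustrative names and a flat output buffer standing in for the actual result storage:

```rust
// chunks_mut hands out non-overlapping &mut slices, so the borrow checker
// proves the spawned tasks cannot alias each other and no Mutex or
// runtime borrow check is required.
fn process_rows_disjoint(output: &mut [u8], row_len: usize, rows: Vec<Vec<u8>>) {
    rayon::in_place_scope(|scope| {
        for (dest, src) in output.chunks_mut(row_len).zip(rows) {
            scope.spawn(move |_| {
                // Stand-in for the real per-row computation.
                let n = dest.len().min(src.len());
                dest[..n].copy_from_slice(&src[..n]);
            });
        }
    });
}
```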
@HeroicKatora requested a review from fintelia March 14, 2022 07:37
@HeroicKatora merged commit 66f31fd into master Mar 18, 2022
wartmanm pushed a commit to wartmanm/jpeg-decoder that referenced this pull request Sep 15, 2022
Successfully merging this pull request may close these issues:

rayon feature causes deadlocks when used inside of an existing rayon threadpool (#227)