
Changes to the background loading #425

Merged
paulfd merged 8 commits into sfztools:develop from the simpler-file-loading branch on Oct 7, 2020

Conversation

paulfd
Copy link
Member

@paulfd paulfd commented Sep 18, 2020

The goal was to reduce the number of background threads in big sessions (multiple instances) and to avoid loading the same file over and over on repeated notes (e.g. drum tracks).

  • All instances use a common thread pool
  • Each instance has a dispatching job that starts the background loader, and a garbage job that clears data that has been unused for some time
  • The "FilePromise" object has disappeared. Upon request, voices get a very thin reference-counting handle to the file data (sketched below). When the handle is released, the reference count is decreased and a "last used" timestamp is added to the file.
  • The background file loading job checks i) whether the file is already loaded, and ii) whether some other thread is already loading it. If either is true, it exits, assuming the background loading is already underway.
  • The garbage job has to be triggered regularly, e.g. by the synth. In the RT thread, the triggerGarbageCollection() method checks the list of previously loaded files to see whether any has not been used for a while. If so, the memory is marked to be discarded by a background thread; otherwise it waits for the next ping.

Only tested on Linux for now.
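
For illustration, the reference-counting handle could look roughly like this. This is a minimal sketch, not the actual sfizz code: the names FileData and FileDataHandle are made up for the example, and synchronization details are simplified.

```cpp
#include <atomic>
#include <chrono>
#include <vector>

struct FileData {
    std::vector<float> samples;                         // decoded audio
    std::atomic<int> refCount { 0 };                    // voices currently holding a handle
    std::chrono::steady_clock::time_point lastUsed {};  // stamped when the last handle goes away
};

class FileDataHandle {
public:
    explicit FileDataHandle(FileData* data) : data_(data)
    {
        if (data_)
            data_->refCount.fetch_add(1, std::memory_order_relaxed);
    }
    ~FileDataHandle()
    {
        // Last handle released: stamp the file so the garbage job can age it out.
        if (data_ && data_->refCount.fetch_sub(1, std::memory_order_acq_rel) == 1)
            data_->lastUsed = std::chrono::steady_clock::now();
    }
    FileDataHandle(const FileDataHandle&) = delete;
    FileDataHandle& operator=(const FileDataHandle&) = delete;
    const FileData& operator*() const { return *data_; }
private:
    FileData* data_ = nullptr;
};
```

The garbage job then just walks the known files and frees any whose reference count is zero and whose last-used timestamp is older than some threshold.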

@paulfd paulfd marked this pull request as draft September 18, 2020 16:21
@jpcima
Copy link
Collaborator

jpcima commented Sep 19, 2020

Some thoughts on this problem in general:

  • loading splits into two kinds of tasks:

    • input from disk (not including decoding)
    • processing the disk data: decoding and upsampling
  • given that disk loading is IO-bound, there is little or no benefit to multitasking this part;
    it may as well be a single thread which cycles over the file handles and delivers the required data chunks to processing threads.
    The IO thread passes non-decoded chunks to the CPU threads, and the CPU threads perform the decoding.

When IO and CPU work are fully decoupled, such that reading calls don't involve decoding (as opposed to a direct sf_read), disk IO can be driven at full throughput. The mechanism could possibly be implemented using the features of SF_VIRTUAL_IO.
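
To make the idea concrete, here is a rough sketch of what the SF_VIRTUAL_IO side could look like: the IO thread fills a byte buffer with the raw file contents, and a CPU thread decodes from that in-memory view, so the sf_readf_* calls never block on the disk. This is illustrative only (no error handling, and the integration with sfizz's FilePool is omitted).

```cpp
#include <sndfile.h>

#include <algorithm>
#include <cstdio>
#include <cstring>
#include <vector>

// In-memory view of a file's raw bytes, filled in advance by the IO thread.
struct MemoryView {
    const std::vector<unsigned char>* raw;
    sf_count_t pos = 0;
};

static sf_count_t memGetFilelen(void* user)
{
    auto* v = static_cast<MemoryView*>(user);
    return static_cast<sf_count_t>(v->raw->size());
}

static sf_count_t memSeek(sf_count_t offset, int whence, void* user)
{
    auto* v = static_cast<MemoryView*>(user);
    const sf_count_t size = static_cast<sf_count_t>(v->raw->size());
    switch (whence) {
    case SEEK_SET: v->pos = offset; break;
    case SEEK_CUR: v->pos += offset; break;
    case SEEK_END: v->pos = size + offset; break;
    }
    v->pos = std::min(std::max<sf_count_t>(v->pos, 0), size);
    return v->pos;
}

static sf_count_t memRead(void* ptr, sf_count_t count, void* user)
{
    auto* v = static_cast<MemoryView*>(user);
    const sf_count_t avail = static_cast<sf_count_t>(v->raw->size()) - v->pos;
    const sf_count_t n = std::max<sf_count_t>(0, std::min(count, avail));
    std::memcpy(ptr, v->raw->data() + v->pos, static_cast<size_t>(n));
    v->pos += n;
    return n;
}

static sf_count_t memTell(void* user)
{
    return static_cast<MemoryView*>(user)->pos;
}

// On the decoding thread, once the IO thread has delivered the bytes:
//   MemoryView view { &rawBytes };
//   SF_VIRTUAL_IO vio { memGetFilelen, memSeek, memRead, nullptr, memTell };
//   SF_INFO info {};
//   SNDFILE* snd = sf_open_virtual(&vio, SFM_READ, &info, &view);
//   ... sf_readf_float(snd, buffer, frames) ...
//   sf_close(snd);
```

The raw bytes themselves would be delivered by the single IO thread cycling over file handles, as described above.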

@paulfd
Copy link
Member Author

paulfd commented Sep 21, 2020

Yes, I agree; I'll look a bit into libsndfile's virtual IO to see whether there is a reasonable architecture. It might be hard to make something fully async though. What I mean is that if the virtual IO is a "blocking" one for the background threads, we might as well let the OS schedule them and their IO requests. The OS is the right place to make these decisions, since it has all the information required for proper scheduling.

@paulfd paulfd closed this Sep 21, 2020
@paulfd paulfd reopened this Sep 21, 2020
@paulfd
Copy link
Member Author

paulfd commented Sep 21, 2020

Thinking about it a bit more, I wonder whether the "better" problem to solve wouldn't be to have chunked reading in the thread pool too. The background loading could go from a single task that opens the file and then reads and decodes the data, to a succession of smaller tasks: open the file, read a chunk, read a chunk, read a chunk, ... potentially giving back the token after each subtask is done, thereby allowing all file loads to progress at a similar rate.
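
For instance, a single load could be expressed as a self-re-enqueueing chunk reader. This is a rough sketch under assumed names (the enqueue function and LoadState are hypothetical, not sfizz's actual API):

```cpp
#include <algorithm>
#include <cstddef>
#include <deque>
#include <functional>
#include <memory>

// Stand-in for the shared pool's FIFO; in sfizz this would be the common thread pool.
static std::deque<std::function<void()>> taskQueue;
static void enqueue(std::function<void()> task) { taskQueue.push_back(std::move(task)); }

struct LoadState {
    // A real implementation would also hold the open sound file and the destination buffer.
    size_t framesRead = 0;
    size_t totalFrames = 0;
};

constexpr size_t chunkFrames = 65536;

static void readNextChunk(std::shared_ptr<LoadState> state)
{
    // Read (and decode) at most chunkFrames frames here...
    state->framesRead += std::min(chunkFrames, state->totalFrames - state->framesRead);

    if (state->framesRead < state->totalFrames) {
        // Give the token back: re-enqueue the continuation so other pending
        // loads get a turn before this file reads its next chunk.
        enqueue([state] { readNextChunk(state); });
    }
}
```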

@jpcima
Copy link
Collaborator

jpcima commented Sep 23, 2020

f41ae6d means that tasks don't get served while there are as many sound files as processors already enqueued;
if these are heavy tasks, smaller tasks wait until some of them complete.

@paulfd
Copy link
Member Author

paulfd commented Sep 23, 2020

You are right, although I think this was always the case. If we try to answer this question, we're basically writing a scheduler. Would it be smart(er) to let the OS do the scheduling and spawn a new thread for each file load?

@jpcima
Copy link
Collaborator

jpcima commented Sep 23, 2020

Creating threads is usually considered a heavy operation; this remains something to measure, but there is certainly a reason that thread pools and coroutines are commonly used patterns.

I think it's OK to spawn a fixed thread pool with a moderately high count (i.e. 32, 64, ...).

IMO, we should preferably use some sort of scheduling that divides file loading into smaller subtasks and enqueues them FIFO, in order of urgency (high frequency notes qualify as more urgent). It does not get in the way of the OS scheduler.
The problem is divided by splitting files into fragments of some block size to read.
This can optimize the problem for latency, but throughput would be worse overall, for caching reasons.

All this is intuition; in the end we need a real benchmark of several strategies, with measurements of throughput and latency.
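
One possible shape for such an urgency-ordered fragment queue, purely as an illustration of the idea (the urgency metric and the field names are assumptions, not anything sfizz implements):

```cpp
#include <cstdint>
#include <queue>
#include <vector>

struct ReadFragment {
    int fileId;          // which sound file the block belongs to
    uint64_t offset;     // byte offset of the block to read
    uint64_t sequence;   // monotonically increasing; keeps FIFO order among equals
    int urgency;         // higher means served first
};

struct ByUrgency {
    bool operator()(const ReadFragment& a, const ReadFragment& b) const
    {
        if (a.urgency != b.urgency)
            return a.urgency < b.urgency;   // lower urgency sinks
        return a.sequence > b.sequence;     // otherwise first-enqueued first
    }
};

using FragmentQueue =
    std::priority_queue<ReadFragment, std::vector<ReadFragment>, ByUrgency>;
```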

@jpcima
Copy link
Collaborator

jpcima commented Sep 23, 2020

note: can we have an innocent look at how many threads ARIA has?

@paulfd
Copy link
Member Author

paulfd commented Sep 23, 2020

Creating threads is usually considered a heavy operation; this remains something to measure, but there is certainly a reason that thread pools and coroutines are commonly used patterns.

Yes, but as you said, maybe we should measure. In the grand scheme of things it might actually be acceptable, and we have built-in tooling to track this sort of thing.

I think it's OK to spawn a fixed thread pool with a moderately high count (i.e. 32, 64, ...).

This could be a good middle ground. I'll test it on more constrained platforms though, so that it does not bring them to their knees.

IMO, we should preferably use some sort of scheduling that divides file loading into smaller subtasks and enqueues them FIFO, in order of urgency (high frequency notes qualify as more urgent). It does not get in the way of the OS scheduler.

Yes, this is what I had in mind, although I'm not sure what you mean by high frequency (in terms of pitch?) in this context.

The problem is divided by splitting files into fragments of some block size to read.
This can optimize the problem for latency, but throughput would be worse overall, for caching reasons.
All this is intuition; in the end we need a real benchmark of several strategies, with measurements of throughput and latency.

Do you think this should also be part of this PR? My main goal here was more to get rid of the multiple queues and the per-instance threads, as well as to avoid repeatedly loading the same files in quick succession. There is more to do to improve on this front, for sure.

@paulfd
Copy link
Member Author

paulfd commented Sep 23, 2020

In particular, updating sfizz-render to "simulate" live playing (raising the thread priority and disabling freewheeling) would be helpful in evaluating different alternatives.

@jpcima
Copy link
Collaborator

jpcima commented Sep 23, 2020

Do you think this should also be part of this PR?

No.

Yes, this is what I had in mind, although I'm not sure what you mean by high frequency (in terms of pitch?) in this context.

If we have a pair of samples A and B, and B is to play at twice the rate of A, a more optimal delivery order is going to be

  • A, B, B, A, B, B, etc...

@paulfd
Copy link
Member Author

paulfd commented Sep 23, 2020

I see, yes, for sure. For this PR I'll try it on Android and a raspi with different numbers of threads.

@paulfd
Copy link
Member Author

paulfd commented Sep 24, 2020

On Windows, with concurrency - 2 threads and reading from a spinning drive, it works "OK". There are underruns if the preload size is too low, but otherwise drums can hammer away :) On an SSD it works well with the default preload size.

Raspi next.

@paulfd
Copy link
Member Author

paulfd commented Sep 27, 2020

OK, it works well enough on the raspi 3 for both LV2 and VST3. The overall performance of sfizz there is not awesome, but the background loader works. The rest should get better over time.

@paulfd paulfd marked this pull request as ready for review September 27, 2020 16:57
@paulfd paulfd requested a review from jpcima September 27, 2020 16:57
paulfd added 7 commits October 7, 2020 00:37
Honestly a stupid queue would be fine at this point
@paulfd paulfd force-pushed the simpler-file-loading branch from a68d856 to a0ad3b9 Compare October 6, 2020 22:37
@paulfd paulfd merged commit 2fc6aab into sfztools:develop Oct 7, 2020
@paulfd paulfd deleted the simpler-file-loading branch October 7, 2020 11:57