
Implement data prefetch #194

Open
jpsamaroo opened this issue Jan 21, 2021 · 0 comments · May be fixed by #199

Comments

@jpsamaroo
Member

If we send input data to a node that we know will execute a thunk in the future, before that thunk is ready to execute, and cache the data on that node, then the thunk can execute much more quickly (assuming network transfers are fully asynchronous and don't impede other thunk executions). We should allow the scheduler to do a small amount of this "prefetching" when a thunk has a large amount of input data associated with it.
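To make the idea concrete, here is a minimal Python sketch of the overlap being described: the transfer of a thunk's input runs on a background thread while other work proceeds, so that by the time the thunk executes, its input is already local. All names here (`prefetch`, `execute_thunk`, the dictionaries) are hypothetical illustrations, not this project's actual API.

```python
import threading
import time

# Hypothetical worker-side state: inputs arrive asynchronously and are
# cached locally until their thunk runs.
prefetched = {}     # input id -> data, filled by background transfers
transfer_done = {}  # input id -> threading.Event

def simulated_transfer(input_id, delay=0.05):
    """Stand-in for a network fetch of one thunk input."""
    time.sleep(delay)
    return f"data-for-{input_id}"

def prefetch(input_id):
    """Start fetching an input before its thunk is ready to execute."""
    done = threading.Event()
    transfer_done[input_id] = done

    def worker():
        prefetched[input_id] = simulated_transfer(input_id)
        done.set()

    threading.Thread(target=worker, daemon=True).start()

def execute_thunk(input_id):
    """By execution time the input is usually already cached locally."""
    transfer_done[input_id].wait()  # returns immediately if done
    return prefetched[input_id].upper()

prefetch("a")                 # scheduler knows this node will run the thunk
# ... other thunks can execute here while the transfer proceeds ...
result = execute_thunk("a")   # → "DATA-FOR-A"
```

The key property is that `prefetch` returns immediately: the wait inside `execute_thunk` is a no-op whenever the transfer finished before the thunk was scheduled, which is the common case this issue is aiming for.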

We'll need to be able to pre-allocate a processor for each thunk, implement a worker-local cache of received inputs, and make thunks check this cache for their inputs before moving them. We should also start modeling the memory availability of a given worker, the memory costs of inputs, and the estimated maximum memory allocation of each thunk.
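The worker-local cache plus the memory model described above might look something like the following Python sketch. The class name `InputCache` and its methods are hypothetical, assumed here only to illustrate the two behaviors the issue asks for: admitting a prefetched input only when the memory model says the worker has room, and letting thunks check the cache before falling back to a normal data move.

```python
class InputCache:
    """Illustrative worker-local cache of prefetched thunk inputs,
    with a simple byte-count memory model."""

    def __init__(self, capacity_bytes):
        self.capacity = capacity_bytes
        self.used = 0
        self.store = {}  # input id -> (data, size in bytes)

    def put(self, input_id, data, size):
        """Cache a prefetched input if the memory model allows it."""
        if self.used + size > self.capacity:
            return False  # worker lacks room; scheduler should not prefetch
        self.store[input_id] = (data, size)
        self.used += size
        return True

    def take(self, input_id):
        """Thunks check here for their inputs before moving them."""
        entry = self.store.pop(input_id, None)
        if entry is None:
            return None  # cache miss: caller falls back to a normal move
        data, size = entry
        self.used -= size
        return data

cache = InputCache(capacity_bytes=100)
cache.put("x", [1, 2, 3], size=24)  # scheduler prefetched input "x"
hit = cache.take("x")               # hit: the move is avoided entirely
```

A real implementation would need estimated sizes rather than exact ones, and an eviction or refusal policy when a thunk's estimated maximum allocation plus cached inputs would exceed the worker's available memory; this sketch only shows the accounting shape.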

@jpsamaroo jpsamaroo linked a pull request Jan 28, 2021 that will close this issue