
[Glow Runtime] Top Level Task #2045

Open
@qcolombet

Description


This is the top-level issue to track all the work we plan to do to make the Glow runtime support concurrent execution, pipelining, batching, and so on.

At a high level, the idea for the runtime is to be able to:

  • Enqueue inputs: Run input0, then run input1 as soon as the previous run is done, etc.
  • Slice the inputs into batches and transparently run them: Take N inputs and sequentially run them in batches of M (where M is the batch size of the compiled model and N the actual run size).
  • Pipeline work across models: Run input1 on model M1, then run the result of M1 on M2 while running input2 on M1, etc.
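The batch-slicing behavior above can be sketched in a few lines. This is an illustrative stand-in, not the Glow runtime API: the function name `sliceIntoBatches` and the use of plain `int` inputs are assumptions; in the real runtime the elements would be tensors and the padding would match the compiled model's batch dimension.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical sketch: slice N inputs into batches of M, padding the last
// partial batch with a fill value so every batch matches the compiled
// model's batch size.
std::vector<std::vector<int>> sliceIntoBatches(const std::vector<int> &inputs,
                                               size_t batchSize, int pad) {
  std::vector<std::vector<int>> batches;
  for (size_t i = 0; i < inputs.size(); i += batchSize) {
    size_t end = std::min(i + batchSize, inputs.size());
    std::vector<int> batch(inputs.begin() + i, inputs.begin() + end);
    batch.resize(batchSize, pad); // pad the final partial batch
    batches.push_back(std::move(batch));
  }
  return batches;
}
```

With N = 5 inputs and M = 2, this yields three batches, the last one padded, which the runtime would then run sequentially.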

Among other things, the Glow runtime will have to:

  • Manage input/output queues for each model (and communication with the devices)
  • Manage incoming models
  • Keep track of data dependencies and schedule the next tasks to run
  • Split inputs
  • Pad inputs
  • Dispatch workloads onto devices
  • Keep track of the status of each device

Also, somewhat orthogonal to the runtime but related, Glow will need to:

  • Determine what to run and where to run it (graph partitioning)
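The input-queue responsibility above can be sketched as a worker thread that pops enqueued inputs and runs them one after another, so input1 starts as soon as input0 finishes. All names here (`InputQueue`, `runOnDevice`) are hypothetical and not the actual Glow runtime API; a real implementation would carry tensors and device handles rather than `int`s.

```cpp
#include <condition_variable>
#include <functional>
#include <mutex>
#include <queue>
#include <thread>
#include <utility>

// Hypothetical sketch of a per-model input queue: enqueue() is
// non-blocking, and a single worker drains the queue sequentially.
class InputQueue {
public:
  explicit InputQueue(std::function<void(int)> runOnDevice)
      : run_(std::move(runOnDevice)), worker_([this] { loop(); }) {}

  ~InputQueue() {
    {
      std::lock_guard<std::mutex> lock(m_);
      done_ = true;
    }
    cv_.notify_one();
    worker_.join(); // drain remaining inputs, then stop
  }

  void enqueue(int input) {
    {
      std::lock_guard<std::mutex> lock(m_);
      q_.push(input);
    }
    cv_.notify_one();
  }

private:
  void loop() {
    for (;;) {
      int input;
      {
        std::unique_lock<std::mutex> lock(m_);
        cv_.wait(lock, [this] { return done_ || !q_.empty(); });
        if (q_.empty())
          return; // done_ was set and the queue is drained
        input = q_.front();
        q_.pop();
      }
      run_(input); // previous run finished before the next one starts
    }
  }

  std::function<void(int)> run_;
  std::mutex m_;
  std::condition_variable cv_;
  std::queue<int> q_;
  bool done_ = false;
  std::thread worker_; // declared last so it starts after the other members
};
```

Pipelining across models would then chain two such queues: the completion callback of model M1's queue enqueues the result onto M2's queue, letting M1 start on the next input in the meantime.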

As a first step, we have started by properly splitting the compilation and runtime stages.
This work is tracked in:
#2040, #1967, #1953, #1951
