The classic approach of using a generic Provider as in core SLEAP is not always ideal for the "streaming inference" workload, where we're just reading through frames of a video sequentially (as opposed to doing random access in training/labels inference).
In particular, we take a big performance hit by having to re-open and/or seek to new frames when decoding a video sequentially.
A better approach, specialized to streaming inference, is a concurrent producer-consumer pattern.
In the video reader process:
```python
import attrs
import multiprocessing as mp
import sleap_io as sio


@attrs.define
class VideoReader:
    video: sio.Video
    max_size: int = 8
    frame_queue: mp.Queue = attrs.field(init=False)

    def __attrs_post_init__(self):
        self.frame_queue = mp.Queue(maxsize=self.max_size)

    def start(self, frame_inds):
        for frame_idx in frame_inds:
            img = self.video[frame_idx]
            self.frame_queue.put({"frame_idx": frame_idx, "img": img})  # blocks if full
```
Then VideoReader is run in an mp.Process so it's fully concurrent. Data is shared across processes via the frame_queue.
We'll also need a way to stop the reader if the whole thing is cancelled (see the multiprocessing docs for patterns for this); a sketch is below.
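A minimal sketch of that wiring, assuming an mp.Event for cooperative cancellation (the `run_reader` helper and `stop_event` name are illustrative, and whether sio.Video objects pickle cleanly into a child process is worth verifying):

```python
import multiprocessing as mp


def run_reader(reader, frame_inds, stop_event):
    """Producer loop that checks a shared event so it can be cancelled."""
    for frame_idx in frame_inds:
        if stop_event.is_set():  # cancellation requested by the consumer
            break
        img = reader.video[frame_idx]
        reader.frame_queue.put({"frame_idx": frame_idx, "img": img})  # blocks if full


stop_event = mp.Event()
proc = mp.Process(target=run_reader, args=(reader, range(len(reader.video)), stop_event))
proc.start()
# ... consume from reader.frame_queue ...
stop_event.set()  # request shutdown on cancellation
proc.join()
```

One caveat: a producer blocked on a full queue won't see the event until a slot frees up, so the consumer should keep draining the queue while shutting down.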
Then in the inference process, it'll be something like:
```python
import numpy as np

# Build a batch.
batch = []
for i in range(batch_size):
    batch.append(video_reader.frame_queue.get())

# Process images.
imgs = np.concatenate([x["img"] for x in batch], axis=0)
predictions = predictor.predict(imgs)
```
This is done inside an outer loop over batches. The number of batches can be pre-determined from the video metadata, or discovered greedily in a while loop that checks whether any frames remain each iteration.
Special case to handle: incomplete batches (these will hang one of the processes in the current formulation -- suggestion: poison pill method, sketched below).
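A minimal sketch of the poison-pill idea, assuming the reader puts a `None` sentinel on the queue after its last frame (the sentinel convention here is illustrative):

```python
import numpy as np

# Producer side, after the last frame has been enqueued:
video_reader.frame_queue.put(None)  # poison pill: end of stream

# Consumer side: flush whatever remains as a final, possibly incomplete batch.
done = False
while not done:
    batch = []
    while len(batch) < batch_size:
        item = video_reader.frame_queue.get()
        if item is None:  # poison pill seen: no more frames coming
            done = True
            break
        batch.append(item)
    if batch:
        imgs = np.concatenate([x["img"] for x in batch], axis=0)
        predictions = predictor.predict(imgs)
```

This way neither side blocks forever: the producer always enqueues the sentinel, and the consumer always drains the queue up to it.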
Plan
VideoReader: This is the base video reader class, a subclass of threading.Thread. Its inputs are a sleap_io.Video object, a queue whose pre-defined max size serves as the batch size, and a tuple of (start, end) frames defining the range of frames to process. If the tuple is None, all frames in the video are processed. The class overrides the run method of the parent Thread class to read frames at the given indices and put each one onto the buffer queue as a list [image, frame_idx, orig_size, video_file_name]. Once the reader has no more frames to load, it appends None values to the queue so that consumers know to stop; a sketch of this class follows.
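A minimal sketch of that planned class, assuming a sio.Video indexes as video[idx] and exposes filename and len() (the queue payload matches the description above; the rest is illustrative):

```python
import queue
import threading
from typing import Optional, Tuple

import sleap_io as sio


class VideoReader(threading.Thread):
    """Producer thread that fills a buffer queue with frames."""

    def __init__(
        self,
        video: sio.Video,
        frame_buffer: queue.Queue,
        frames: Optional[Tuple[int, int]] = None,
    ):
        super().__init__()
        self.video = video
        self.frame_buffer = frame_buffer  # maxsize acts as the batch size
        # If no (start, end) tuple is given, process all frames in the video.
        self.start_idx, self.end_idx = frames if frames is not None else (0, len(video))

    def run(self):
        for idx in range(self.start_idx, self.end_idx):
            img = self.video[idx]
            # Payload: [image, frame_idx, orig_size, video_file_name].
            self.frame_buffer.put([img, idx, img.shape, self.video.filename])
        # No more frames: append None values to terminate the consumer.
        self.frame_buffer.put([None, None, None, None])
```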
The frames are consumed (similar to the code below) and grouped into batches before being passed to a trained model, which returns the predicted instances via the Predictor class. The predicted instances can then be saved to a .slp file or returned as a list of predictions.
```python
import numpy as np

# Build a batch.
batch = []
for i in range(batch_size):
    batch.append(video_reader.frame_queue.get())

# Process images.
imgs = np.concatenate([x["img"] for x in batch], axis=0)
predictions = predictor.predict(imgs, return_labels=False)
```
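Putting the pieces together, the consumer for the threaded reader above could look like this (the return_labels=False call is taken from the snippet above; the wiring, sentinel check, and predictor object are illustrative):

```python
import queue

import numpy as np

batch_size = 8  # also used as the buffer's max size
frame_buffer = queue.Queue(maxsize=batch_size)
reader = VideoReader(video, frame_buffer)
reader.start()

all_predictions = []
done = False
while not done:
    batch = []
    while len(batch) < batch_size:
        img, frame_idx, orig_size, video_file = frame_buffer.get()
        if img is None:  # sentinel from the reader: no frames left
            done = True
            break
        batch.append(img)
    if batch:
        imgs = np.stack(batch, axis=0)  # (batch, height, width, channels)
        all_predictions.extend(predictor.predict(imgs, return_labels=False))

reader.join()
```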