Allow ffmpeg-python backend for torchvision.io.write_video? #8569

adaGrad1 · 2024-08-08T01:14:07Z

🚀 The feature

Add a backend for torchvision.io.write_video that uses ffmpeg-python but otherwise has exactly the same interface and functionality.

Motivation, pitch

torchvision.io.write_video currently calls PyAV, which in turn is a wrapper for ffmpeg. PyAV has a seemingly still-unresolved issue in which setting the CRF (constant rate factor) through the options has no effect. This issue has been referenced as recently as March of this year. As far as I can tell, adjusting CRF is the canonical way to tune a video's level of compression. Adding support for ffmpeg-python as a backend would let users set CRF directly and so choose the level of compression they want.

Alternatives

If there is some other set of options which can be passed to write_video to alter the level of compression, that would be an acceptable alternative (at least for my use-case). In this case, it would be ideal to include this alternative set of options in the write_video documentation as an example.

Additional context

I already kind of got it working in a notebook, but it's missing support for audio and such.

import ffmpeg  # provided by the ffmpeg-python package

# video_array is assumed to be a uint8 NumPy array of shape (T, H, W, 3), RGB

# Define output video parameters
output_filename = 'output_video.mp4'
fps = 30
codec = 'libx264'
height, width = video_array.shape[1:3]

# Create the input process from the NumPy array
# r=fps on the input keeps ffmpeg from assuming its 25 fps default for raw video
process1 = (
    ffmpeg
    .input('pipe:', format='rawvideo', pix_fmt='rgb24', s=f'{width}x{height}', r=fps)
    .output(output_filename, pix_fmt='yuv420p', r=fps, vcodec=codec, crf=10)
    .overwrite_output()
    .run_async(pipe_stdin=True)
)

# Write the NumPy array to the input pipe, one raw frame at a time
for frame in video_array:
    process1.stdin.write(frame.tobytes())

# Close the input pipe so ffmpeg sees end-of-stream
process1.stdin.close()

# Wait for the ffmpeg process to finish
process1.wait()

crf=10 produces something good-looking, while crf=50 produces something very compressed-looking as expected.
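
The snippet above could be wrapped into a helper with a write_video-like signature. The following is only a minimal sketch under stated assumptions: `write_video_ffmpeg` and `ffmpeg_stream_kwargs` are hypothetical names, not part of torchvision or ffmpeg-python; the input is assumed to be a uint8 NumPy array of shape (T, H, W, 3); and the default `crf=23` is chosen here because it matches libx264's own default. It still has no audio support.

```python
def ffmpeg_stream_kwargs(shape, fps, crf, codec="libx264"):
    """Build keyword arguments for ffmpeg.input() and ffmpeg.output().

    `shape` is (T, H, W, 3), the layout used in the snippet above.
    Hypothetical helper for illustration only.
    """
    if len(shape) != 4 or shape[3] != 3:
        raise ValueError(f"expected shape (T, H, W, 3), got {tuple(shape)}")
    _, height, width, _ = shape
    input_kwargs = {
        "format": "rawvideo",
        "pix_fmt": "rgb24",
        "s": f"{width}x{height}",
        "r": fps,  # raw video carries no timestamps, so state the rate explicitly
    }
    output_kwargs = {"pix_fmt": "yuv420p", "vcodec": codec, "crf": crf}
    return input_kwargs, output_kwargs


def write_video_ffmpeg(filename, video_array, fps, crf=23, codec="libx264"):
    """Encode a (T, H, W, 3) uint8 NumPy array to `filename` (video only).

    Hypothetical helper, not an existing torchvision API.
    """
    import ffmpeg  # ffmpeg-python; imported lazily so the helper above has no dependencies

    input_kwargs, output_kwargs = ffmpeg_stream_kwargs(video_array.shape, fps, crf, codec)
    process = (
        ffmpeg
        .input("pipe:", **input_kwargs)
        .output(filename, **output_kwargs)
        .overwrite_output()
        .run_async(pipe_stdin=True)
    )
    try:
        # Stream frames one at a time instead of building one large byte string
        for frame in video_array:
            process.stdin.write(frame.tobytes())
    finally:
        # Always close stdin and reap the process, even if a write fails
        process.stdin.close()
        process.wait()
```

With this sketch, a call such as `write_video_ffmpeg('output_video.mp4', video_array, fps=30, crf=10)` would correspond to the notebook code above; a torch tensor would first need converting, for example with `.numpy()`.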


NicolasHug · 2024-10-11T11:39:31Z

Hi @adaGrad1, and thank you for the feature request. We'll be making a wider announcement soon, but we plan to migrate video decoding/encoding efforts away from torchvision/torchaudio and consolidate all of that within https://github.com/pytorch/torchcodec/. At this time video encoding isn't implemented in torchcodec, but it can be in scope.
It does mean, however, that we won't be able to add further video encoding capabilities to torchvision, so I'm afraid we won't be adding the ffmpeg-python backend in vision.
We'll definitely keep that CRF issue in mind while working on the torchcodec encoder, though.

N00bcak mentioned this issue on Aug 9, 2024 in "Draft for better write_video documentation" #8576 (merged)

NicolasHug closed this as completed in #8576 Oct 11, 2024
