🚀 The feature
Add a second backend for torchvision.io.write_video that uses ffmpeg-python but otherwise keeps exactly the same interface and functionality.
Motivation, pitch
torchvision.io.write_video currently calls PyAV, which is in turn a wrapper around ffmpeg. PyAV has an issue, apparently still unresolved, where setting the CRF (constant rate factor) through the options has no effect. This issue has been referenced as recently as March of this year. As far as I can tell, adjusting CRF is the canonical way to tune a video's level of compression. Adding ffmpeg-python as a backend would let users set CRF and so choose the level of compression directly.
Alternatives
If there is some other set of options which can be passed to write_video to alter the level of compression, that would be an acceptable alternative (at least for my use-case). In this case, it would be ideal to include this alternative set of options in the write_video documentation as an example.
Additional context
I already kind of got it working in a notebook, but it's missing support for audio and such.
import ffmpeg  # provided by the ffmpeg-python package

# video_array is assumed to be a uint8 NumPy array of shape (T, H, W, 3) in RGB order
# Define output video parameters
output_filename = 'output_video.mp4'
fps = 30
codec = 'libx264'
height, width = video_array.shape[1:3]
# Create the ffmpeg process that reads raw RGB frames from stdin.
# r=fps on the input keeps ffmpeg from assuming its 25 fps rawvideo default.
process1 = (
    ffmpeg
    .input('pipe:', format='rawvideo', pix_fmt='rgb24', s='{}x{}'.format(width, height), r=fps)
    .output(output_filename, pix_fmt='yuv420p', r=fps, vcodec=codec, crf=10)
    .overwrite_output()
    .run_async(pipe_stdin=True)
)
# Write the NumPy array to the input pipe, one frame at a time
for frame in video_array:
    process1.stdin.write(frame.tobytes())
# Close the input pipe
process1.stdin.close()
# Wait for the ffmpeg process to finish
process1.wait()
With this, crf=10 produces good-looking output, while crf=50 produces visibly heavily compressed output, as expected.
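For reference, the ffmpeg-python graph above boils down to a single ffmpeg command line. Below is a rough standard-library sketch of that command, assuming an `ffmpeg` binary is on the PATH. The function names `ffmpeg_rawvideo_args` and `encode_frames` are hypothetical and not part of torchvision, PyAV, or ffmpeg-python, and the 0-51 CRF check assumes 8-bit libx264.

```python
import subprocess


def ffmpeg_rawvideo_args(width, height, fps, output_filename, crf=23, codec="libx264"):
    """Build an ffmpeg command line that encodes raw RGB24 frames read from stdin."""
    # Assumption: 0-51 is the CRF range for 8-bit libx264; other encoders differ.
    if codec == "libx264" and not 0 <= crf <= 51:
        raise ValueError(f"crf must be in [0, 51] for libx264, got {crf}")
    return [
        "ffmpeg",
        "-y",                        # overwrite the output file if it exists
        "-f", "rawvideo",            # input is headerless raw frames
        "-pix_fmt", "rgb24",         # input pixel format, 3 bytes per pixel
        "-s", f"{width}x{height}",   # input frame size
        "-r", str(fps),              # input frame rate
        "-i", "pipe:",               # read the frames from stdin
        "-pix_fmt", "yuv420p",       # output pixel format
        "-vcodec", codec,
        "-crf", str(crf),
        output_filename,
    ]


def encode_frames(frames, width, height, fps, output_filename, crf=23, codec="libx264"):
    """Pipe an iterable of raw RGB24 frames (bytes objects) into ffmpeg."""
    args = ffmpeg_rawvideo_args(width, height, fps, output_filename, crf, codec)
    process = subprocess.Popen(args, stdin=subprocess.PIPE)
    try:
        for frame in frames:
            if len(frame) != width * height * 3:
                raise ValueError("each frame must be width * height * 3 bytes")
            process.stdin.write(frame)
    finally:
        # Always close the pipe and reap the process, even if a frame was invalid.
        process.stdin.close()
        returncode = process.wait()
    return returncode
```

With the notebook's array, this could be called as `encode_frames((f.tobytes() for f in video_array), video_array.shape[2], video_array.shape[1], 30, 'output_video.mp4', crf=10)`. Like the snippet above, it handles video only and has no audio support.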
Hi @adaGrad1, and thank you for the feature request. We'll be making a wider announcement soon, but we plan to migrate video decoding/encoding efforts away from torchvision/torchaudio and consolidate all of that within https://github.com/pytorch/torchcodec/. At this time video encoding isn't implemented in torchcodec, but it can be in scope.
It does mean, however, that we won't be able to add further video encoding capabilities to torchvision, so I'm afraid we won't be adding the ffmpeg-python backend in vision.
We'll definitely keep that CRF issue in mind while working on the torchcodec encoder, though.