Add ffmpeg integration #1986
@mthrok has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.
This PR adds a prototype streaming API based on ffmpeg.
Currently, the Python surface is a minimal implementation and needs refinement.

- [x] Handles audio
  - [x] Can change the sampling rate on the fly.
  - [x] Can change the sample format on the fly.
    - [x] (Implemented and tested) `uint8`, `int16`, `int32`, and `float32`
    - [ ] (Implemented but not tested) `int64` and `float64`
  - [ ] Can change the resampling algorithm.
- [x] Handles video
  - [x] Can change the image resolution on the fly.
  - [ ] ~Can change the rescaling algorithm.~ Covered via custom filter
  - [x] Can change the frame rate on the fly.
  - [ ] ~Can change the algorithm for changing the frame rate.~ Covered via custom filter
- [x] Handles multiple input streams.
- [x] Handles multiple output streams.
  - [x] Can duplicate input streams for different output configurations.
- [x] avdevice integration
- [x] Seek mechanism
- [ ] Figure out the way to build / package.
  - [x] Split the binary into `libtorchaudio` and `libtorchaudio_ffmpeg`.
  - [x] If binding dynamically, figure out the best way to detect installed ffmpeg across platforms. (`pkg-config` seems straightforward.)
  - [ ] If binding statically, figure out which dependencies to include.
    - [ ] Building and shipping all the dependencies of ffmpeg (such as `gnutls`, `x264`, `x265`, etc.) requires a lot of effort.
- [ ] Add API similar to existing `torchaudio.load` function for simple use cases.
  - [x] Add the function definition
  - [ ] Add the parameters

Statically binding ffmpeg increases the binary size from 3.6M to 23M.

```
-rwxr-xr-x 1 moto staff 3.6M Nov 5 00:43 torchaudio/lib/libtorchaudio.so
-rw-r--r-- 1 moto staff 23M Nov 5 04:23 torchaudio/lib/libtorchaudio.so
```
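For the dynamic-binding item, a minimal sketch of what `pkg-config`-based detection could look like from a Python build script. The helper name `detect_ffmpeg` and the list of libraries are assumptions for illustration; the PR does not state which ffmpeg libraries the build probes or how.

```python
import shutil
import subprocess

# Assumption: these are the ffmpeg components of interest. The names match
# the pkg-config (.pc) files that ffmpeg installs.
FFMPEG_LIBS = ("libavcodec", "libavformat", "libavfilter", "libavdevice", "libavutil")


def detect_ffmpeg(libs=FFMPEG_LIBS):
    """Return {library: version string or None} as reported by pkg-config."""
    found = {name: None for name in libs}
    if shutil.which("pkg-config") is None:
        # pkg-config itself is not installed, so nothing can be detected.
        return found
    for name in libs:
        result = subprocess.run(
            ["pkg-config", "--modversion", name],
            capture_output=True,
            text=True,
        )
        if result.returncode == 0:
            found[name] = result.stdout.strip()
    return found


if __name__ == "__main__":
    for name, version in detect_ffmpeg().items():
        print(f"{name}: {version or 'not found'}")
```

A build script could use the result to decide whether to build the separate `libtorchaudio_ffmpeg` binary at all. Platforms without `pkg-config` would need a fallback, which this sketch does not attempt.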
…#2041) Summary: Part of pytorch#1986. Splitting the PR for easier review. Add wrapper classes that automatically release memory allocated by ffmpeg libraries. For the overall architecture, see https://github.com/mthrok/audio/blob/ffmpeg/torchaudio/csrc/ffmpeg/README.md. Note: Without a change to build process, the code added here won't be compiled. The build process will be updated later. - [x] Needs to be imported after updating TARGETS file. Pull Request resolved: pytorch#2041 Reviewed By: carolineechen Differential Revision: D32688964 Pulled By: mthrok fbshipit-source-id: 165bef5b292dbedae4e9599d53fb2a3f06978db8
Summary: Part of pytorch#1986. Splitting the PR for easier review. Add `Decoder` class that manages the `AVCodecContext` resource and processes input `AVPacket`s. For the overall architecture, see https://github.com/mthrok/audio/blob/ffmpeg/torchaudio/csrc/ffmpeg/README.md. Note: Without a change to build process, the code added here won't be compiled. The build process will be updated later. Needs to be imported after pytorch#2041. Pull Request resolved: pytorch#2042 Reviewed By: carolineechen Differential Revision: D32933294 Pulled By: mthrok fbshipit-source-id: e443debadb44d491462fb641cd5b7b20c413b5b9
Summary: Part of pytorch#1986. Splitting the PR for easier review. Add FilterGraph class that is responsible for handling AVFilterGraph structure and the application of filters. For the overall architecture, see https://github.com/mthrok/audio/blob/ffmpeg/torchaudio/csrc/ffmpeg/README.md. Note: Without a change to build process, the code added here won't be compiled. The build process will be updated later. Needs to be imported after pytorch#2042. Pull Request resolved: pytorch#2043 Reviewed By: carolineechen Differential Revision: D32940535 Pulled By: mthrok fbshipit-source-id: 231e3ad17df2d67b6c7b323e5c89e718a3d48d0d
Summary: Part of pytorch#1986. Splitting the PR for easier review. Add Buffer class that is responsible for converting `AVFrame` to `Tensor`. Note: The API to retrieve the buffered Tensors is tentative. For the overall architecture, see https://github.com/mthrok/audio/blob/ffmpeg/torchaudio/csrc/ffmpeg/README.md. Note: Without a change to build process, the code added here won't be compiled. The build process will be updated later. Needs to be imported after pytorch#2043. Pull Request resolved: pytorch#2044 Reviewed By: carolineechen Differential Revision: D32940553 Pulled By: mthrok fbshipit-source-id: 8b8b2222ad7b47edc17e9139420e8a71c00d726a
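To illustrate the kind of work an `AVFrame`-to-`Tensor` conversion involves for audio, here is a minimal sketch in plain Python. It assumes packed (interleaved) little-endian `int16` samples and produces a frames-by-channels layout, with nested lists standing in for a Tensor. The function name is hypothetical, and scaling to floats is a common convention rather than something the PR states the `Buffer` class does.

```python
import struct


def packed_s16_to_float_frames(data: bytes, num_channels: int):
    """Convert packed (interleaved) little-endian int16 PCM to float frames.

    Returns a list of frames; each frame holds one float per channel.
    """
    bytes_per_frame = 2 * num_channels
    if num_channels < 1 or len(data) % bytes_per_frame != 0:
        raise ValueError("data must contain a whole number of frames")
    num_samples = len(data) // 2
    samples = struct.unpack(f"<{num_samples}h", data)
    scale = 1.0 / 32768.0  # maps the int16 range onto [-1.0, 1.0)
    # Samples are interleaved: L0 R0 L1 R1 ... so step by num_channels.
    return [
        [samples[i + ch] * scale for ch in range(num_channels)]
        for i in range(0, num_samples, num_channels)
    ]


if __name__ == "__main__":
    # Two stereo frames: (0, 16384) and (-32768, 32767).
    raw = struct.pack("<4h", 0, 16384, -32768, 32767)
    for frame in packed_s16_to_float_frames(raw, num_channels=2):
        print(frame)
```

Planar sample formats, where each channel lives in its own plane, would need a different indexing scheme and are not covered here.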
Summary: Add Sink class that bundles FilterGraph and Buffer. Part of pytorch#1986. Splitting the PR for easier review. For the overall architecture, see https://github.com/mthrok/audio/blob/ffmpeg/torchaudio/csrc/ffmpeg/README.md. Note: Without a change to build process, the code added here won't be compiled. The build process will be updated later. Pull Request resolved: pytorch#2111 Reviewed By: carolineechen Differential Revision: D33350388 Pulled By: mthrok fbshipit-source-id: 8f42c5fe4be39ef2432c51fc0d0ac72ba3f06a26
Summary: Part of pytorch#1986. Splitting the PR for easier review. Add StreamProcessor class that bundles `Buffer`, `FilterGraph` and `Decoder`. Note: The API to retrieve the buffered Tensors is tentative. For the overall architecture, see https://github.com/mthrok/audio/blob/ffmpeg/torchaudio/csrc/ffmpeg/README.md. Note: Without a change to build process, the code added here won't be compiled. The build process will be updated later. Needs to be imported after pytorch#2044. Pull Request resolved: pytorch#2045 Reviewed By: carolineechen Differential Revision: D33299858 Pulled By: mthrok fbshipit-source-id: d85bececed475f45622743f137dd59cb1390ceed
Summary: Part of pytorch#1986. Splitting the PR for easier review. Add `Streamer` class that bundles `StreamProcessor` and handles input. For the overall architecture, see https://github.com/mthrok/audio/blob/ffmpeg/torchaudio/csrc/ffmpeg/README.md. Note: Without a change to build process, the code added here won't be compiled. The build process will be updated later. Needs to be imported after pytorch#2045. Pull Request resolved: pytorch#2046 Reviewed By: carolineechen Differential Revision: D33299863 Pulled By: mthrok fbshipit-source-id: 6470cbe061057c8cb970ce7bb5692be04efb5fe9
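The commit summaries describe a layered composition: `Sink` bundles `FilterGraph` and `Buffer`, `StreamProcessor` bundles those with a `Decoder`, and `Streamer` bundles `StreamProcessor`s and handles input. The following Python sketch shows one way that data flow could be wired. Only the class names come from the summaries; every method, field, and the use of several sinks per processor are assumptions for illustration, and the actual classes are C++ with different interfaces.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Decoder:
    """Stand-in: turns one packet into zero or more frames."""
    decode: Callable[[bytes], List[list]]


@dataclass
class FilterGraph:
    """Stand-in: transforms one frame (for example resampling or rescaling)."""
    apply: Callable[[list], list]


@dataclass
class Buffer:
    """Stand-in: holds processed frames until the caller collects them."""
    frames: List[list] = field(default_factory=list)

    def push(self, frame: list) -> None:
        self.frames.append(frame)

    def pop_all(self) -> List[list]:
        out, self.frames = self.frames, []
        return out


@dataclass
class Sink:
    """Bundles a FilterGraph and a Buffer: one output configuration."""
    filter_graph: FilterGraph
    buffer: Buffer = field(default_factory=Buffer)

    def process(self, frame: list) -> None:
        self.buffer.push(self.filter_graph.apply(frame))


@dataclass
class StreamProcessor:
    """Bundles a Decoder with the sinks fed by one input stream."""
    decoder: Decoder
    sinks: List[Sink] = field(default_factory=list)

    def process_packet(self, packet: bytes) -> None:
        for frame in self.decoder.decode(packet):
            for sink in self.sinks:  # same decoded frame, different outputs
                sink.process(frame)


@dataclass
class Streamer:
    """Routes each packet to the processor for its stream index."""
    processors: Dict[int, StreamProcessor] = field(default_factory=dict)

    def process_packet(self, stream_index: int, packet: bytes) -> None:
        processor = self.processors.get(stream_index)
        if processor is not None:  # packets of unselected streams are dropped
            processor.process_packet(packet)


if __name__ == "__main__":
    passthrough = Sink(FilterGraph(lambda frame: frame))
    decimated = Sink(FilterGraph(lambda frame: frame[::2]))
    streamer = Streamer(
        {0: StreamProcessor(Decoder(lambda pkt: [list(pkt)]), [passthrough, decimated])}
    )
    streamer.process_packet(0, bytes([1, 2, 3, 4]))
    print(passthrough.buffer.pop_all(), decimated.buffer.pop_all())
```

Attaching two sinks to one processor is how this sketch models "duplicate input streams for different output configurations": the packet is decoded once and each sink applies its own filter.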
Summary: Part of pytorch#1986. Splitting the PR for easier review. Add `Streamer` TorchBind. For the overall architecture, see https://github.com/mthrok/audio/blob/ffmpeg/torchaudio/csrc/ffmpeg/README.md. Note: Without a change to build process, the code added here won't be compiled. The build process will be updated later. Needs to be imported after pytorch#2046. Pull Request resolved: pytorch#2047 Reviewed By: hwangjeff Differential Revision: D33355190 Pulled By: mthrok fbshipit-source-id: a3ad4c2822ed3a7ddc19b1aaca9dddabd59ce2f8