-
Notifications
You must be signed in to change notification settings - Fork 684
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add Buffer class #2044
Add Buffer class #2044
Conversation
314bbeb
to
fdeab57
Compare
@mthrok has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator. |
std::deque<torch::Tensor> chunks; | ||
AVMediaType media_type; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shall we add reference to the variables?
std::deque<torch::Tensor> chunks; | |
AVMediaType media_type; | |
std::deque<torch::Tensor>& chunks; | |
AVMediaType& media_type; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's just an enum, so it's cheap and safer to copy.
case AVMEDIA_TYPE_AUDIO: | ||
push_audio_frame(frame); | ||
break; | ||
case AVMEDIA_TYPE_VIDEO: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Where are AVMEDIA_TYPE_AUDIO
and AVMEDIA_TYPE_VIDEO
defined? Or it's taken care by ffmpeg?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
torch::Tensor convert_audio_tensor(AVFrame* pFrame) { | ||
// ref: https://ffmpeg.org/doxygen/4.1/filter__audio_8c_source.html#l00215 | ||
AVSampleFormat format = static_cast<AVSampleFormat>(pFrame->format); | ||
int num_channels = pFrame->channels; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I feel like num_channels
can be defined when initializing the Buffer object? As each Buffer object will receive signal from one Streamer, which fixed the num_channels after initialization.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree with you, but unfortunately the shape information is not necessarily available at the time Buffer object is initialized, due to a filter graph operation inserted before the buffer, which can change number of audio channels, image size, image channels and rate. The structure passed around does not have the explicit information of these.
Let me figure this out later. along with the buffer size concern you brought up.
} // namespace | ||
|
||
void Buffer::push_audio_frame(AVFrame* pFrame) { | ||
chunks.push_back(convert_audio_tensor(pFrame)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we want to check the size of chunks
so that if it exceeds the buffer size, we might want to pop out the last chunk?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, let me follow-up this one. I am still figuring out a safe way to process chunks of different rates at the same time. (sample rate / frame rate)
fdeab57
to
1b44c5e
Compare
@mthrok has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator. |
Summary: Part of #1986. Splitting the PR for easier review. Add StreamProcessor class that bundles `Buffer`, `FilterGraph` and `Decoder`. Note: The API to retrieve the buffered Tensors is tentative. For the overall architecture, see https://github.com/mthrok/audio/blob/ffmpeg/torchaudio/csrc/ffmpeg/README.md. Note: Without a change to build process, the code added here won't be compiled. The build process will be updated later. Needs to be imported after #2044. Pull Request resolved: #2045 Reviewed By: carolineechen Differential Revision: D33299858 Pulled By: mthrok fbshipit-source-id: d85bececed475f45622743f137dd59cb1390ceed
Summary: Part of pytorch#1986. Splitting the PR for easier review. Add Buffer class that is responsible for converting `AVFrame` to `Tensor`. Note: The API to retrieve the buffered Tensors is tentative. For the overall architecture, see https://github.com/mthrok/audio/blob/ffmpeg/torchaudio/csrc/ffmpeg/README.md. Note: Without a change to build process, the code added here won't be compiled. The build process will be updated later. Needs to be imported after pytorch#2043. Pull Request resolved: pytorch#2044 Reviewed By: carolineechen Differential Revision: D32940553 Pulled By: mthrok fbshipit-source-id: 8b8b2222ad7b47edc17e9139420e8a71c00d726a
Summary: Part of pytorch#1986. Splitting the PR for easier review. Add StreamProcessor class that bundles `Buffer`, `FilterGraph` and `Decoder`. Note: The API to retrieve the buffered Tensors is tentative. For the overall architecture, see https://github.com/mthrok/audio/blob/ffmpeg/torchaudio/csrc/ffmpeg/README.md. Note: Without a change to build process, the code added here won't be compiled. The build process will be updated later. Needs to be imported after pytorch#2044. Pull Request resolved: pytorch#2045 Reviewed By: carolineechen Differential Revision: D33299858 Pulled By: mthrok fbshipit-source-id: d85bececed475f45622743f137dd59cb1390ceed
Part of #1986. Splitting the PR for easier review.
Add Buffer class that is responsible for converting
AVFrame
toTensor
.Note: The API to retrieve the buffered Tensors is tentative.
For the overall architecture, see https://github.com/mthrok/audio/blob/ffmpeg/torchaudio/csrc/ffmpeg/README.md.
Note: Without a change to build process, the code added here won't be compiled. The build process will be updated later.
Needs to be imported after #2043.