Return duration of an Mp4 when possible + fix loop option
#890
}
}
if has_audio {
offset = Duration::from_nanos(last_audio_sample_pts.load(Ordering::Relaxed));
If an audio track is present, use it specifically. Using the last video sample, or the last sample in general, leads to audio artifacts at the start of a new track.
Just so I'm sure I understand this correctly: if we're looping audio and video tracks and we offset the pts of the new loop iteration by the last video frame pts, there are audio artifacts at the beginning of the new iteration? And that's probably caused by audio pts being more granular than video pts, i.e. there are more audio frames than video frames per unit of time?
Buffers between queue<->decoder and decoder-input are sized in number of chunks. If chunks (or frame/sample batches) do not represent similar durations, then the timestamps for audio and video here can differ significantly.
For the mp4 (15 fps) I tested, audio processing was behind the video by over 2 seconds, because the buffer can fit a lot more video at low fps.
To address that, we would have to use channels between elements that buffer based on media duration instead.
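To make the arithmetic concrete, here is a rough sketch (not code from this PR) of how much media time a chunk-counted buffer holds. The helper names, the capacity of 30 chunks, and the 1024-sample / 48 kHz audio chunk are all assumptions picked for illustration; only the 15 fps figure comes from the file mentioned above.

```rust
use std::time::Duration;

/// Media time a queue can hold when its capacity is counted in chunks.
/// (Hypothetical helper, not part of the codebase.)
fn buffered_media_time(capacity_chunks: u32, chunk_duration: Duration) -> Duration {
    chunk_duration * capacity_chunks
}

/// Duration of one video frame at a constant frame rate.
/// Panics if `fps` is 0, which is acceptable for a sketch.
fn video_chunk_duration(fps: u32) -> Duration {
    Duration::from_secs(1) / fps
}

/// Duration of one audio chunk holding `samples` samples at `sample_rate` Hz.
fn audio_chunk_duration(samples: u64, sample_rate: u64) -> Duration {
    Duration::from_nanos(samples * 1_000_000_000 / sample_rate)
}
```

With an assumed capacity of 30 chunks, `buffered_media_time(30, video_chunk_duration(15))` is roughly 2 s, while `buffered_media_time(30, audio_chunk_duration(1024, 48_000))` is roughly 0.64 s, so the same chunk count lets the two tracks drift apart by more than a second.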
oh, that makes a lot of sense.
sample_count: track.sample_count(),
timescale: track.timescale(),
track_id,
duration: track.duration(),
There is also a duration on the reader; not sure if that is something we should use.
The duration of the reader is calculated from a part of the mp4, which contains "overall information which is media-independent, and relevant to the entire presentation considered as a whole", and the duration of the track is specific to the track. Not sure which one we should use either. Maybe the reader duration should theoretically be used when looping the mp4, but I don't think it's worth changing the looping code to use it instead.
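For context, both kinds of duration are stored in an mp4 as a tick count together with a timescale (ticks per second), and the movie-level header and each track's media header carry their own timescale. Below is a minimal sketch of that conversion. `ticks_to_duration` is a hypothetical helper written for this comment, not an API of the mp4 reader used in this PR, which may already do this internally.

```rust
use std::time::Duration;

/// Converts a tick count expressed in `timescale` ticks per second
/// into a `Duration`. Returns `None` for a zero timescale or when
/// the result does not fit in 64-bit nanoseconds.
/// (Hypothetical helper for illustration only.)
fn ticks_to_duration(ticks: u64, timescale: u32) -> Option<Duration> {
    if timescale == 0 {
        return None;
    }
    // Widen to u128 so the multiplication cannot overflow.
    let nanos = u128::from(ticks) * 1_000_000_000 / u128::from(timescale);
    if nanos > u128::from(u64::MAX) {
        None
    } else {
        Some(Duration::from_nanos(nanos as u64))
    }
}
```

Because the two headers can use different timescales, the raw tick counts are not directly comparable; they need to be converted like this before deciding which one to report.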
Looks good
LGTM
The old implementation would cause audio and video desync when the tracks are not exactly the same length. This implementation (if loop is provided) aborts both tracks when one of them ends and restarts them.
Return the duration of the video and audio tracks (might be incorrect for some mp4 files).
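The restart behaviour described above could be sketched roughly as follows. This is an illustration under assumptions, not the code in this PR: `next_loop_offset` and `shift_pts` are hypothetical names, `last_video_sample_pts` is an assumed counterpart to the `last_audio_sample_pts` atomic seen in the diff, and the stored values are assumed to be nanoseconds that already include the offsets of earlier iterations.

```rust
use std::sync::atomic::{AtomicU64, Ordering};
use std::time::Duration;

/// Picks the pts offset for the next loop iteration.
/// Prefers the audio track when one is present, falling back to video.
/// (Hypothetical helper; names other than `last_audio_sample_pts` are assumptions.)
fn next_loop_offset(
    has_audio: bool,
    last_audio_sample_pts: &AtomicU64,
    last_video_sample_pts: &AtomicU64,
) -> Duration {
    let nanos = if has_audio {
        last_audio_sample_pts.load(Ordering::Relaxed)
    } else {
        last_video_sample_pts.load(Ordering::Relaxed)
    };
    Duration::from_nanos(nanos)
}

/// Shifts a sample's pts (relative to the start of the file)
/// by the offset of the current loop iteration.
fn shift_pts(pts: Duration, offset: Duration) -> Duration {
    pts + offset
}
```

Since both tracks are restarted together, the same offset would be applied to audio and video samples of the new iteration, which keeps them aligned even when the tracks differ in length.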