Return duration of an Mp4 when possible + fix loop option #890

Merged
merged 1 commit into from
Dec 9, 2024

@wkozyra95 wkozyra95 commented Dec 6, 2024

The old implementation would cause audio and video desync when the tracks are not exactly the same length. This implementation (if the loop option is provided) aborts both tracks when one of them ends and restarts them.

It also returns the duration of the video and audio tracks (this might be incorrect for some mp4 files).

@wkozyra95 wkozyra95 force-pushed the @wkozyra95/measure-mp4-length branch from 9f8d46e to c1c0754 Compare December 6, 2024 10:56
@wkozyra95 wkozyra95 self-assigned this Dec 6, 2024
@wkozyra95 wkozyra95 force-pushed the @wkozyra95/measure-mp4-length branch from c1c0754 to 6afc4a3 Compare December 6, 2024 11:24
@wkozyra95 wkozyra95 marked this pull request as ready for review December 6, 2024 11:30
}
}
if has_audio {
offset = Duration::from_nanos(last_audio_sample_pts.load(Ordering::Relaxed));
Member Author

If an audio track is present, use it specifically. Using the last video sample, or the last sample in general, leads to audio artifacts at the start of a new track.
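For readers skimming the thread, here is a minimal sketch of the offset choice described above. Only `has_audio`, `last_audio_sample_pts` and the `Duration::from_nanos(...load(Ordering::Relaxed))` expression come from the diff context; the function name `next_loop_offset`, the `last_video_sample_pts` counter and the video fallback branch are assumptions made for illustration and may not match the merged code.

```rust
use std::sync::atomic::{AtomicU64, Ordering};
use std::time::Duration;

/// Hypothetical helper: picks the PTS offset applied to the next loop iteration.
/// Both counters are assumed to hold the last seen sample PTS in nanoseconds.
fn next_loop_offset(
    has_audio: bool,
    last_audio_sample_pts: &AtomicU64,
    last_video_sample_pts: &AtomicU64, // assumed name, not shown in the diff
) -> Duration {
    let nanos = if has_audio {
        // Prefer the audio track so the next iteration starts exactly
        // where the previous audio ended.
        last_audio_sample_pts.load(Ordering::Relaxed)
    } else {
        // Assumed fallback for video-only files.
        last_video_sample_pts.load(Ordering::Relaxed)
    };
    Duration::from_nanos(nanos)
}
```

With an audio track ending at 2 s and a video track ending slightly earlier, this returns 2 s; without audio it falls back to the video timestamp.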

Collaborator

Just so I'm sure I understand this correctly: if we're looping audio and video tracks and we offset the pts of the new loop iteration by the last video frame pts, there are audio artifacts at the beginning of the new iteration? And that's probably caused by audio pts being more granular than video pts? I mean that there are more audio frames than video frames in a time unit?

Member Author

@wkozyra95 wkozyra95 Dec 9, 2024

Buffers between queue<->decoder and decoder<->input are sized in number of chunks. If chunks (or frame/sample batches) do not represent a similar duration, then the timestamps for audio and video here can differ significantly.

For the mp4 (15 fps) I tested, audio processing was behind the video by over 2 seconds, because the buffer can fit a lot more video at low fps.

To address that, we would have to use channels between elements that buffer based on media duration instead.
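To make the idea of buffering by media duration concrete, here is a rough sketch of a queue capped by buffered duration rather than chunk count. Everything in it (`Chunk`, `DurationBoundedBuffer`, `try_push`, `pop`, the 200 ms limit used below) is hypothetical and only illustrates the concept; it is not the channel implementation used in this repository.

```rust
use std::collections::VecDeque;
use std::time::Duration;

/// Hypothetical chunk type: only the media duration matters for this sketch.
struct Chunk {
    duration: Duration,
}

/// Buffer whose capacity is a total media duration, not a number of chunks.
struct DurationBoundedBuffer {
    chunks: VecDeque<Chunk>,
    buffered: Duration,
    max_buffered: Duration,
}

impl DurationBoundedBuffer {
    fn new(max_buffered: Duration) -> Self {
        Self {
            chunks: VecDeque::new(),
            buffered: Duration::ZERO,
            max_buffered,
        }
    }

    /// Accepts the chunk, or hands it back if it would exceed the limit.
    fn try_push(&mut self, chunk: Chunk) -> Result<(), Chunk> {
        // An empty buffer always accepts one chunk, so a single oversized
        // chunk cannot stall the pipeline.
        if !self.chunks.is_empty() && self.buffered + chunk.duration > self.max_buffered {
            return Err(chunk);
        }
        self.buffered += chunk.duration;
        self.chunks.push_back(chunk);
        Ok(())
    }

    /// Removes the oldest chunk and releases its share of the budget.
    fn pop(&mut self) -> Option<Chunk> {
        let chunk = self.chunks.pop_front()?;
        self.buffered -= chunk.duration;
        Some(chunk)
    }
}
```

With a 200 ms limit, such a buffer would hold two 100 ms video chunks or ten 20 ms audio chunks, so both tracks stay roughly the same distance ahead regardless of chunk size.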

Collaborator

oh, that makes a lot of sense.

sample_count: track.sample_count(),
timescale: track.timescale(),
track_id,
duration: track.duration(),
Member Author

There is also a duration on the reader; I'm not sure whether that is something we should use instead.

Collaborator

The duration of the reader is calculated from a part of the mp4, which contains "overall information which is media-independent, and relevant to the entire presentation considered as a whole", and the duration of the track is specific to the track. Not sure which one we should use either. Maybe the reader duration should theoretically be used when looping the mp4, but I don't think it's worth changing the looping code to use it instead.
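As background for the two durations discussed here: in the ISO base media file format, both the movie header (`mvhd`) and each track's media header (`mdhd`) store a duration as an integer count of units of their own timescale (units per second). The sketch below shows that conversion only. The helper name `units_to_duration` is made up for this example, and whether the library's `duration()` accessors already perform this conversion is not something this thread confirms.

```rust
use std::time::Duration;

/// Hypothetical helper: converts a duration expressed in timescale units
/// (units per second) into a `Duration`. Returns `None` for a zero
/// timescale, which would otherwise divide by zero.
fn units_to_duration(units: u64, timescale: u32) -> Option<Duration> {
    if timescale == 0 {
        return None;
    }
    let timescale = u64::from(timescale);
    let secs = units / timescale;
    // The remainder is smaller than the timescale (< 2^32), so multiplying
    // by 1e9 cannot overflow a u64, and the result is below one second.
    let nanos = (units % timescale) * 1_000_000_000 / timescale;
    Some(Duration::new(secs, nanos as u32))
}
```

For example, 180 000 units at a timescale of 90 000 is 2 s, and 72 000 units at 48 000 is 1.5 s. Because each track carries its own timescale and duration, the per-track values can differ from each other and from the movie-level value.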

Collaborator

@jerzywilczek jerzywilczek left a comment

Looks good

@wkozyra95 wkozyra95 merged commit a6d9b7d into master Dec 9, 2024
5 checks passed
@wkozyra95 wkozyra95 deleted the @wkozyra95/measure-mp4-length branch December 9, 2024 14:00
Member

@WojciechBarczynski WojciechBarczynski left a comment

LGTM
