fix: segment_callaback logic #205

Merged 1 commit on Feb 26, 2025

Conversation

newfla
Contributor

@newfla newfla commented Feb 25, 2025

Starting from v0.13.0 set_segment_callback_safe_lossy and set_segment_callback_safe (due to internal changes from whisper.cpp).
The PR has been tested on both single-segment and multi-segment audio files.
@tazz4843 a new crates release would be very welcome :D

@tazz4843
Owner

Your PR comment message got cut off at the end I believe so I'm not entirely sure what this PR is for.

It still looks like upstream is calling the segment callback with n_new: https://github.com/ggerganov/whisper.cpp/blob/8a9ad7844d6e2a10cddf4b92de4089d7ac2b14a9/src/whisper.cpp#L6167

If the behaviour was broken then whoops but it looks correct here unless I'm misunderstanding.

@newfla
Contributor Author

newfla commented Feb 26, 2025

Hi @tazz4843, you're right: I forgot to paste my evidence.
Here are the results of running 1.12.0 and 1.14.2 (plus a dump of the n_new and n_segment values):

1.12.0

n_new n_segment
1     1
1     2
1     3
1     4
1     5
1     6
1     7

SegmentCallbackData { segment: 0, start_timestamp: 0, end_timestamp: 500, text: " The little tales they kill are false. The door was barred, locked and bolted as well." }
SegmentCallbackData { segment: 1, start_timestamp: 500, end_timestamp: 900, text: " Right pairs are fit for a queen's table. A big, quick stain went on around the carpet." }
SegmentCallbackData { segment: 2, start_timestamp: 900, end_timestamp: 1400, text: " The kites dipped in suede that stayed aloft. The present hell is fly by a moustache to turn." }
SegmentCallbackData { segment: 3, start_timestamp: 1400, end_timestamp: 1800, text: " The room was crowded with a mild warm. The room was crowded with a wild mob." }
SegmentCallbackData { segment: 4, start_timestamp: 1800, end_timestamp: 2000, text: " This strong arm shall shield your honor." }
SegmentCallbackData { segment: 5, start_timestamp: 2000, end_timestamp: 2500, text: " She brushed when she gave her a white okey. The beetle droned and locked June's hand." }
SegmentCallbackData { segment: 6, start_timestamp: 2500, end_timestamp: 3500, text: " [BLANK_AUDIO]" }

1.14.2

n_new n_segment
0     7
1     7
2     7
3     7
4     7
5     7
6     7

SegmentCallbackData { segment: 6, start_timestamp: 2200, end_timestamp: 2500, text: " The beetle droned and locked June's hand." }
SegmentCallbackData { segment: 5, start_timestamp: 2000, end_timestamp: 2200, text: " She brushed when she gave her a white okey." }
SegmentCallbackData { segment: 6, start_timestamp: 2200, end_timestamp: 2500, text: " The beetle droned and locked June's hand." }
SegmentCallbackData { segment: 4, start_timestamp: 1800, end_timestamp: 2000, text: " This strong arm shall shield your honor." }
SegmentCallbackData { segment: 5, start_timestamp: 2000, end_timestamp: 2200, text: " She brushed when she gave her a white okey." }
SegmentCallbackData { segment: 6, start_timestamp: 2200, end_timestamp: 2500, text: " The beetle droned and locked June's hand." }
SegmentCallbackData { segment: 3, start_timestamp: 1400, end_timestamp: 1800, text: " The room was crowded with a mild warm. The room was crowded with a wild mob." }
SegmentCallbackData { segment: 4, start_timestamp: 1800, end_timestamp: 2000, text: " This strong arm shall shield your honor." }
SegmentCallbackData { segment: 5, start_timestamp: 2000, end_timestamp: 2200, text: " She brushed when she gave her a white okey." }
SegmentCallbackData { segment: 6, start_timestamp: 2200, end_timestamp: 2500, text: " The beetle droned and locked June's hand." }
SegmentCallbackData { segment: 2, start_timestamp: 900, end_timestamp: 1400, text: " The kites dipped in suede that stayed aloft. The present hell is fly by a moustache to turn." }
SegmentCallbackData { segment: 3, start_timestamp: 1400, end_timestamp: 1800, text: " The room was crowded with a mild warm. The room was crowded with a wild mob." }
SegmentCallbackData { segment: 4, start_timestamp: 1800, end_timestamp: 2000, text: " This strong arm shall shield your honor." }
SegmentCallbackData { segment: 5, start_timestamp: 2000, end_timestamp: 2200, text: " She brushed when she gave her a white okey." }
SegmentCallbackData { segment: 6, start_timestamp: 2200, end_timestamp: 2500, text: " The beetle droned and locked June's hand." }
SegmentCallbackData { segment: 1, start_timestamp: 500, end_timestamp: 900, text: " Right pairs are fit for a queen's table. A big, quick stain went on around the carpet." }
SegmentCallbackData { segment: 2, start_timestamp: 900, end_timestamp: 1400, text: " The kites dipped in suede that stayed aloft. The present hell is fly by a moustache to turn." }
SegmentCallbackData { segment: 3, start_timestamp: 1400, end_timestamp: 1800, text: " The room was crowded with a mild warm. The room was crowded with a wild mob." }
SegmentCallbackData { segment: 4, start_timestamp: 1800, end_timestamp: 2000, text: " This strong arm shall shield your honor." }
SegmentCallbackData { segment: 5, start_timestamp: 2000, end_timestamp: 2200, text: " She brushed when she gave her a white okey." }
SegmentCallbackData { segment: 6, start_timestamp: 2200, end_timestamp: 2500, text: " The beetle droned and locked June's hand." }

From what I understand, n_new now indicates the id of the last segment produced by the model, so we no longer need to iterate over s0..n_segments.
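To make the duplication in the 1.14.2 log easier to follow, here is a minimal sketch that replays the dumped (n_new, n_segment) pairs without calling whisper.cpp at all. It assumes the pre-fix loop ran over `s0..n_segments` with `s0 = n_segments - n_new`, and that the post-fix behaviour reports only the segment whose id is `n_new`, as described above. The function names `old_reported` and `new_reported` are hypothetical and are not taken from the merged commit.

```rust
/// Segment indices the assumed pre-fix loop reports for one callback invocation:
/// everything in `s0..n_segments`, where `s0 = n_segments - n_new`.
fn old_reported(n_new: usize, n_segments: usize) -> Vec<usize> {
    let s0 = n_segments.saturating_sub(n_new);
    (s0..n_segments).collect()
}

/// Segment index reported under the reading given in this thread:
/// `n_new` is treated as the id of the last segment produced.
/// (Hypothetical helper, for illustration only.)
fn new_reported(n_new: usize) -> usize {
    n_new
}

fn main() {
    // (n_new, n_segment) pairs copied from the 1.14.2 dump in this thread.
    let dump: Vec<(usize, usize)> = (0..7).map(|n_new| (n_new, 7)).collect();

    for &(n_new, n_segments) in &dump {
        println!(
            "n_new={} n_segments={} old={:?} new={}",
            n_new,
            n_segments,
            old_reported(n_new, n_segments),
            new_reported(n_new)
        );
    }
}
```

Under these assumptions the old loop reports nothing for the first call, then segment 6, then 5 and 6, and so on, for 21 reports in total, which is the pattern in the 1.14.2 log. With the 1.12.0 pairs (n_new always 1, n_segment growing from 1 to 7) the same loop reports each segment exactly once.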

@tazz4843
Owner

Oh yeah that makes a lot more sense. Thanks for the comment and PR then :)

@tazz4843 tazz4843 merged commit e059748 into tazz4843:master Feb 26, 2025
13 checks passed