Describe the bug
I ran Chapter 8 with my own data set (a different set of YouTube videos), and it triggered an error (IndexError: list index out of range) in transcript_enrich_bucket.py. Please see the stack trace below.
I also have a fix for this bug; please see the MR.
To Reproduce
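Steps to reproduce the behavior: run transcript_enrich_bucket.py on a transcript folder built from a different set of YouTube videos, using the command shown in the stack trace.
Stack trace: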
(.venv) PS C:\work\microsoft\generative-ai-for-beginners\08-building-search-applications\scripts> python transcript_enrich_bucket.py --verbose -f $TRANSCRIPT_FOLDER -m $TRANSCRIPT_BUCKET_MINUTES
DEBUG:main:Transcription folder: transcripts_sick
DEBUG:main:Segment length 3 minutes
Enriching Buckets... ---------------------------------------- 0% -:--:--
DEBUG:main:Processing file: transcripts_sick-7ckbQAqhe4.json.vtt
Enriching Buckets... ---------------------------------------- 0% -:--:--
Traceback (most recent call last):
  File "C:\work\microsoft\generative-ai-for-beginners\08-building-search-applications\scripts\transcript_enrich_bucket.py", line 218, in <module>
    get_transcript(meta)
  File "C:\work\microsoft\generative-ai-for-beginners\08-building-search-applications\scripts\transcript_enrich_bucket.py", line 203, in get_transcript
    parse_json_vtt_transcript(vtt, metadata)
  File "C:\work\microsoft\generative-ai-for-beginners\08-building-search-applications\scripts\transcript_enrich_bucket.py", line 175, in parse_json_vtt_transcript
    previous_segment_tokens = len(tokenizer.encode(segments[-1]["text"]))
                                                   ~~~~~~~~^^^^
IndexError: list index out of range
Expected behavior
transcript_enrich_bucket.py should process these transcripts without raising an IndexError.
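The traceback shows segments[-1] being evaluated while segments is still empty. The snippet below is a minimal sketch of one possible guard, not necessarily the change made in the MR. The helper names count_tokens and previous_segment_token_count are hypothetical, and the whitespace-based token count is only a stand-in for len(tokenizer.encode(...)) so that the example runs without the real tokenizer.

```python
def count_tokens(text):
    # Assumption: stand-in for len(tokenizer.encode(text)) in the real script.
    return len(text.split())


def previous_segment_token_count(segments):
    """Return the token count of the last segment, or 0 if no segment exists yet."""
    if not segments:
        # segments[-1] raises IndexError on an empty list, so handle that case first.
        return 0
    return count_tokens(segments[-1]["text"])


# An empty list no longer raises; a populated list behaves as before.
print(previous_segment_token_count([]))
print(previous_segment_token_count([{"text": "hello world again"}]))
```

Returning 0 for an empty list treats "no previous segment" as "nothing to merge with", which seems the least surprising default. Whether that matches the bucketing logic in the script is an assumption.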