Speech to text - Hearings #539

Open
mvictor55 opened this issue Jun 7, 2022 · 1 comment

@mvictor55
Collaborator

mvictor55 commented Jun 7, 2022

We should try to get transcripts of the hearings, such as https://malegislature.gov/Events/Hearings/Detail/4292

I see two ways to do this:

  1. Run our own speech-to-text software on the hearing video files, or
  2. Scrape the closed captions from the hearing video streams.

I hope we can add a "hearings" section to each bill page that would let a user read the transcript and watch the hearing.

Ideally, we'd be able to break the hearing down into its key parts, as is done here: https://sg001-harmony.sliq.net/00329/Harmony/en/PowerBrowser/PowerBrowserV2/20220607/4/1913#agenda_
Many legislature websites are structured in this format. I think they're using "Sliq Media - Harmony" software: https://www.sliq.com/northDakotaCaseStudy.html

From Mitre session:
19:10 in the video; again at 59:30
https://drive.google.com/file/d/1D3olHFsxxk1PZhQTfus_cja_cUrI9qeS/view?usp=share_link

UPDATE 10/24/23: this ticket was created over a year ago. LLM tech now seems to provide much better solutions.

@alexjball
Member

I looked into Vosk, open-source speech-to-text software, and Tesseract, open-source OCR (image-to-text) software. Using Vosk, we can generate a subtitle track for the hearing videos, and we can OCR individual frames with Tesseract. Both are quite inaccurate, though, and I'm not sure either would add value as an accessibility tool.
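A minimal sketch of the subtitle-track step, assuming the hearing audio has already been extracted to a mono 16-bit PCM WAV file (for example with ffmpeg) and a Vosk model has been downloaded. The function names, the `words_per_cue` parameter, and the file paths are hypothetical and not part of this project. Only `transcribe_wav` needs the third-party `vosk` package; the SRT formatting is plain Python.

```python
import json
import wave


def format_timestamp(seconds: float) -> str:
    """Format a time in seconds as an SRT timestamp (HH:MM:SS,mmm)."""
    millis = int(round(seconds * 1000))
    hours, millis = divmod(millis, 3_600_000)
    minutes, millis = divmod(millis, 60_000)
    secs, millis = divmod(millis, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{millis:03d}"


def words_to_srt(words, words_per_cue=7):
    """Group word dicts with "word", "start" and "end" keys into SRT cues."""
    cues = []
    for i in range(0, len(words), words_per_cue):
        chunk = words[i:i + words_per_cue]
        start = format_timestamp(chunk[0]["start"])
        end = format_timestamp(chunk[-1]["end"])
        text = " ".join(w["word"] for w in chunk)
        cues.append(f"{len(cues) + 1}\n{start} --> {end}\n{text}\n")
    return "\n".join(cues)


def transcribe_wav(wav_path, model_dir):
    """Return word-level timings for a mono 16-bit PCM WAV file using Vosk."""
    # Imported here so the formatting helpers work without vosk installed.
    from vosk import KaldiRecognizer, Model

    words = []
    with wave.open(wav_path, "rb") as wf:
        rec = KaldiRecognizer(Model(model_dir), wf.getframerate())
        rec.SetWords(True)  # ask Vosk to include per-word start/end times
        while True:
            data = wf.readframes(4000)
            if not data:
                break
            if rec.AcceptWaveform(data):
                words.extend(json.loads(rec.Result()).get("result", []))
        words.extend(json.loads(rec.FinalResult()).get("result", []))
    return words
```

Something like `words_to_srt(transcribe_wav("hearing.wav", "model"))` would then produce the subtitle text. Accuracy is still limited by the model, so this does not address the quality concern above.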

Google has a Video Intelligence API that we could use to detect text and transcribe audio in video. It provides 1000 free minutes per month; after that it would cost $0.20/min. I haven't had a chance to test it yet.
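As a rough budgeting aid, the arithmetic can be sketched as below. It assumes only the two figures quoted in this comment (1000 free minutes per month, then $0.20 per minute), which have not been checked against Google's current pricing. The function name is hypothetical, and amounts are kept in whole cents to avoid floating-point rounding.

```python
FREE_MINUTES_PER_MONTH = 1000  # free tier as quoted in this comment
CENTS_PER_MINUTE = 20          # $0.20 per minute beyond the free tier


def monthly_cost_cents(minutes_processed: int) -> int:
    """Estimated monthly cost in cents for the given minutes of video."""
    billable = max(0, minutes_processed - FREE_MINUTES_PER_MONTH)
    return billable * CENTS_PER_MINUTE
```

For example, 1500 minutes of hearings in a month would leave 500 billable minutes, or $100.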

We could potentially use the auto-transcription for searching hearings. Users would search for keywords and we would return hearings and timecodes where they appear.
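A minimal sketch of that search, assuming the auto-transcription is stored as timestamped text segments. The `Segment` type, its field names, and `search_hearings` are hypothetical; a real version would likely sit on top of whatever search index the site already uses.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    hearing_id: str       # identifier of the hearing the segment belongs to
    start_seconds: float  # timecode where the segment begins
    text: str             # auto-transcribed text for the segment


def search_hearings(segments, keyword):
    """Map each matching hearing ID to the timecodes where the keyword appears."""
    needle = keyword.casefold()
    hits = {}
    for seg in segments:
        # Case-insensitive substring match; no stemming or fuzzy matching.
        if needle in seg.text.casefold():
            hits.setdefault(seg.hearing_id, []).append(seg.start_seconds)
    return hits
```

Because the transcripts are noisy, exact substring matching will miss misrecognized words; fuzzy matching could help but is outside this sketch.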

The ideal solution would be to get the transcription text straight from the State House.

https://alphacephei.com/vosk/
https://github.com/tesseract-ocr/tesseract
