[investigate/prototype] speech_to_text_generation_service approach 1: Define a Docker container for running open source Whisper in a container that we define and for which we manage deployment (lives in this repo?)
#3
Keep in mind the guidelines at the top of #1, in particular the one about ultimately needing to use Terraform for deployment, and maybe the one about Whisper configuration, if considering an implementation that would make re-configuration difficult.
Setting the entry point to an invocation script can indeed control how it runs, so e.g. the invocation script takes parameters (not exactly sure how to pass those to an ECS task, but probably via env vars?); see the rough sketch below.
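For illustration, here's a minimal sketch of such an entrypoint, assuming the Dockerfile ends with something like `ENTRYPOINT ["python", "/app/entrypoint.py"]`. All the names here (`MEDIA_BUCKET`, `INPUT_KEY`, `WHISPER_MODEL`, the bucket layout) are hypothetical placeholders, not decisions:

```python
#!/usr/bin/env python3
"""Hypothetical container entrypoint: read job parameters from env vars,
fetch the media file from S3, and shell out to the whisper CLI."""
import os
import subprocess

import boto3  # assumes the task role grants S3 access

bucket = os.environ["MEDIA_BUCKET"]             # env var names are illustrative
key = os.environ["INPUT_KEY"]
model = os.environ.get("WHISPER_MODEL", "base")

local_path = f"/tmp/{os.path.basename(key)}"
boto3.client("s3").download_file(bucket, key, local_path)

# Whisper's CLI shape: whisper <audio> --model <name> --output_dir <dir>
subprocess.run(
    ["whisper", local_path, "--model", model, "--output_dir", "/tmp/out"],
    check=True,
)
```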
Whisper is designed to be run this way from the CLI, so it should be a good fit for this approach (it's also usable as a Python library, but it's primarily a CLI tool; see the library sketch below).
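For comparison, the library interface is also small, which could matter later if we want to keep a model loaded in memory across many files instead of paying the model-load cost on every CLI invocation:

```python
import whisper  # pip install openai-whisper

# Load once, transcribe many times in-process.
model = whisper.load_model("base")
result = model.transcribe("/tmp/audio.mp3")
print(result["text"])
```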
What runs the container? At first glance, the most likely possibility seems to be AWS ECS (Elastic Container Service, for auto-scaling container-based services on demand). Another option would be to spin up and manage AWS EC2 instances more directly. Would that provide any advantage over ECS, given that it's likely more development and management work? Is it more platform agnostic (i.e., does ECS have a GCP counterpart)? A sketch of launching a task via ECS follows.
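To make the ECS option concrete, launching one worker task via boto3 might look roughly like this (cluster, task definition, container name, and subnet are hypothetical; the `environment` overrides are one answer to the parameter-passing question above). Note that `launchType="FARGATE"` is just for brevity here: a GPU-backed Whisper task would likely need the EC2 launch type instead.

```python
import boto3

ecs = boto3.client("ecs")

# All identifiers below are hypothetical placeholders.
ecs.run_task(
    cluster="speech-to-text",
    taskDefinition="whisper-worker",
    launchType="FARGATE",
    networkConfiguration={
        "awsvpcConfiguration": {
            "subnets": ["subnet-0123456789abcdef0"],
            "assignPublicIp": "DISABLED",
        }
    },
    overrides={
        "containerOverrides": [
            {
                "name": "whisper",
                # Per-job parameters passed as env vars, matching the
                # entrypoint sketch above.
                "environment": [
                    {"name": "MEDIA_BUCKET", "value": "sul-speech-to-text"},
                    {"name": "INPUT_KEY", "value": "media/item-1234.mp3"},
                ],
            }
        ]
    },
)
```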
How do we scale/limit the number of containers running at once? This question would seem to nudge us towards ECS or similar; one crude approach is sketched below.
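One crude way to cap concurrency from the invoking side, before calling `run_task`, is to count what's already running; an ECS service with a fixed `desiredCount`, or queue-depth-based auto scaling, would be the more managed versions of the same idea:

```python
import boto3

ecs = boto3.client("ecs")
MAX_WORKERS = 5  # hypothetical cap

running = ecs.list_tasks(cluster="speech-to-text", desiredStatus="RUNNING")
if len(running["taskArns"]) < MAX_WORKERS:
    pass  # safe to call run_task() for another worker (see sketch above)
```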
Should the Docker container always work on one file at a time, should it process the whole current backlog, or should it work from a list of file names?
Asking because we're not sure whether there are efficiency or cost savings to be had by reducing the number of startups/teardowns of our speech-to-text generation container.
Container startup time is not negligible: not huge, but noticeable. Will starting a fresh container for each item in a thousand-item backlog add hours (or days) to the processing time for that batch?
We could split the difference: have the container work off whatever backlog remains (or maybe just the next item) whenever it finishes the file it was invoked for. A rough sketch of that pattern follows.
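A sketch of that split-the-difference pattern, using an S3-as-queue layout like the early prototype mentioned below (the bucket name, the `todo/`/`done/` prefixes, and the model choice are all hypothetical):

```python
import os
import subprocess

import boto3

s3 = boto3.client("s3")
BUCKET = "sul-speech-to-text"  # hypothetical
TODO_PREFIX = "todo/"          # hypothetical "queue" prefix


def process(key):
    """Download one file, run whisper on it, then move it out of todo/."""
    local = f"/tmp/{os.path.basename(key)}"
    s3.download_file(BUCKET, key, local)
    subprocess.run(
        ["whisper", local, "--model", "base", "--output_dir", "/tmp/out"],
        check=True,
    )
    # Move the key to done/ so it isn't picked up again.
    done_key = key.replace(TODO_PREFIX, "done/", 1)
    s3.copy_object(Bucket=BUCKET, CopySource={"Bucket": BUCKET, "Key": key}, Key=done_key)
    s3.delete_object(Bucket=BUCKET, Key=key)


def next_backlog_item():
    """Return one pending key, or None when the backlog is drained."""
    resp = s3.list_objects_v2(Bucket=BUCKET, Prefix=TODO_PREFIX, MaxKeys=1)
    contents = resp.get("Contents", [])
    return contents[0]["Key"] if contents else None


# Process the file we were invoked for, then keep draining the backlog,
# so one container startup is amortized over many files.
process(os.environ["INPUT_KEY"])
while (key := next_backlog_item()) is not None:
    process(key)
```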
Initial work on this used S3 as a queue. After further discussion, however, we decided it might be best to use an actual AWS queueing service like SQS (or SNS), since that would translate better to other cloud providers and should make the integration with the workflow system a bit more comprehensible. I'm going to rework #9 to listen to a queue instead of looking at S3. A minimal worker loop along those lines might look like the sketch below.
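This is only a sketch of the SQS-based worker, assuming a hypothetical queue name (`speech-to-text-todo`) and that each message body identifies one file to transcribe; none of those details are settled:

```python
import boto3


def transcribe(item_id):
    """Stand-in for the whisper invocation sketched in earlier comments."""
    ...


sqs = boto3.client("sqs")
queue_url = sqs.get_queue_url(QueueName="speech-to-text-todo")["QueueUrl"]

while True:
    # Long-poll for up to 20 seconds; an empty reply means no pending work.
    resp = sqs.receive_message(
        QueueUrl=queue_url, MaxNumberOfMessages=1, WaitTimeSeconds=20
    )
    messages = resp.get("Messages", [])
    if not messages:
        break  # or keep polling, if running as a persistent service
    msg = messages[0]
    transcribe(msg["Body"])
    # Delete only after success, so failed work reappears on the queue
    # once the visibility timeout expires.
    sqs.delete_message(QueueUrl=queue_url, ReceiptHandle=msg["ReceiptHandle"])
```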
Note: some naming might change, depending on terminology decisions, see https://github.com/orgs/sul-dlss/projects/65/views/1?pane=issue&itemId=79627337