[investigate/prototype] speech_to_text_generation_service approach 1: Define a Docker container for running open source Whisper in a container that we define and for which we manage deployment (lives in this repo?)
#3
Keep in mind the guidelines at the top of #1, in particular the one about ultimately needing to use Terraform for deployment, and maybe the one about Whisper configuration, if considering an implementation that would make re-configuration difficult.
Setting the entry point to an invocation script can indeed control how it runs, so e.g. the invocation script takes parameters (not exactly sure how to pass those to an ECS task, but probably via env vars?); see the rough sketch below.
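For illustration, here's a minimal sketch of such an entrypoint, assuming the Dockerfile ends with something like `ENTRYPOINT ["python", "/app/entrypoint.py"]`. All the names here (`MEDIA_BUCKET`, `INPUT_KEY`, `WHISPER_MODEL`, the bucket layout) are hypothetical placeholders, not decisions:

```python
#!/usr/bin/env python3
"""Hypothetical container entrypoint: read job parameters from env vars,
fetch the media file from S3, and shell out to the whisper CLI."""
import os
import subprocess

import boto3  # assumes the task role grants S3 access

bucket = os.environ["MEDIA_BUCKET"]             # env var names are illustrative
key = os.environ["INPUT_KEY"]
model = os.environ.get("WHISPER_MODEL", "base")

local_path = f"/tmp/{os.path.basename(key)}"
boto3.client("s3").download_file(bucket, key, local_path)

# Whisper's CLI shape: whisper <audio> --model <name> --output_dir <dir>
subprocess.run(
    ["whisper", local_path, "--model", model, "--output_dir", "/tmp/out"],
    check=True,
)
```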
Whisper is designed to be run this way from the CLI, so it should be a good fit for this approach (it's also usable as a Python library, but it's primarily a CLI tool; see the library sketch below).
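For comparison, the library interface is also small, which could matter later if we want to keep a model loaded in memory across many files instead of paying the model-load cost on every CLI invocation:

```python
import whisper  # pip install openai-whisper

# Load once, transcribe many times in-process.
model = whisper.load_model("base")
result = model.transcribe("/tmp/audio.mp3")
print(result["text"])
```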
What runs the container? At first glance, the most likely possibility seems to be AWS ECS (Elastic Container Service, for auto-scaling container-based services on demand). Another option would be to spin up and manage AWS EC2 instances more directly. Would that provide any advantage over ECS, given that it's likely more development and management work? Is it more platform agnostic (i.e., does ECS have a GCP counterpart)? A sketch of launching a task via ECS follows.
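To make the ECS option concrete, launching one worker task via boto3 might look roughly like this (cluster, task definition, container name, and subnet are hypothetical; the `environment` overrides are one answer to the parameter-passing question above). Note that `launchType="FARGATE"` is just for brevity here: a GPU-backed Whisper task would likely need the EC2 launch type instead.

```python
import boto3

ecs = boto3.client("ecs")

# All identifiers below are hypothetical placeholders.
ecs.run_task(
    cluster="speech-to-text",
    taskDefinition="whisper-worker",
    launchType="FARGATE",
    networkConfiguration={
        "awsvpcConfiguration": {
            "subnets": ["subnet-0123456789abcdef0"],
            "assignPublicIp": "DISABLED",
        }
    },
    overrides={
        "containerOverrides": [
            {
                "name": "whisper",
                # Per-job parameters passed as env vars, matching the
                # entrypoint sketch above.
                "environment": [
                    {"name": "MEDIA_BUCKET", "value": "sul-speech-to-text"},
                    {"name": "INPUT_KEY", "value": "media/item-1234.mp3"},
                ],
            }
        ]
    },
)
```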
How do we scale/limit the number of containers running at once? This question would seem to nudge us towards ECS or similar; one crude approach is sketched below.
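One crude way to cap concurrency from the invoking side, before calling `run_task`, is to count what's already running; an ECS service with a fixed `desiredCount`, or queue-depth-based auto scaling, would be the more managed versions of the same idea:

```python
import boto3

ecs = boto3.client("ecs")
MAX_WORKERS = 5  # hypothetical cap

running = ecs.list_tasks(cluster="speech-to-text", desiredStatus="RUNNING")
if len(running["taskArns"]) < MAX_WORKERS:
    pass  # safe to call run_task() for another worker (see sketch above)
```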
Should the Docker container always work on one file at a time, should it process the whole current backlog, or should it work from a list of file names?
Asking because we're not sure whether there are efficiency or cost savings to be had by reducing the number of startups/teardowns of our speech-to-text generation container.
Container startup time is not negligible: not huge, but noticeable. Will starting a fresh container for each item in a thousand-item backlog add hours (or days) to the processing time for that batch?
We could split the difference: have the container work off whatever backlog remains (or maybe just the next item) whenever it finishes the file it was invoked for. A rough sketch of that pattern follows.
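A sketch of that split-the-difference pattern, using an S3-as-queue layout like the early prototype mentioned below (the bucket name, the `todo/`/`done/` prefixes, and the model choice are all hypothetical):

```python
import os
import subprocess

import boto3

s3 = boto3.client("s3")
BUCKET = "sul-speech-to-text"  # hypothetical
TODO_PREFIX = "todo/"          # hypothetical "queue" prefix


def process(key):
    """Download one file, run whisper on it, then move it out of todo/."""
    local = f"/tmp/{os.path.basename(key)}"
    s3.download_file(BUCKET, key, local)
    subprocess.run(
        ["whisper", local, "--model", "base", "--output_dir", "/tmp/out"],
        check=True,
    )
    # Move the key to done/ so it isn't picked up again.
    done_key = key.replace(TODO_PREFIX, "done/", 1)
    s3.copy_object(Bucket=BUCKET, CopySource={"Bucket": BUCKET, "Key": key}, Key=done_key)
    s3.delete_object(Bucket=BUCKET, Key=key)


def next_backlog_item():
    """Return one pending key, or None when the backlog is drained."""
    resp = s3.list_objects_v2(Bucket=BUCKET, Prefix=TODO_PREFIX, MaxKeys=1)
    contents = resp.get("Contents", [])
    return contents[0]["Key"] if contents else None


# Process the file we were invoked for, then keep draining the backlog,
# so one container startup is amortized over many files.
process(os.environ["INPUT_KEY"])
while (key := next_backlog_item()) is not None:
    process(key)
```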
Initial work on this used S3 as a queue. After further discussion, however, we decided it might be best to use an actual AWS queueing service like SQS (or SNS), since that would translate better to other cloud providers and should make the integration with the workflow system a bit more comprehensible. I'm going to rework #9 to listen to a queue instead of looking at S3. A minimal worker loop along those lines might look like the sketch below.
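This is only a sketch of the SQS-based worker, assuming a hypothetical queue name (`speech-to-text-todo`) and that each message body identifies one file to transcribe; none of those details are settled:

```python
import boto3


def transcribe(item_id):
    """Stand-in for the whisper invocation sketched in earlier comments."""
    ...


sqs = boto3.client("sqs")
queue_url = sqs.get_queue_url(QueueName="speech-to-text-todo")["QueueUrl"]

while True:
    # Long-poll for up to 20 seconds; an empty reply means no pending work.
    resp = sqs.receive_message(
        QueueUrl=queue_url, MaxNumberOfMessages=1, WaitTimeSeconds=20
    )
    messages = resp.get("Messages", [])
    if not messages:
        break  # or keep polling, if running as a persistent service
    msg = messages[0]
    transcribe(msg["Body"])
    # Delete only after success, so failed work reappears on the queue
    # once the visibility timeout expires.
    sqs.delete_message(QueueUrl=queue_url, ReceiptHandle=msg["ReceiptHandle"])
```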
Note: some naming might change, depending on terminology decisions, see https://github.com/orgs/sul-dlss/projects/65/views/1?pane=issue&itemId=79627337