
[investigate/prototype] speech_to_text_generation_service approach 1: Define a Docker container for running open source Whisper in a container that we define and for which we manage deployment (lives in this repo?) #3

Closed
7 tasks
Tracked by #1 ...
jmartin-sul opened this issue Sep 12, 2024 · 2 comments · Fixed by #9
Comments


jmartin-sul commented Sep 12, 2024

Keep in mind the guidelines at the top of #1, in particular the one about ultimately needing to use Terraform for deployment, and maybe the one about Whisper configuration, if considering an implementation that would make re-configuration difficult.

Note: some naming might change, depending on terminology decisions, see https://github.com/orgs/sul-dlss/projects/65/views/1?pane=issue&itemId=79627337

  • If the Docker entrypoint is a command that takes some args, runs, and exits, does the container go away on its own when it finishes?
  • dlme-transform traject example: https://github.com/sul-dlss/dlme-transform/blob/main/Dockerfile
  • Setting the entrypoint to an invocation script lets us control how the container runs; e.g. the invocation script takes params (not exactly sure how to pass those to an ECS task, but probably via env vars?).
  • Whisper is designed to run this way from the CLI, so it should be a good fit for this approach (it's also usable as a library, but it's primarily a CLI tool).
  • What runs the container? The most likely possibility at first glance seems to be AWS ECS (Elastic Container Service, for auto-scaling container-based services on demand). Another option might be to spin up and manage AWS EC2 instances more directly. Would that provide any advantage over ECS, given that it's likely more development and management work? Is it more platform agnostic (i.e. does ECS have a GCP counterpart)?
  • How do we scale/limit the number of containers running at once? This question would seem to nudge us toward ECS or similar.
  • Should the Docker container always work on one file at a time, should it process the whole current backlog, or should it work from a list of file names?
    • Asking because we're not sure if there are efficiency or cost savings to be had by reducing the number of startups/teardowns of our speech-to-text generation container.
    • Startup time for the container is not negligible: not huge, but noticeable. Will starting the container for each item in a thousand-item backlog add hours or days to the processing time for that batch?
    • We could split the difference and have the container work off whatever backlog exists, or maybe just the next item, whenever it finishes the file for which it was invoked.
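To make the entrypoint question concrete, here is a minimal sketch of an invocation wrapper that reads its parameters from environment variables (which seems like the natural way to pass per-task configuration to an ECS task) and builds the `whisper` CLI command. The env var names (`MEDIA_FILE`, `WHISPER_MODEL`, `WHISPER_OPTIONS`) are hypothetical, not a settled interface:

```python
import os
import shlex

def whisper_command() -> list[str]:
    """Build a whisper CLI invocation from environment variables.

    MEDIA_FILE is required; WHISPER_MODEL and WHISPER_OPTIONS are
    optional overrides. These names are illustrative only.
    """
    media_file = os.environ["MEDIA_FILE"]
    model = os.environ.get("WHISPER_MODEL", "base")
    extra = shlex.split(os.environ.get("WHISPER_OPTIONS", ""))
    return ["whisper", media_file, "--model", model, *extra]

# As the container ENTRYPOINT, a script like this would run the command
# and exit, so the container stops on its own when transcription is done:
#   subprocess.run(whisper_command(), check=True)
```

Because the wrapper exits when `whisper` exits, the "container disappears on its own" behavior in the first bullet falls out naturally, at the cost of one container startup per invocation.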
@jmartin-sul changed the title on Sep 12, 2024: "Possible approach 1: …" → "transcript_generation_service approach 1: …" → "[investigate/prototype] transcript_generation_service approach 1: …"
@jmartin-sul changed the title on Sep 13, 2024 to the current "[investigate/prototype] speech_to_text_generation_service approach 1: Define a Docker container for running open source Whisper in a container that we define and for which we manage deployment (lives in this repo?)"

edsu commented Sep 16, 2024

I started work on this, but I'm not able to grab the ticket since the permissions seem to prevent me from doing so.

@edsu self-assigned this Sep 16, 2024

edsu commented Sep 23, 2024

Initial work on this used S3 as a queue. However, after further discussion we decided it might be best to use an actual AWS queue like SQS (or SNS), since that would translate better to other cloud providers and should make the integration with the workflow system a bit more comprehensible. I'm going to rework #9 to listen to a queue instead of looking at S3.
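A rough sketch of what the queue-driven worker could look like at the message level. This assumes each SQS message body is JSON naming the media file and (optionally) an output location; the field names and schema here are hypothetical, not necessarily what #9 ends up using. The boto3 polling loop is sketched in a comment since the real wiring lives in #9:

```python
import json

def parse_job(message_body: str) -> dict:
    """Parse one queue message into a transcription job.

    Assumes a JSON body like {"media": "...", "output": "..."};
    the schema is illustrative only.
    """
    job = json.loads(message_body)
    if "media" not in job:
        raise ValueError("queue message missing required 'media' key")
    return {"media": job["media"], "output": job.get("output")}

# The consuming loop would look roughly like this with boto3:
#
#   sqs = boto3.client("sqs")
#   while True:
#       resp = sqs.receive_message(QueueUrl=queue_url, WaitTimeSeconds=20)
#       for msg in resp.get("Messages", []):
#           job = parse_job(msg["Body"])
#           run_whisper(job)  # hypothetical helper wrapping the whisper CLI
#           sqs.delete_message(QueueUrl=queue_url,
#                              ReceiptHandle=msg["ReceiptHandle"])
```

Deleting the message only after the job succeeds means a crashed worker lets SQS redeliver the message after the visibility timeout, which is part of what makes this more robust than polling S3 directly.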
