sul-dlss · jmartin-sul · Jan 30, 2025 · Dec 20, 2024 · jmartin-sul · Jan 30, 2025
diff --git a/.github/workflows/deploy-prod.yml b/.github/workflows/deploy-prod.yml
@@ -0,0 +1,27 @@
+# Build and deploy a Docker image to the production AWS environment
+# when a new release has been created.
+
+name: Deploy to Production
+
+on:
+  release:
+    types:
+      - published
+
+jobs:
+  deploy-prod:
+    runs-on: ubuntu-latest
+    steps:
+
+      - name: checkout
+        uses: actions/checkout@v3
+
+      - name: Build and push Docker image to production
+        env:
+          AWS_ACCESS_KEY_ID: ${{ secrets.AWS_ACCESS_KEY_ID_PRODUCTION }}
+          AWS_SECRET_ACCESS_KEY: ${{ secrets.AWS_SECRET_ACCESS_KEY_PRODUCTION }}
+          AWS_ECR_DOCKER_REPO: ${{ secrets.AWS_ECR_DOCKER_REPO_PRODUCTION }}
+        run: |
+          echo "production deploy not yet enabled"
+          # uncomment this when the keys are available!
+          # ./deploy.sh
diff --git a/.github/workflows/deploy.yml b/.github/workflows/deploy.yml
@@ -0,0 +1,31 @@
+# Build and deploy a Docker image to development and staging AWS environments
+# when a tagged version is created during weekly dependency updates.
+
+name: Deploy
+
+on:
+  push:
+    tags:
+      - 'rel-*-*-*'
+
+jobs:
+  deploy-stage-qa:
+    runs-on: ubuntu-latest
+    steps:
+
+      - name: checkout
+        uses: actions/checkout@v3
+
+      - name: Build and push Docker image to development (qa in SDR)
+        env:
+          AWS_ACCESS_KEY_ID: ${{ secrets.AWS_ACCESS_KEY_ID_DEVELOPMENT }}
+          AWS_SECRET_ACCESS_KEY: ${{ secrets.AWS_SECRET_ACCESS_KEY_DEVELOPMENT }}
+          AWS_ECR_DOCKER_REPO: ${{ secrets.AWS_ECR_DOCKER_REPO_DEVELOPMENT }}
+        run: ./deploy.sh
+
+      - name: Build and push Docker image to staging
+        env:
+          AWS_ACCESS_KEY_ID: ${{ secrets.AWS_ACCESS_KEY_ID_STAGING }}
+          AWS_SECRET_ACCESS_KEY: ${{ secrets.AWS_SECRET_ACCESS_KEY_STAGING }}
+          AWS_ECR_DOCKER_REPO: ${{ secrets.AWS_ECR_DOCKER_REPO_STAGING }}
+        run: ./deploy.sh
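Review note: the workflow above fires on pushed tags matching `rel-*-*-*`, and the README change later in this diff gives `rel-2025-01-01` as an example tag. The following is a minimal sketch of how such a tag name might be built and sanity-checked locally before pushing. The function names `release_tag` and `matches_deploy_pattern` are hypothetical and not part of this PR, and the shell `case` glob only approximates GitHub's tag filter matching.

```shell
#!/bin/bash

# Hypothetical helper: build a tag name such as rel-2025-01-01 from a date
# given as the first argument (defaults to today's date).
release_tag() {
  local day="${1:-$(date +%Y-%m-%d)}"
  printf 'rel-%s\n' "$day"
}

# Hypothetical helper: succeed if the tag matches the workflow's
# 'rel-*-*-*' filter. Shell globbing is only an approximation of the
# GitHub Actions filter syntax.
matches_deploy_pattern() {
  case "$1" in
    rel-*-*-*) return 0 ;;
    *) return 1 ;;
  esac
}
```

Assuming these helpers are sourced, something like `tag=$(release_tag) && matches_deploy_pattern "$tag" && git tag "$tag" && git push origin "$tag"` would presumably be what triggers the development and staging deploy.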
diff --git a/README.md b/README.md
@@ -27,45 +27,38 @@ terraform validate
 terraform apply
 ```
 
-## Build Docker Image
+## Build and Deploy
 
-To build the container you will need to first download the pytorch models that Whisper uses. This is about 13GB of data and can take some time! The idea here is to bake the models into Docker image so they don't need to be fetched dynamically every time the container runs (which will add to the runtime). If you know you only need one size model, and want to just include that then edit the `whisper_models/urls.txt` file accordingly before running the `wget` command.
+In order to use the service you will need to build the speech-to-text Docker image and deploy it to ECR, where it will be picked up by Batch. You can use the provided `deploy.sh` script to do this.
 
-```shell
-wget --directory-prefix whisper_models --input-file whisper_models/urls.txt
-```
-
-Then you can build the image:
+Before running it you will need to define three environment variables using the values that Terraform has created for you, which you can inspect by running `terraform output`:
 
-```shell
-docker build --tag sul-speech-to-text .
 ```
-
-## Push Docker Image
-
-You will need to push your Docker image to the ECR repository that Terraform created. You can ask Terraform for the repository URL that it created. For example mine is:
-
-```shell
-terraform output docker_repository
-"482101366956.dkr.ecr.us-east-1.amazonaws.com/edsu-speech-to-text-qa"
+$ terraform output
+
+batch_job_definition = "arn:aws:batch:us-west-2:1234567890123:job-definition/sul-speech-to-text-qa"
+batch_job_queue = "arn:aws:batch:us-west-2:1234567890123:job-queue/sul-speech-to-text-qa"
+docker_repository = "1234567890123.dkr.ecr.us-west-2.amazonaws.com/sul-speech-to-text-qa"
+ecs_instance_role = "sul-speech-to-text-qa-ecs-instance-role"
+s3_bucket = "arn:aws:s3:::sul-speech-to-text-qa"
+sqs_done_queue = "https://sqs.us-west-2.amazonaws.com/1234567890123/sul-speech-to-text-done-qa"
+text_to_speech_access_key_id = "XXXXXXXXXXXXXX"
+text_to_speech_secret_access_key = <sensitive>
+
+$ terraform output text_to_speech_secret_access_key
+"XXXXXXXXXXXXXXXXXXXXXXXX"
 ```
 
-Tag your Docker image with the ECR URL:
+You will want to set these in your environment:
 
-```shell
-docker tag speech-to-text YOUR-ECR-URL
-```
-
-Ensure your Docker client is logged in:
-
-```shell
-aws ecr get-login-password | docker login --username AWS --password-stdin YOUR-ECR-URL
-```
+- AWS_ACCESS_KEY_ID: the `text_to_speech_access_key_id` value
+- AWS_SECRET_ACCESS_KEY: the `text_to_speech_secret_access_key` value
+- AWS_ECR_DOCKER_REPO: the `docker_repository` value
 
-And then you can push the Docker image:
+Then you can run the deploy:
 
-```shell
-docker push YOUR-ECR-URL
+```bash
+$ ./deploy.sh
 ```
 
 ## Run
@@ -282,7 +275,11 @@ If you get no result, install with:
 
 `brew install ffmpeg`
 
-## Updating Docker Image
+## Continuous Integration
+
+This GitHub repository is set up with a GitHub Actions workflow that automatically deploys tagged releases, e.g. `rel-2025-01-01`, to the DLSS development and staging AWS environments. When a GitHub release is published, it is automatically deployed to the production AWS environment.
+
+## Development Notes
 
 When updating the base Docker image, in order to prevent random segmentation faults you will want to make sure that:
 

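Review note: the `Build and Deploy` README section in the hunk above asks the reader to copy three values from `terraform output` into the environment by hand. As a rough sketch, the same mapping could be scripted. The function name `load_deploy_env` is hypothetical and not part of this PR, and the sketch assumes a Terraform version that supports `terraform output -raw` and that it is run from the directory holding the relevant Terraform state.

```shell
#!/bin/bash

# Sketch: export the three variables deploy.sh expects, reading each value
# from the Terraform outputs named in the README. Returns non-zero if any
# of the terraform calls fails.
load_deploy_env() {
  AWS_ACCESS_KEY_ID="$(terraform output -raw text_to_speech_access_key_id)" || return 1
  AWS_SECRET_ACCESS_KEY="$(terraform output -raw text_to_speech_secret_access_key)" || return 1
  AWS_ECR_DOCKER_REPO="$(terraform output -raw docker_repository)" || return 1
  export AWS_ACCESS_KEY_ID AWS_SECRET_ACCESS_KEY AWS_ECR_DOCKER_REPO
}
```

With the function sourced into the current shell, `load_deploy_env && ./deploy.sh` would then stand in for setting the variables manually.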
diff --git a/deploy.sh b/deploy.sh
@@ -0,0 +1,35 @@
+#!/bin/bash
+
+# The following environment variables will need to be set in order to push the
+# new speech-to-text Docker image:
+#
+# - AWS_ACCESS_KEY_ID: the access key for the speech-to-text user
+# - AWS_SECRET_ACCESS_KEY: the secret key for the speech-to-text user
+# - AWS_ECR_DOCKER_REPO: the Elastic Container Registry URL for the Docker repository
+#
+# The values can be obtained by running `terraform output` in the relevant portion of
+# the Terraform configuration.
+
+# Exit immediately if something doesn't work
+
+set -e
+
+# Download the Whisper large-v3 model, which is what we use by default. Building
+# the image with the model in it already will speed up processing since whisper
+# won't need to pull it dynamically.
+
+wget --timestamping --directory-prefix whisper_models https://openaipublic.azureedge.net/main/whisper/models/e5b1a55b89c1367dacf97e3e19bfd829a01529dbfdeefa8caeb59b3f1b81dadb/large-v3.pt
+
+# Log in to ECR
+
+aws ecr get-login-password --region us-west-2 | docker login --username AWS --password-stdin "$AWS_ECR_DOCKER_REPO"
+
+# Build the image for linux/amd64 (the explicit platform is not really needed when running in GitHub Actions)
+
+docker build -t speech-to-text --platform="linux/amd64" .
+
+# Tag and push the image to ECR
+
+docker tag speech-to-text "$AWS_ECR_DOCKER_REPO"
+
+docker push "$AWS_ECR_DOCKER_REPO"
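Review note: the header comment in `deploy.sh` lists three required environment variables, but the script only discovers a missing one after the model download, when the ECR login fails. The following is a minimal sketch of a pre-flight check that could run first. The function name `missing_deploy_vars` is hypothetical and not part of this PR.

```shell
#!/bin/bash

# Hypothetical helper: report every required variable that is unset or
# empty on stderr, and return non-zero if at least one is missing.
missing_deploy_vars() {
  local missing=0 name value
  for name in AWS_ACCESS_KEY_ID AWS_SECRET_ACCESS_KEY AWS_ECR_DOCKER_REPO; do
    # Look up the variable whose name is stored in $name. The names come
    # from the fixed list above, so eval never sees untrusted input.
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "$name is not set" >&2
      missing=1
    fi
  done
  return "$missing"
}
```

Because the script uses `set -e`, a call such as `missing_deploy_vars || exit 1` placed before the `wget` step would stop the deploy early with a list of what is missing.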
diff --git a/speech_to_text.py b/speech_to_text.py
@@ -280,7 +280,7 @@ def load_whisper_model(model_name) -> whisper.model.Whisper:
 def create(media_path: Path):
     """
     Create a job for a given media file by placing the media file in S3 and then
-    creating a batch job which can be picked up ot perform transcription using
+    creating a batch job which can be picked up to perform transcription using
     boilerplate options.
     """
     job_id = str(uuid.uuid4())