This project uses Google Cloud Platform, with infrastructure provisioned from code with Terraform.
`/backend/init` maintains the infrastructure needed to set up the Terraform state bucket for the rest of the infrastructure. `/backend/terraform` maintains all other infrastructure.
The only non-managed resources are provided by Firebase (used for database, hosting and authentication).
Steps that need to be performed:
- Run `terraform init` and `terraform apply` in `/backend/init` to create the Terraform state bucket.
  - Note: A number of applies may be needed, as Google Cloud APIs are enabled asynchronously.
- Run `terraform init` and `terraform apply` in `/backend/terraform` to create the rest of the infrastructure.
  - Note: The service account `terraform-applier` must be used for Firebase Auth resources, but it has limited access for security. The service account must be created first along with the rest of the resources by an admin user using `terraform apply`.
  - `terraform apply` takes `worker_image_tag` as an input, which is the tag of the Docker image to deploy to the worker agent. The image must be built and pushed to the Google Container Registry first.
- Using the Firebase CLI, run `firebase init` in `/frontend` to set up a Firebase project.
- In the Firebase Console, enable the following services for the project:
  - Authentication
  - Cloud Firestore
  - Hosting
- Enable Google as a sign-in provider in the Firebase Console. Add desired domains to Authorized Domains.
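
The two Terraform steps in the list above can be scripted. The following is a minimal sketch, not part of the repository: the helper names (`terraform_command`, `run_in`, `apply_with_retries`), the retry count and the delay are all assumptions chosen for illustration. It re-runs `terraform apply` a few times, since an apply may fail while Google Cloud APIs are still being enabled.

```python
import subprocess
import time
from typing import Callable, Dict, List, Optional, Sequence


def terraform_command(action: str, variables: Optional[Dict[str, str]] = None) -> List[str]:
    """Build a Terraform CLI command, e.g. `terraform apply -var=name=value`."""
    command = ["terraform", action]
    if action == "apply":
        command.append("-auto-approve")  # skip the interactive confirmation
    for name, value in (variables or {}).items():
        command.append(f"-var={name}={value}")
    return command


def run_in(directory: str, command: Sequence[str]) -> bool:
    """Run a command in `directory`; return True when it exits with status 0."""
    return subprocess.run(list(command), cwd=directory).returncode == 0


def apply_with_retries(
    directory: str,
    variables: Optional[Dict[str, str]] = None,
    max_attempts: int = 3,        # assumed value, not from the project
    delay_seconds: float = 30.0,  # assumed value, not from the project
    run: Callable[[str, Sequence[str]], bool] = run_in,
) -> int:
    """Run `terraform init`, then retry `terraform apply`; return attempts used."""
    if not run(directory, terraform_command("init")):
        raise RuntimeError(f"terraform init failed in {directory}")
    for attempt in range(1, max_attempts + 1):
        if run(directory, terraform_command("apply", variables)):
            return attempt
        if attempt < max_attempts:
            time.sleep(delay_seconds)  # give asynchronous API enablement time to finish
    raise RuntimeError(f"terraform apply failed {max_attempts} times in {directory}")
```

From the repository root, this could be called as `apply_with_retries("backend/init")` followed by `apply_with_retries("backend/terraform", {"worker_image_tag": tag})`, where `tag` is the tag of an image that has already been pushed.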
The frontend (`/frontend`) is an SPA (single-page application) using React and TypeScript, with Vite for bundling. It is hosted on Firebase, which is also used for authentication.
The cloud function (`/backend/cloud_function`) handles user requests originating from the frontend. It creates tasks and assigns them to the GPU-equipped worker agents by publishing messages to the Pub/Sub topic `worker-agent-requests`.
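
As an illustration of the publishing step, here is a minimal sketch. It is not the project's actual cloud function: the JSON encoding and the helper names `encode_task` and `publish_task` are assumptions. The `publisher` argument is expected to behave like `google.cloud.pubsub_v1.PublisherClient`, which is passed in rather than imported so the sketch can be run without Google Cloud credentials.

```python
import json
from typing import Any, Dict

TOPIC_ID = "worker-agent-requests"  # topic name taken from the text above


def encode_task(task: Dict[str, Any]) -> bytes:
    """Serialize a task specification to bytes (JSON is an assumed format)."""
    return json.dumps(task, sort_keys=True).encode("utf-8")


def publish_task(publisher: Any, project_id: str, task: Dict[str, Any]) -> str:
    """Publish one task and return the message ID reported by the publisher.

    `publisher` should offer `topic_path(project, topic)` and
    `publish(topic, data)`, as `pubsub_v1.PublisherClient` does.
    """
    topic_path = publisher.topic_path(project_id, TOPIC_ID)
    future = publisher.publish(topic_path, encode_task(task))
    return future.result()  # blocks until the publish is confirmed
```

In a deployed function, `publisher` would typically be a `pubsub_v1.PublisherClient()` created once at module load and reused across requests.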
The autoscaler keeps the number of worker agents equal to the number of unacknowledged messages in this Pub/Sub topic.
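
The scaling rule can be written as a one-line function. This is only a sketch of the idea: the function name and the `max_workers` cap are assumptions added for illustration, since the text does not say whether an upper bound is configured.

```python
def desired_worker_count(unacked_messages: int, max_workers: int) -> int:
    """Return one worker per unacknowledged message, up to an assumed cap."""
    if unacked_messages < 0 or max_workers < 0:
        raise ValueError("counts must be non-negative")
    return min(unacked_messages, max_workers)
```

With no pending messages the result is zero, so no GPU workers would be kept running while the queue is empty.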
The worker agent (`/backend/worker_agent`) retrieves its task specification from the same Pub/Sub topic.
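
On the receiving side, a worker might handle each message roughly as follows. This is a hedged sketch, not the project's worker script: it assumes tasks are JSON-encoded, and `handle_message` and `run_task` are hypothetical names. The `message` argument is expected to expose `data`, `ack()` and `nack()`, as messages delivered by the `google-cloud-pubsub` subscriber client do.

```python
import json
from typing import Any, Callable, Dict


def handle_message(message: Any, run_task: Callable[[Dict[str, Any]], None]) -> bool:
    """Decode a task, run it, and acknowledge the message on success.

    Returns True if the task completed, False if it was handed back.
    """
    try:
        task = json.loads(message.data.decode("utf-8"))  # assumed JSON payload
        run_task(task)
    except Exception:
        message.nack()  # hand the message back so it can be redelivered
        return False
    message.ack()  # acknowledged only after the task has finished
    return True
```

Acknowledging after the work is done means a message stays unacknowledged while its task is running, which fits a scaling rule based on unacknowledged messages.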
The worker agent script and cloud function are both written in Python.
GitHub Actions workflows (`.github/workflows`) are used for continuous delivery, deploying both the frontend and the backends on every merge to the main branch.
Keyless authentication with Google Cloud is achieved using a Workload Identity Pool.