Add documentation for running Arroyo on Nomad

ArroyoSystems · Apr 18, 2023 · f3359b9
1 parent 70a2c7f
commit f3359b9
Showing 5 changed files with 150 additions and 28 deletions.
diff --git a/deployment/arroyo-cluster.mdx → deployment/ec2.mdx b/deployment/arroyo-cluster.mdx → deployment/ec2.mdx
@@ -1,56 +1,47 @@
 ---
-title: Arroyo Cluster
-description: "Running a distributed Arroyo cluster"
+title: Deploying to EC2
+description: "Setting up an Arroyo cluster on EC2"
 ---
 
-While the single-node Arroyo cluster is useful for testing and development, it is not suitable for production. This
-page describes how to run a production-ready distributed Arroyo cluster using either Arroyo's built-in scheduler or
-[nomad](https://www.nomadproject.io/).
+This document covers how to run an Arroyo cluster on raw EC2 instances. This requires a good understanding of the
+Arroyo architecture. For an easier approach to running a production-quality Arroyo cluster, see the docs for running
+on top of [Nomad](/deployment/nomad). Kubernetes support is also coming soon.
 
-Before attempting to run a cluster, you should familiarize yourself with the [Arroyo architecture](/architecture). We
-are also happy to support users rolling out their own clusters, so please reach out to us at support@arroyo.systems with
-any questions.
+Before starting this guide, follow the common setup steps in the [deployment overview](/deployment/overview) guide.
 
-You will also need to set up a dev environment, as we do not yet distribute binaries. See the [dev setup](/dev-setup)
-instructions.
+We don't currently distribute binaries for Arroyo, so you will need to build the binaries yourself. Follow the
+[dev setup](/developing/dev-setup) guide to learn how.
 
-## Common Setup
+## Running the migrations
 
-### Postgres
-
-Arroyo relies on a postgres database to store configuration data and metadata. You will need to create a database
-(by default called `arroyo`, but this can be configured) and run the migrations to set it up.
+As covered in the dev setup, you will need to run the database migrations on your prod database before starting the
+services.
 
 We use [refinery](https://github.com/rust-db/refinery) to manage migrations. To run the migrations on your database,
 run these commands from your checkout of arroyo:
 
 ```bash
 $ cargo install refinery_cli
-$ refinery setup # follow the directions
+$ refinery setup # follow the directions, configuring for your prod database
 $ refinery migrate -p arroyo-api/migrations
 ```
 
-### S3
-
-You will need to create a S3 bucket (or an equivalent service that exposes an S3-compatible API) to store checkpoints.
-This will need to be writable by the nodes that are running the Arroyo controller and workers.
-
 ## Running the services
 
-There are two options for running aa distributed cluster. You can either use Arroyo's built-in scheduler and nodes, or
+There are two options for running a distributed cluster. You can either use Arroyo's built-in scheduler and nodes, or
 you can use [nomad](https://www.nomadproject.io/). Nomad is currently the recommended option, for production usecases.
 The Arroyo services can be run via Nomad, or separately on VMs.
 
 ### Arroyo Services
 
 An Arroyo cluster consists of more or more arroyo-api process and a single arroyo-controller process. This can be run
-however you would like, and may be run on a single machine or on multiple machines.
+however you like, and may be run on a single machine or on multiple machines. To achieve high availability on the API
+layer, you will need to run multiple instances behind a load balancer (such as an ALB).
 
 The arroyo-api server exposes a gRPC API on port 8001 by default, and serves static HTML and JS for the web UI on port
-8000. These can be put behind a load balancer (such as an ALB) for high availability. If the API and controller are not
-running on the same machine, the API needs to be configured with the endpoint of the controller's gRPC API via the
-`CONTROLLER_ADDR` environment variable. By default, the controller runs its gRPC API on port 9190. If the controller's
-hostname is `arroyo-controller.int` then the API would be configured with
+8000. If the API and controller are not running on the same machine, the API needs to be configured with the endpoint of
+the controller's gRPC API via the `CONTROLLER_ADDR` environment variable. By default, the controller runs its gRPC API
+on port 9190. If the controller's hostname is `arroyo-controller.int` then the API would be configured with
 `CONTROLLER_ADDR=http://arroyo-controller.int:9190`.
 
 Both arroyo-api and arroyo-controller additionally need to be configured with the database connection information via

diff --git a/deployment/kubernetes.mdx b/deployment/kubernetes.mdx
@@ -0,0 +1,6 @@
+---
+title: Deploying to Kubernetes
+description: "Running an Arroyo cluster on Kubernetes"
+---
+
+Coming soon.
diff --git a/deployment/nomad.mdx b/deployment/nomad.mdx
@@ -0,0 +1,93 @@
+---
+title: Deploying to Nomad
+description: "Running an Arroyo cluster on Nomad"
+---
+
+Arroyo supports Nomad as both a _scheduler_ (for running Arroyo pipeline tasks) and as a deploy target for the Arroyo
+control plane. This is currently the easiest way to get a production-quality Arroyo cluster running.
+
+Before starting this guide, follow the common setup steps in the [deployment overview](/deployment/overview) guide.
+
+This guide assumes a working Nomad cluster. It has been tested with Nomad >= 1.4, but should work with 1.3 as well. See
+the [Nomad documentation](https://www.nomadproject.io/docs) for more information.
+
+Note that all of the components of Arroyo (controller, compiler, and workers) need to be able to access S3. You will
+need to ensure that the Nomad cluster has access to the S3 bucket you will be using.
+
+## Install nomad pack
+
+For ease of installation, we distribute a nomad pack that can be used to install Arroyo on Nomad. To use the pack, you
+will first need to install nomad-pack. Follow the documentation
+[here](https://developer.hashicorp.com/nomad/tutorials/nomad-pack/nomad-pack-intro).
+
+Once `nomad-pack` is available on your machine you are ready to proceed.
+
+## Add the Arroyo registry
+
+The Arroyo pack is available in the [Arroyo registry](https://github.com/ArroyoSystems/arroyo-nomad-pack).
+
+To add the registry, run the following command:
+
+```bash
+$ nomad-pack registry add arroyo \
+    https://github.com/ArroyoSystems/arroyo-nomad-pack.git
+```
+
+## Configuring the pack
+
+There are a number of variables that can be configured to customize the Arroyo deployment:
+
+| Variable | Description |
+| --- | --- |
+| `job_name` | The name of the Nomad job for the Arroyo cluster |
+| `region` | The region where jobs will be deployed |
+| `datacenters` | A list of datacenters in the region which are eligible for task placement |
+| `prometheus_endpoint` | Endpoint for prometheus with protocol, required for job metrics (for example `http://prometheus.service:9090`) |
+| `prometheus_auth` | Basic authentication for prometheus if required |
+| `postgres_host` | Host of your postgres database |
+| `postgres_port` | Port of your postgres database |
+| `postgres_db` | Name of your postgres database |
+| `postgres_user` | User of your postgres database |
+| `postgres_password` | Password of your postgres database |
+| `s3_bucket` | S3 bucket to store checkpoints and pipeline artifacts |
+| `s3_region` | Region for the s3 bucket |
+| `nomad_api` | Nomad API endpoint with protocol (for example `http://nomad.service:4646`) |
+| `compiler_resources` | Controls the CPU and memory to use for the compiler; at least 2 GB of memory is required |
+| `controller_resources` | The resources for the controller and API |
+
+Of these, at least the postgres configuration and the s3 bucket configuration are required.
+
+## Deploying the Arroyo pack
+
+Now we're ready to actually deploy our Arroyo cluster! Here's an example command line:
+
+```bash
+$ nomad-pack run arroyo --registry=arroyo \
+    --var arroyo.postgres_db=arroyo \
+    --var arroyo.postgres_host=postgres-host.cluster \
+    --var arroyo.postgres_user=arroyodb \
+    --var arroyo.postgres_password=arroyodb \
+    --var arroyo.datacenters='["us-east-1"]' \
+    --var arroyo.s3_bucket=arroyo-prod \
+    --var arroyo.prometheus_endpoint="http://prometheus.cluster:9090"
+```
+
+You will need to adjust the variables as appropriate for your environment.
+
+## Accessing the Arroyo API
+
+Once the pack has been deployed, you can access the Arroyo UI by visiting the address of the `api-http` service. By
+default, this service is assigned a dynamic port.
+
+To find the endpoint and port, run the following command:
+
+```bash
+$ nomad service info api-http
+```
+
+Visit the address in your browser to access the Arroyo UI.
+
+## Having trouble?
+
+Reach out to us at support@arroyo.systems or on our [Discord](https://discord.gg/cjCr5rVmyR) if you have any questions
+or issues.
diff --git a/deployment/overview.mdx b/deployment/overview.mdx
@@ -0,0 +1,29 @@
+---
+title: Overview
+description: "Running a distributed Arroyo cluster"
+---
+
+While the single-node Arroyo cluster is useful for testing and development, it is not suitable for production. This
+page describes how to run a production-ready distributed Arroyo cluster using either Arroyo's built-in scheduler or
+[nomad](https://www.nomadproject.io/).
+
+Before attempting to run a cluster, you should familiarize yourself with the [Arroyo architecture](/architecture). We
+are also happy to support users rolling out their own clusters, so please reach out to us at support@arroyo.systems or
+on Discord with any questions.
+
+## Common Setup
+
+### Postgres
+
+Arroyo relies on a postgres database to store configuration data and metadata. You will need to create a database
+(by default called `arroyo`, but this can be configured).
+
+### S3
+
+You will need to create an S3 bucket to store checkpoints and artifacts. This will need to be writable by the nodes
+that are running the Arroyo controller and workers.
+
+### Prometheus
+
+The Arroyo Web UI can show job metrics to help monitor job progress. To enable this, you will need to set up a Prometheus
+server. See the [prometheus documentation](https://prometheus.io/docs/introduction/overview/) for more details.
diff --git a/mint.json b/mint.json
@@ -51,7 +51,10 @@
     {
       "group": "Deployment",
       "pages": [
-        "deployment/arroyo-cluster"
+        "deployment/overview",
+        "deployment/ec2",
+        "deployment/nomad",
+        "deployment/kubernetes"
       ]
     },
     {