generated from mintlify/starter
-
Notifications
You must be signed in to change notification settings - Fork 10
Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
Add documentation for running Arroyo on Nomad
- Loading branch information
Showing
5 changed files
with
150 additions
and
28 deletions.
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,6 @@ | ||
--- | ||
title: Deploying to Kubernetes | ||
description: "Running an Arroyo cluster on Kubernetes" | ||
--- | ||
|
||
Coming soon. |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,93 @@ | ||
--- | ||
title: Deploying to Nomad | ||
description: "Running an Arroyo cluster on Nomad" | ||
--- | ||
|
||
Arroyo supports Nomad as both a _scheduler_ (for running Arroyo pipeline tasks) and as as a deploy target for the Arroyo | ||
control plane. This is currently the easiest way to get a production quality Arroyo cluster running. | ||
|
||
Before starting this guide, follow the common setup steps in the [deployment overview](/deployment/overview) guide. | ||
|
||
This guide assumes a working Nomad cluster. It has been tested with Nomad >= 1.4, but should work with 1.3 as well. See | ||
the [Nomad documentation](https://www.nomadproject.io/docs) for more information. | ||
|
||
Note that all of the components of Arroyo (controller, compiler, and workers) need to be able to access S3. You will | ||
need to ensure that the Nomad cluster has access to the S3 bucket you will be using. | ||
|
||
## Install nomad pack | ||
|
||
For ease of installation, we distribute a nomad pack that can be used to install Arroyo on Nomad. To use the pack, you | ||
will first need to install nomad-pack. Follow the documentation | ||
[here](https://developer.hashicorp.com/nomad/tutorials/nomad-pack/nomad-pack-intro). | ||
|
||
Once `nomad-pack` is available on your machine you are ready to proceed. | ||
|
||
## Add the Arroyo registry | ||
|
||
The Arroyo pack is available in the [Arroyo registry](https://github.com/ArroyoSystems/arroyo-nomad-pack) | ||
|
||
To add the registry, run the following command: | ||
|
||
```bash | ||
$ nomad-pack registry add arroyo \ | ||
https://github.com/ArroyoSystems/arroyo-nomad-pack.git | ||
``` | ||
|
||
## Configuring the pack | ||
|
||
There are a number of variables that can be configured to customize the Arroyo deployment: | ||
|
||
| Variable | Description | | ||
| --- | --- | | ||
| `job_name` | The name of Nomad job for the Arroyo cluster | | ||
| `region` | The region where jobs will be deployed | | ||
| `datacenters` | A list of datacenters in the region which are eligible for task placement | | ||
| `prometheus_endpoint` | Endpoint for prometheus with protocol, required for job metrics (for example `http://prometheus.service:9090`) | | ||
| `prometheus_auth` | Basic authentication for prometheus if required | | ||
| `postgres_host` | Host of your postgres database | | ||
| `postgres_port` | Port of your postgres database | | ||
| `postgres_db` | Name of your postgres database | | ||
| `postgres_user` | User of your postgres database | | ||
| `postgres_password` | Password of your postgres database | | ||
| `s3_bucket` | S3 bucket to store checkpoints and pipeline artifacts | | ||
| `s3_region` | Region for the s3 bucket | | ||
| `nomad_api` | Nomad API endpoint with protocol (for example `http://nomad.service:4646`) | | ||
| `compiler_resources` | Controls the CPU and memory to use for the compiler; at least 2 GB of memory is required | | ||
| `controller_resources` | The resources for the controller and API | | ||
|
||
Of these, at least the postgres configuration and the s3 bucket configuration are required. | ||
|
||
## Deploying the Arroyo pack | ||
|
||
Now we're ready to actually deploy our Arroyo cluster! Here's an example command line: | ||
|
||
```bash | ||
$ nomad-pack run arroyo --registry=arroyo \ | ||
--var arroyo.postgres_db=arroyo \ | ||
--var arroyo.postgres_host=postgres-host.cluster \ | ||
--var arroyo.postgres_user=arroyodb \ | ||
--var arroyo.postgres_password=arroyodb \ | ||
--var arroyo.datacenters='["us-east-1"]' \ | ||
--var arroyo.s3_bucket=arroyo-prod \ | ||
--var arroyo.prometheus_endpoint="http://prometheus.cluster:9090" | ||
``` | ||
|
||
You will need to adjust the variables as appropriate for your environment. | ||
|
||
## Accesing the Arroyo API | ||
|
||
Once the pack has been deployed, you can access the Arroyo UI by visiting the address of the `api-http` service. By | ||
default, this has a dynamic port. | ||
|
||
To find the endpoint and port, run the following command: | ||
|
||
```bash | ||
$ nomad service info api-http | ||
``` | ||
|
||
Visit the address in your browser to access the Arroyo UI. | ||
|
||
## Having trouble? | ||
|
||
Reach out to us at support@arroyo.systems or on our [Discord](https://discord.gg/cjCr5rVmyR) if you have any questions | ||
or issues. |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,29 @@ | ||
--- | ||
title: Overview | ||
description: "Running a distributed Arroyo cluster" | ||
--- | ||
|
||
While the single-node Arroyo cluster is useful for testing and development, it is not suitable for production. This | ||
page describes how to run a production-ready distributed Arroyo cluster using either Arroyo's built-in scheduler or | ||
[nomad](https://www.nomadproject.io/). | ||
|
||
Before attempting to run a cluster, you should familiarize yourself with the [Arroyo architecture](/architecture). We | ||
are also happy to support users rolling out their own clusters, so please reach out to us at support@arroyo.systems or | ||
on discord with any questions. | ||
|
||
## Common Setup | ||
|
||
### Postgres | ||
|
||
Arroyo relies on a postgres database to store configuration data and metadata. You will need to create a database | ||
(by default called `arroyo`, but this can be configured). | ||
|
||
### S3 | ||
|
||
You will need to create a S3 bucket to store checkpoints and artifacts. This will need to be writable by the nodes | ||
that are running the Arroyo controller and workers. | ||
|
||
### Prometheus | ||
|
||
The Arroyo Web UI can show job metrics to help monitor job progress. To enable this, you will need to set up a Prometheus | ||
server. See the [prometheus documentation](https://prometheus.io/docs/introduction/overview/) for more details. |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters