filecoin-iac


The Glif Infrastructure as Code is managed by Protofire.

Summary

This repository provides Infrastructure as Code (IaC) resources to configure and deploy the Glif.io infrastructure.

Installation

To deploy the infrastructure, do the following:

  1. One-time operation, required only for the first installation. In the init_configuration folder:
    1. Run terraform init
    2. Run terraform apply -var-file=tfvars/filecoin-glif-dev-apn1.tfvars. This creates the backend S3 bucket and DynamoDB tables for the dev environment.
    3. Run terraform apply -var-file=tfvars/filecoin-glif-mainnet-apn1.tfvars. This creates the backend S3 bucket and DynamoDB tables for the mainnet environment.
  2. In the aws folder:
    1. Run terraform init -backend-config=../filecoin-glif-dev-apn1.hcl to initialize the module with the backend configured in the previous step.
    2. Run terraform workspace select filecoin-glif-dev-apn1 to select the dev workspace.
    3. Plan and apply the configuration with -var-file=tfvars/filecoin-glif-dev-apn1.tfvars.
    4. Repeat these steps for the filecoin-glif-mainnet-apn1 workspace.
  3. In the k8s folder, repeat the actions from step 2, using the workspace names filecoin-dev-apn1-glif-eks and filecoin-mainnet-apn1-glif-eks respectively.
  4. In the user_management folder, initialize the module with -backend-config=../filecoin-glif-dev-apn1.hcl only and select the default workspace. Then apply the config with -var-file=tfvars/filecoin-users.tfvars.
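The command order in step 2 can be sketched as a small helper that only builds the command strings and does not run Terraform. The function name terraform_commands is hypothetical, and the sketch assumes that "plan and apply" means running terraform plan followed by terraform apply with the same -var-file.

```python
from typing import List


def terraform_commands(workspace: str, backend_config: str, var_file: str) -> List[str]:
    """Return the commands for one workspace, in the order used in step 2."""
    return [
        f"terraform init -backend-config={backend_config}",
        f"terraform workspace select {workspace}",
        f"terraform plan -var-file={var_file}",
        f"terraform apply -var-file={var_file}",
    ]


# Step 2 for the dev environment; the commands are meant to run from the aws folder.
dev_aws = terraform_commands(
    workspace="filecoin-glif-dev-apn1",
    backend_config="../filecoin-glif-dev-apn1.hcl",
    var_file="tfvars/filecoin-glif-dev-apn1.tfvars",
)

if __name__ == "__main__":
    print("\n".join(dev_aws))
```

Printing the list first and pasting the commands by hand keeps the apply step under manual control.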

Structure

To help you navigate the repository, here's a quick guide.

Architecture

The highest-level view of the infrastructure is shown in the image below.

C4 Software System

A more detailed view of how the infrastructure works is shown in the image below.

C4 Containers

API

API Endpoints

Requests to the Internal Load Balancer are redirected to either dev-internal.dev.node.glif.io or mainnet-internal.node.glif.io, depending on the environment. The services that handle these requests are described in the ingress rules.

| Endpoint | Destination |
| --- | --- |
| GET / | https://node.glif.link |
| OPTIONS / | Mock Endpoint |
| POST / | Internal Load Balancer |
| GET /dilutedsupply | https://circulatingsupply.s3.amazonaws.com/diluted_supply.html |
| GET /rpc/v0 | https://node.glif.link |
| POST /rpc/v0 | Internal Load Balancer |
| OPTIONS /rpc/v0 | Mock Endpoint |
| GET /rpc/v1 | https://node.glif.link |
| POST /rpc/v1 | Internal Load Balancer |
| OPTIONS /rpc/v1 | Mock Endpoint |
| GET /statecirculatingsupply | Internal Load Balancer |
| GET /statecirculatingsupply/fil | https://circulatingsupply.s3.amazonaws.com/index.html |
| GET /statecirculatingsupply/fil/v2 | https://circulatingsupply-staging.s3.amazonaws.com/index.html |
| GET /vmcirculatingsupply | Internal Load Balancer |
| ANY /{_next+} | https://node.glif.link/{_next} |
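To make the routing easier to reason about, here is a minimal sketch that resolves a method and path to a destination using the endpoint list above. The resolve function is hypothetical and simplifies real API Gateway behavior: it assumes the catch-all /{_next+} applies only to paths that have no entry of their own, and it returns None when a listed path has no entry for the given method.

```python
from typing import Dict, Optional, Tuple

S3 = "https://circulatingsupply.s3.amazonaws.com"
S3_STAGING = "https://circulatingsupply-staging.s3.amazonaws.com"
ILB = "Internal Load Balancer"
MOCK = "Mock Endpoint"
NODE = "https://node.glif.link"

# (method, path) -> destination, copied from the endpoint list.
ROUTES: Dict[Tuple[str, str], str] = {
    ("GET", "/"): NODE,
    ("OPTIONS", "/"): MOCK,
    ("POST", "/"): ILB,
    ("GET", "/dilutedsupply"): f"{S3}/diluted_supply.html",
    ("GET", "/rpc/v0"): NODE,
    ("POST", "/rpc/v0"): ILB,
    ("OPTIONS", "/rpc/v0"): MOCK,
    ("GET", "/rpc/v1"): NODE,
    ("POST", "/rpc/v1"): ILB,
    ("OPTIONS", "/rpc/v1"): MOCK,
    ("GET", "/statecirculatingsupply"): ILB,
    ("GET", "/statecirculatingsupply/fil"): f"{S3}/index.html",
    ("GET", "/statecirculatingsupply/fil/v2"): f"{S3_STAGING}/index.html",
    ("GET", "/vmcirculatingsupply"): ILB,
}

KNOWN_PATHS = {path for _, path in ROUTES}


def resolve(method: str, path: str) -> Optional[str]:
    """Return the destination for a request, or None if the list has no entry."""
    destination = ROUTES.get((method.upper(), path))
    if destination is not None:
        return destination
    if path in KNOWN_PATHS:
        # The path is listed, but not for this method.
        return None
    # Catch-all: ANY /{_next+} forwards the remaining path to node.glif.link.
    return f"{NODE}/{path.lstrip('/')}"
```

For example, resolve("POST", "/rpc/v1") returns the Internal Load Balancer entry, while an unlisted path such as /some/page falls through to the catch-all.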

API Stages

Here's a short list of domain names that are pointing to API Gateways.

| Domain Name | API Gateway | Stage Name |
| --- | --- | --- |
| calibrationnet.glif.host | dev | calibration |
| mainnet.glif.host | mainnet | mainnet |
| api.node.glif.io | mainnet | mainnet |
| api.calibration.node.glif.io | dev | calibration |
| api.dev.node.glif.io | dev | dev |
| wallaby.node.glif.io | dev | wallaby |

Ingress rules

Development

| Hostname | Path | Match Type | Load Balancer | Namespace | Service |
| --- | --- | --- | --- | --- | --- |
| wallaby.filecoin.tools | / | Prefix | external | default | cid-checker-wallaby-frontend:80 |
| wallaby.filecoin.tools | /api/(.*) | Exact | external | default | cid-checker-wallaby-backend:3000 |
| wallaby.filecoin.tools | /docs | Exact | external | default | cid-checker-wallaby-backend:3000 |
| wallaby.filecoin.tools | /docs/(.*) | Exact | external | default | cid-checker-wallaby-backend:3000 |
| cid.wallaby.filecoin.tools | / | Prefix | external | default | cid-checker-wallaby-frontend:80 |
| cid.wallaby.filecoin.tools | /api/(.*) | Exact | external | default | cid-checker-wallaby-backend:3000 |
| cid.wallaby.filecoin.tools | /docs | Exact | external | default | cid-checker-wallaby-backend:3000 |
| cid.wallaby.filecoin.tools | /docs/(.*) | Exact | external | default | cid-checker-wallaby-backend:3000 |
| calibration.filecoin.tools | / | Prefix | external | default | cid-checker-calibration-frontend:80 |
| calibration.filecoin.tools | /api/(.*) | Exact | external | default | cid-checker-calibration-backend:3000 |
| calibration.filecoin.tools | /docs | Exact | external | default | cid-checker-calibration-backend:3000 |
| calibration.filecoin.tools | /docs/(.*) | Exact | external | default | cid-checker-calibration-backend:3000 |
| cid.calibration.filecoin.tools | / | Prefix | external | default | cid-checker-calibration-frontend:80 |
| cid.calibration.filecoin.tools | /api/(.*) | Exact | external | default | cid-checker-calibration-backend:3000 |
| cid.calibration.filecoin.tools | /docs | Exact | external | default | cid-checker-calibration-backend:3000 |
| cid.calibration.filecoin.tools | /docs/(.*) | Exact | external | default | cid-checker-calibration-backend:3000 |
| calibration.node.glif.io | /archive/lotus/(.*) | Exact | external | network | calibrationapi-archive-node-lotus:1234 |
| calibration.node.glif.io | /calibrationapi/ipfs/(.*) | Exact | external | network | calibrationapi-ipfs:4001 |
| calibration.node.glif.io | /calibrationapi/ipfs/(.*) | Exact | external | network | calibrationapi-ipfs:8080 |
| wallaby.dev.node.glif.io | /archive/lotus/(.*) | Exact | external | network | wallaby-archive-lotus:1234 |
| wss.dev.node.glif.io | /apigw/lotus/(.*) | Exact | external | network | calibrationapi-lotus |
| dev-internal.dev.node.glif.io | /calibrationapi/lotus/(.*) | Exact | internal | network | calibrationapi-lotus:1234 |
| dev-internal.dev.node.glif.io | /api-read-dev/cache/(.*) | Exact | internal | network | api-read-cache-dev:8080 |
| dev-internal.dev.node.glif.io | /wallaby/lotus/(.*) | Exact | internal | network | wallaby-archive-lotus:1234 |
| wss.wallaby.node.glif.io | /apigw/wallaby/(.*) | Exact | external | network | wallaby-archive-lotus:2346 |
| monitoring.dev.node.glif.io | / | Prefix | external | monitoring | monitoring-grafana:80 |

Production

| Hostname | Path | Match Type | Load Balancer | Namespace | Service |
| --- | --- | --- | --- | --- | --- |
| filecoin.tools | / | Prefix | external | default | cid-checker-mainnet-frontend:80 |
| filecoin.tools | /api/(.*) | Exact | external | default | cid-checker-mainnet-backend:3000 |
| filecoin.tools | /docs | Exact | external | default | cid-checker-mainnet-backend:3000 |
| filecoin.tools | /docs/(.*) | Exact | external | default | cid-checker-mainnet-backend:3000 |
| cid.filecoin.tools | / | Prefix | external | default | cid-checker-mainnet-frontend:80 |
| cid.filecoin.tools | /api/(.*) | Exact | external | default | cid-checker-mainnet-backend:3000 |
| cid.filecoin.tools | /docs | Exact | external | default | cid-checker-mainnet-backend:3000 |
| cid.filecoin.tools | /docs/(.*) | Exact | external | default | cid-checker-mainnet-backend:3000 |
| node.glif.io | /space06/lotus/(.*) | Exact | external | network | space06-lotus:1234 |
| node.glif.io | /space07/lotus/(.*) | Exact | external | network | space07-lotus:1234 |
| node.glif.io | /space06/cache/(.*) | Exact | external | network | space06-cache:8080 |
| node.glif.io | /space07/cache/(.*) | Exact | external | network | space07-cache:8080 |
| wss.node.glif.io | /apigw/lotus/(.*) | Exact | external | network | api-read-master-lotus:2346 |
| mainnet-internal.node.glif.io | /api-read/cache/(.*) | Exact | internal | network | api-read-v0-cache:8080 |
| monitoring.node.glif.io | / | Prefix | external | monitoring | monitoring-grafana:80 |

Backup policies

The current policies for the snapshots are listed here: dev, prod.

Snapshotting targets are as follows:

  • calibration-archive - create a snapshot once a week, keep the last one only.
  • space07 - create a snapshot once a day, keep the last 3 snapshots.
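The "keep the last N" rule for the two targets above can be illustrated with a short sketch. It does not reflect how the snapshot policies are actually implemented; the prune function and the RETENTION mapping are hypothetical names used only to show which snapshots survive.

```python
from datetime import datetime, timedelta
from typing import List, Tuple

# Number of snapshots to keep per target, taken from the list above.
RETENTION = {"calibration-archive": 1, "space07": 3}


def prune(snapshots: List[datetime], keep_last: int) -> Tuple[List[datetime], List[datetime]]:
    """Split snapshot timestamps into (kept, deleted), keeping the newest ones."""
    if keep_last < 1:
        raise ValueError("keep_last must be at least 1")
    newest_first = sorted(snapshots, reverse=True)
    return newest_first[:keep_last], newest_first[keep_last:]


# Five daily snapshots for space07: the three newest are kept.
start = datetime(2023, 1, 1)
daily = [start + timedelta(days=offset) for offset in range(5)]
kept, deleted = prune(daily, RETENTION["space07"])
```

With five daily snapshots, kept holds the three most recent timestamps and deleted holds the two oldest.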

Support tools

Among other tools, the repository uses:

Monitoring

Uptime Robot

Uptime Robot is used to monitor web-accessible endpoints.

A summary of the uptime is available here:

| Endpoint | Method | Payload |
| --- | --- | --- |
| https://api.node.glif.io/dilutedsupply | GET | |
| https://api.node.glif.io/statecirculatingsupply | GET | |
| https://api.node.glif.io/statecirculatingsupply/fil | GET | |
| https://api.node.glif.io/vmcirculatingsupply | | { "jsonrpc": "2.0", "method": "Filecoin.ChainHead", "params": [], "id": 5 } |
| https://api.calibration.node.glif.io | POST | { "jsonrpc": "2.0", "method": "Filecoin.ChainHead", "params": [], "id": 5 } |
| https://api.dev.node.glif.io | POST | { "jsonrpc": "2.0", "method": "Filecoin.ChainHead", "params": [], "id": 1 } |
| https://api.node.glif.io | POST | { "jsonrpc": "2.0", "method": "Filecoin.ChainHead", "params": [], "id": 5 } |
| https://filecoin.tools | | { "jsonrpc": "2.0", "method": "Filecoin.ChainHead", "params": [], "id": 5 } |
| https://cid.filecoin.tools | | { "jsonrpc": "2.0", "method": "Filecoin.ChainHead", "params": [], "id": 5 } |
| https://monitoring.dev.node.glif.io | | { "jsonrpc": "2.0", "method": "Filecoin.ChainHead", "params": [], "id": 5 } |
| https://node.glif.io/space06/lotus/rpc/v0 | POST | { "jsonrpc": "2.0", "method": "Filecoin.ChainHead", "params": [], "id": 5 } |
| https://node.glif.io/space07/lotus/rpc/v0 | POST | { "jsonrpc": "2.0", "method": "Filecoin.ChainHead", "params": [], "id": 5 } |
| https://monitoring.node.glif.io | | { "jsonrpc": "2.0", "method": "Filecoin.ChainHead", "params": [], "id": 5 } |
| https://wallaby.node.glif.io/rpc/v0 | POST | { "jsonrpc": "2.0", "method": "Filecoin.ChainHead", "params": [], "id": 5 } |
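The POST checks above can be reproduced by hand with a short script. This is a minimal sketch using only the Python standard library, not the Uptime Robot configuration itself; the function names are hypothetical, and treating a reply that has a "result" field and no "error" field as healthy is an assumption.

```python
import json
import urllib.request


def chain_head_payload(request_id: int = 5) -> dict:
    """Build the JSON-RPC body used by the POST checks."""
    return {
        "jsonrpc": "2.0",
        "method": "Filecoin.ChainHead",
        "params": [],
        "id": request_id,
    }


def build_request(url: str, request_id: int = 5) -> urllib.request.Request:
    """Prepare a POST request carrying the payload as JSON (nothing is sent yet)."""
    body = json.dumps(chain_head_payload(request_id)).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def probe(url: str, timeout: float = 10.0) -> bool:
    """Send the check and report whether the reply looks like a successful result."""
    with urllib.request.urlopen(build_request(url), timeout=timeout) as response:
        reply = json.load(response)
    return "result" in reply and "error" not in reply
```

Calling probe("https://api.node.glif.io") sends a single request; network errors and timeouts surface as exceptions rather than a False return value.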

Prometheus Stack

Prometheus settings are stored here: values.yaml. Custom Grafana dashboards are stored in the openworklabbot organization on Grafana and are retrieved by id. To update a custom dashboard, update its revision number in values.yaml.

There are a total of 5 custom dashboards:

  1. lotus/Kubernetes Persistent Volumes:
    • Volume Space Usage in percentage
  2. lotus/Lotus API Endpoints – contains information on methods usage, timeouts, etc.
  3. lotus/Lotus NGINX Ingress – deprecated.
  4. lotus/Lotus Node Health – contains information on node health, including:
    • node versions
    • peer numbers per node
    • network height per node
    • duration of validation per node
  5. lotus/Lotus PubSub – counts PubSub Message pool RPC requests per node

OpenSearch

After the installation, a few manual steps are required before OpenSearch can collect logs.

  1. Go to Settings > Roles and create a role with crud cluster permissions, and crud and create_indexes index permissions.
  2. Map the role to a user; as the username, use the ARN of the fluentbit role.
  3. Check that the indexes are created in Stack Management > Indices.

Atlantis

(Credentials for Atlantis are stored in Bitwarden.)

Workflow projects:
for the dev environment: k8s-dev, aws-dev, users
for the mainnet environment: k8s, aws

  • The plan does not start automatically on a new PR. To run the plan, set the {project} parameter:
    atlantis plan -p k8s-dev
  • The apply does not start automatically either and can be run only after approval. To run the apply, set the {project} parameter:
    atlantis apply -p k8s-dev
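Since every project needs its own comment, a small helper can list the comments to post on a PR for a given environment. This is only a sketch: the PROJECTS mapping is copied from the project list above, and the function name atlantis_comments is hypothetical.

```python
from typing import Dict, List

# Atlantis projects per environment, as listed in this section.
PROJECTS: Dict[str, List[str]] = {
    "dev": ["k8s-dev", "aws-dev", "users"],
    "mainnet": ["k8s", "aws"],
}


def atlantis_comments(environment: str, action: str) -> List[str]:
    """Return the PR comments that trigger a plan or apply for each project."""
    if action not in ("plan", "apply"):
        raise ValueError("action must be 'plan' or 'apply'")
    return [f"atlantis {action} -p {project}" for project in PROJECTS[environment]]


if __name__ == "__main__":
    for comment in atlantis_comments("dev", "plan"):
        print(comment)
```

The apply comments only take effect after the PR has been approved, as described above.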