This repository provisions resources on AWS, preparing them for a deployment of the application on an EKS cluster.

Prerequisites:

- An AWS account, preferably a new, isolated one.
- Terraform >= 1.4.6
- A customer contract with Datafold; the application does not work without credentials supplied by sales.
- Access to our public helm-charts repository
This deployment will create the following resources:

- AWS VPC
- AWS subnets
- AWS S3 bucket for ClickHouse backups
- AWS external load balancer
- AWS ACM certificate, unless preregistered and provided
- Three EBS volumes for local data storage
- AWS RDS Postgres database
- An EKS cluster
- Service accounts for the EKS cluster to perform actions outside of its cluster boundary:
  - Provisioning existing EBS volumes
  - Updating the load balancer target group to point to specific pods in the cluster
  - Rescaling the node group between 1 and 2 nodes

Notes:

- This module will not provision DNS names in your zone.
- See the example directory for a potential setup, which has dependencies on our helm-charts repository.
Create the bucket and DynamoDB table for the Terraform state file:

- Use the files in `bootstrap` to create a Terraform state bucket and a DynamoDB lock table.
- Run `./run_bootstrap.sh` to create them. Enter the `deployment_name` when prompted.
  - The `deployment_name` is important: it is used for the k8s namespace, the Datadog unified logging tags, and in other places.
  - Suggestion: `company-datafold`
- Transfer the name of that bucket and table into `backend.hcl` (symlinked into both `infra` and `application`), as sketched below.
- Set the `target_account_profile` and `region` where the bucket / table are stored.
- `backend.hcl` only determines where the Terraform state file is located.
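For reference, a minimal `backend.hcl` sketch is shown below. All values are placeholders; substitute the bucket, lock table, region, and profile created or chosen during the bootstrap.

```hcl
# backend.hcl -- partial S3 backend configuration, passed to `terraform init -backend-config=../backend.hcl`.
# All values below are placeholders; use the bucket and lock table created by ./run_bootstrap.sh.
bucket         = "company-datafold-terraform-state"
dynamodb_table = "company-datafold-terraform-locks"
region         = "us-east-1"
profile        = "target_account_profile"
encrypt        = true
```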
The `example` directory contains a single deployment example, which cleanly separates the underlying runtime infra from the application deployment into Kubernetes. Some specific elements from the `infra` directory are copied and encrypted into the `application` directory.
Setting up the infrastructure:

- It is easiest if you have full admin access in the target project.
- Pre-create the ACM certificate you want to use on AWS and validate it in your DNS.
- Pre-create a symmetric encryption key that is used to encrypt/decrypt the secrets of this deployment.
  - Use the alias instead of the `mrk` link. Put that into `locals.tf`.
- Refer to that certificate in `main.tf` using its domain name (replace "datafold.acme.com").
- Change the settings in `locals.tf` (the versions in `infra` and `application` are symlinked); see the sketch after this list:
  - `provider_region` = the region you want to deploy in.
  - `aws_profile` = the profile you want to use to issue the deployments. Targets the deployment account.
  - `kms_profile` = can be the same profile, unless you want the encryption key elsewhere.
  - `kms_key` = a pre-created symmetric KMS key. Its only purpose is the encryption/decryption of deployment secrets.
  - `deployment_name` = the name of the deployment, used in the Kubernetes namespace, container naming, and the Datadog "deployment" Unified Tag.
- Run `terraform init -backend-config=../backend.hcl` in both the `application` and `infra` directories.
- Our team will reach out to give you two secrets files:
  - `application_secrets.yaml` goes into the `application` directory.
  - `infra_secrets.yaml` goes into the `infra` directory.
  - Encrypt both files with sops and call both `secrets.yaml`.
- Run `terraform apply` in the `infra` directory. This should complete ok.
  - Check in the console if you see the load balancer, the EKS cluster, etc.
- Run `terraform apply` in the `application` directory.
  - Check the settings made in the `main.tf` file. Maybe you want to set "datadog.install" to `false`.
  - Check with your favourite Kubernetes tool if you see the namespace and several Datafold pods running there.
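A minimal sketch of the `locals.tf` settings described above, assuming the structure used in the example directory; every value is a placeholder. The pre-created certificate is ultimately referenced through the `alb_certificate_domain` input listed further below.

```hcl
# locals.tf -- sketch only; all values are placeholders to be replaced with your own.
locals {
  provider_region = "us-east-1"                    # region to deploy in
  aws_profile     = "company-deployment"           # profile used to issue the deployments
  kms_profile     = "company-deployment"           # may differ if the KMS key lives elsewhere
  kms_key         = "alias/company-datafold-sops"  # alias (not the mrk link) of the symmetric KMS key
  deployment_name = "company-datafold"             # k8s namespace, container naming, Datadog "deployment" tag
}
```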
The module deploys in two availability zones by default, because the private and public subnet variables each default to a list of two CIDR ranges.

Which AZs receive the subnets depends on which AZs get selected and in which order. The ordering is alphabetical; in us-east-1 this could be as many as six AZs. The module sorts the AZs and then iteratively deploys a public / private subnet, specifying its AZ. Thus:

- [10.0.0.0/24] will get deployed in us-east-1a
- [10.0.1.0/24] will get deployed in us-east-1b

To deploy to three AZs, override the public/private subnet settings. The module will then iterate across three elements, while the default ordering of the AZs stays the same.

You can add an exclusion list of AZ IDs. The AZ ID is not the same as the AZ name: AWS shuffles the mapping between AZ names and physical locations per account. This means that your us-east-1a might be use1-az1 in your account but use1-az4 in another account. So if you need to match AZs across accounts, match availability zone IDs, not availability zone names. The AZ IDs are visible in the EC2 console under "Settings", which lists the enabled AZs with their ID and name.

To select particular AZ IDs, exclude the ones you do not want via the az_id_exclude_filter (a list); see the sketch below. That way you can restrict the selection to only the AZs you want. Unfortunately it is an exclude filter rather than an include filter, which means that if AWS adds additional AZs, a future AZ could end up being selected and cause replacements.

The good news is that once letters are in use, they are expected to stay mapped to the same AZ ID for an existing account; only new accounts get the mapping shuffled again. So from a Terraform state perspective, things should at least remain consistent.
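A tfvars-style sketch of how three AZs might be covered, assuming the subnet inputs from the table further below; the `az_id_exclude_filter` name comes from the prose above, and its exact location may differ in your version of the module. All CIDRs and AZ IDs are placeholders.

```hcl
# terraform.tfvars -- hypothetical values; adjust CIDRs and AZ IDs to your account.
vpc_public_subnets  = ["10.0.100.0/24", "10.0.101.0/24", "10.0.102.0/24"]  # one public CIDR per AZ
vpc_private_subnets = ["10.0.0.0/24", "10.0.1.0/24", "10.0.2.0/24"]        # one private CIDR per AZ

# Exclude the AZ IDs you do not want; the remaining AZs (sorted by name) receive the subnets in order.
az_id_exclude_filter = ["use1-az3"]
```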
Requirements:

| Name | Version |
|------|---------|
| aws | >= 4.8.0 |
| dns | 3.2.1 |
Providers:

| Name | Version |
|------|---------|
| aws | >= 4.8.0 |
| random | n/a |
Modules:

| Name | Source | Version |
|------|--------|---------|
| clickhouse_backup | ./modules/clickhouse_backup | n/a |
| database | ./modules/database | n/a |
| eks | ./modules/eks | n/a |
| load_balancer | ./modules/load_balancer | n/a |
| networking | ./modules/networking | n/a |
| security | ./modules/security | n/a |
Inputs:

| Name | Description | Type | Default | Required |
|------|-------------|------|---------|----------|
| alb_certificate_domain | Pass a domain name like example.com to this variable in order to enable ALB HTTPS listeners. Terraform will try to find an AWS certificate that is issued and matches the requested domain, so please make sure that you have already issued a certificate for that domain. | string | n/a | yes |
| apply_major_upgrade | Sets the flag to allow AWS to apply major upgrades on the maintenance plan schedule. | bool | false | no |
| aws_auth_accounts | List of account maps to add to the aws-auth configmap | list(any) | [] | no |
| aws_auth_users | List of user maps to add to the aws-auth configmap | list(any) | [] | no |
| backend_app_port | The target port to use for the backend services | number | 80 | no |
| clickhouse_data_size | EBS volume size for ClickHouse data in GB | number | 40 | no |
| clickhouse_logs_size | EBS volume size for ClickHouse logs in GB | number | 40 | no |
| clickhouse_s3_bucket | Bucket where ClickHouse backups are stored | string | "clickhouse-backups-abcguo23" | no |
| create_aws_auth_configmap | Whether to create the AWS authentication configmap | bool | false | no |
| create_rds_kms_key | Set to true to create a separate KMS key (recommended). | bool | true | no |
| create_ssl_cert | Creates an SSL certificate if set. | bool | n/a | yes |
| database_name | RDS database name | string | "datafold" | no |
| db_instance_tags | The extra tags to be applied to the RDS instance. | map(any) | {} | no |
| db_parameter_group_tags | The extra tags to be applied to the parameter group | map(any) | {} | no |
| db_subnet_group_tags | The extra tags to be applied to the DB subnet group | map(any) | {} | no |
| default_node_disk_size | Disk size for a node in GB | number | 40 | no |
| deploy_vpc_flow_logs | Activates the VPC flow logs if set. | bool | false | no |
| deployment_name | Name of the current deployment. | string | n/a | yes |
| dhcp_options_domain_name | Specifies DNS name for DHCP options set | string | "" | no |
| dhcp_options_domain_name_servers | Specify a list of DNS server addresses for DHCP options set | list(string) | [...] | no |
| dhcp_options_tags | Tags applied to the DHCP options set. | map(string) | {} | no |
| dns_egress_cidrs | List of Internet addresses to which the application has access | list(string) | [] | no |
| ebs_extra_tags | The extra tags to be applied to the EBS volumes | map(any) | {} | no |
| ebs_iops | IOPS of EBS volume | number | 3000 | no |
| ebs_throughput | Throughput of EBS volume | number | 1000 | no |
| ebs_type | Type of EBS volume | string | "gp3" | no |
| enable_dhcp_options | Flag to use custom DHCP options for DNS resolution. | bool | false | no |
| environment | Global environment tag to apply on all Datadog logs, metrics, etc. | string | n/a | yes |
| host_override | Overrides the default domain name used to send links in invite emails and page links. Useful if the application is behind Cloudflare, for example. | string | "" | no |
| ingress_enable_http_sg | Whether regular HTTP traffic should be allowed to access the load balancer | bool | false | no |
| k8s_cluster_version | Ref. https://docs.aws.amazon.com/eks/latest/userguide/kubernetes-versions.html | string | "1.29" | no |
| k8s_module_version | EKS terraform module version | string | "~> 19.7" | no |
| lb_idle_timeout | The time in seconds that the connection is allowed to be idle. | number | 120 | no |
| lb_internal | Set to true to make the load balancer internal and not exposed to the internet. | bool | false | no |
| manage_aws_auth_configmap | Determines whether to manage the aws-auth configmap | bool | false | no |
| managed_node_grp | Ref. https://registry.terraform.io/modules/terraform-aws-modules/eks/aws/latest/submodules/eks-managed-node-group | any | n/a | yes |
| managed_node_grp_default | Ref. https://github.com/awslabs/amazon-eks-ami/blob/master/files/eni-max-pods.txt | list(any) | [] | no |
| nat_gateway_public_ip | Public IP of the NAT gateway when reusing the NAT gateway instead of recreating it | string | "" | no |
| private_subnet_tags | The extra tags to be applied to the private subnets | map(any) | {} | no |
| propagate_intra_route_tables_vgw | If intra subnets should propagate traffic. | bool | false | no |
| propagate_private_route_tables_vgw | If private subnets should propagate traffic. | bool | false | no |
| propagate_public_route_tables_vgw | If public subnets should propagate traffic. | bool | false | no |
| provider_azs | List of availability zones to consider. If empty, the modules will determine this dynamically. | list(string) | [] | no |
| provider_region | The AWS region in which the infrastructure should be deployed | string | n/a | yes |
| public_subnet_tags | The extra tags to be applied to the public subnets | map(any) | {} | no |
| rds_allocated_storage | The size of RDS allocated storage in GB | number | 20 | no |
| rds_backups_replication_retention_period | RDS backup replication retention period | number | 14 | no |
| rds_backups_replication_target_region | RDS backup replication target region | string | null | no |
| rds_extra_tags | The extra tags to be applied to the RDS instance | map(any) | {} | no |
| rds_instance | EC2 instance type for the PostgreSQL RDS database. Available instance groups: t3, m4, m5. Available instance classes: medium and higher. | string | "db.t3.medium" | no |
| rds_kms_key_alias | RDS KMS key alias. | string | "datafold-rds" | no |
| rds_max_allocated_storage | The upper limit the database can grow to in GB | number | 100 | no |
| rds_param_group_family | The DB parameter group family to use | string | "postgres15" | no |
| rds_port | Port the RDS database should be listening on. | number | 5432 | no |
| rds_ro_username | RDS read-only user name (not currently used). | string | "datafold_ro" | no |
| rds_username | Overrides the default RDS user name that is provisioned. | string | "datafold" | no |
| rds_version | Postgres RDS version to use. | string | "15.5" | no |
| redis_data_size | Redis EBS volume size in GB | number | 10 | no |
| s3_clickhouse_backup_tags | The extra tags to be applied to the S3 ClickHouse backup bucket | map(any) | {} | no |
| self_managed_node_grp | Ref. https://registry.terraform.io/modules/terraform-aws-modules/eks/aws/latest/submodules/self-managed-node-group | any | {} | no |
| self_managed_node_grp_default | Configuration for the self managed node group | any | {} | no |
| self_managed_node_grp_instance_type | Ref. https://github.com/awslabs/amazon-eks-ami/blob/master/files/eni-max-pods.txt | string | "THe instance type for the self managed node group." | no |
| sg_tags | The extra tags to be applied to the security group | map(any) | {} | no |
| tags | Tags to apply to the general module | any | {} | no |
| use_default_rds_kms_key | Flag whether or not to use the default RDS KMS encryption key. Not recommended. | bool | false | no |
| vpc_cidr | The CIDR of the new VPC, if the vpc_id is not set | string | "10.0.0.0/16" | no |
| vpc_id | The VPC ID of an existing VPC to deploy the cluster in. Creates a new VPC if not set. | string | "" | no |
| vpc_private_subnets | The private subnet CIDR ranges when a new VPC is created. | list(string) | [...] | no |
| vpc_propagating_vgws | IDs of virtual private gateways to propagate. | list(any) | [] | no |
| vpc_public_subnets | The public network CIDR ranges | list(string) | [...] | no |
| vpc_tags | The extra tags to be applied to the VPC | map(any) | {} | no |
| vpc_vpn_gateway_id | ID of the VPN gateway to attach to the VPC | string | "" | no |
| whitelisted_egress_cidrs | List of Internet addresses the application can access going outside | list(string) | n/a | yes |
| whitelisted_ingress_cidrs | List of CIDRs that can pass through the load balancer | list(string) | n/a | yes |
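To tie the required inputs from the table above together, a hypothetical minimal invocation of the module; the block name, source path, and all values are placeholders, and the example directory remains the authoritative reference.

```hcl
# Hypothetical minimal root-module invocation showing only the required inputs.
module "datafold" {
  source = "../.."   # placeholder path; use the source from the example directory

  deployment_name        = "company-datafold"
  provider_region        = "us-east-1"
  environment            = "production"
  alb_certificate_domain = "datafold.acme.com"        # must match a pre-issued, DNS-validated ACM certificate
  create_ssl_cert        = false                       # assuming the certificate was pre-created

  whitelisted_ingress_cidrs = ["203.0.113.0/24"]       # CIDRs allowed through the load balancer
  whitelisted_egress_cidrs  = ["0.0.0.0/0"]            # Internet addresses the application may reach

  # Managed node group configuration, passed through to the EKS submodule
  # (see the reference link in the Inputs table).
  managed_node_grp = {}
}
```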
Outputs:

| Name | Description |
|------|-------------|
| clickhouse_access_key | The access key of the IAM user doing the ClickHouse backups. |
| clickhouse_data_size | The size in GB of the ClickHouse EBS data volume |
| clickhouse_data_volume_id | The EBS volume ID where ClickHouse data will be stored. |
| clickhouse_logs_size | The size in GB of the ClickHouse EBS logs volume |
| clickhouse_logs_volume_id | The EBS volume ID where ClickHouse logs will be stored. |
| clickhouse_password | The generated ClickHouse password to be used in the application deployment |
| clickhouse_s3_bucket | The location of the S3 bucket where ClickHouse backups are stored |
| clickhouse_s3_region | The region where the S3 bucket is created |
| clickhouse_secret_key | The secret key of the IAM user doing the ClickHouse backups. |
| cloud_provider | A string describing the type of cloud provider to be passed onto the helm charts |
| cluster_name | The name of the EKS cluster |
| cluster_scaler_role_arn | The ARN of the role that is able to scale the EKS cluster nodes. |
| db_instance_id | The ID of the RDS database instance |
| deployment_name | The name of the deployment |
| domain_name | The domain name to be used in DNS configuration |
| k8s_load_balancer_controller_role_arn | The ARN of the role provisioned so the k8s cluster can edit the target group through the AWS load balancer controller. |
| lb_name | The name of the external load balancer |
| load_balancer_ips | The load balancer IP when it was provisioned. |
| postgres_database_name | The name of the pre-provisioned database. |
| postgres_host | The DNS name for the postgres database |
| postgres_password | The generated postgres password to be used by the application |
| postgres_port | The port configured for the RDS database |
| postgres_username | The postgres username to be used by the application |
| redis_data_size | The size in GB of the Redis data volume. |
| redis_data_volume_id | The EBS volume ID of the Redis data volume. |
| redis_password | The generated redis password to be used in the application deployment |
| security_group_id | The security group ID managing ingress from the load balancer |
| target_group_arn | The ARN of the target group where the pods need to be registered as targets. |
| vpc_cidr | The CIDR of the entire VPC |