AWS EKS Setup for PCI-DSS, SOC2, HIPAA
Kubespot is AWS EKS customized to add security postures around SOC2, HIPAA, and PCI compliance. It is distributed as an open source Terraform module, so you can run it within your own AWS account without lock-in. Kubespot has been developed over more than half a decade, evolving alongside AWS EKS and, before that, kops. It is in use at multiple startups that have scaled from a couple of founders in an apartment to billion-dollar unicorns. By using Kubespot they were able to meet the technical requirements for compliance while still deploying software quickly.
Kubespot is a light wrapper around AWS EKS. The primary changes included in Kubespot are:
- Locked down with security groups, private subnets, and other compliance-related requirements.
- Locked down RDS and ElastiCache if needed.
- All requests go through a single load balancer to reduce costs.
- KEDA is used for scaling on event metrics such as queue sizes, user requests, CPU, memory, or anything else KEDA supports.
- Karpenter is used for node autoscaling.
- Instances are locked down with encryption, and nodes are cycled on a regular schedule.

Install the command-line tools with Homebrew:

```shell
brew install kubectl kubernetes-helm awscli terraform
```

If your infrastructure uses the opsZero infrastructure-as-code template, access the resources as follows.

Add your IAM credentials to ~/.aws/credentials:

```ini
[profile_name]
aws_access_key_id=<key>
aws_secret_access_key=<secret_key>
region=us-west-2
```

Then generate a kubeconfig for the environment and check that the cluster responds:

```shell
cd environments/<nameofenv>
make kubeconfig
export KUBECONFIG=./kubeconfig # add to a .zshrc
kubectl get pods
```

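
The contents of the make kubeconfig target are not shown here. As a rough sketch, assuming it wraps the AWS CLI, the equivalent call would be aws eks update-kubeconfig. The helper below only prints that command so it can be reviewed first; the function name, the cluster name opszero, and the profile name are placeholders, not values taken from the template.

```shell
#!/usr/bin/env bash
# Hypothetical helper: builds the AWS CLI call that writes ./kubeconfig.
# Arguments: cluster name, AWS profile, region (defaults to us-west-2).
kubeconfig_cmd() {
  local cluster="$1" profile="$2" region="${3:-us-west-2}"
  printf 'aws eks update-kubeconfig --name %s --profile %s --region %s --kubeconfig ./kubeconfig\n' \
    "$cluster" "$profile" "$region"
}

# Print the command for review; pipe the output to `sh` to run it.
kubeconfig_cmd opszero profile_name
```

Printing rather than executing keeps the sketch safe to run on a machine that has no AWS credentials configured.
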
Kubespot uses Karpenter as the default autoscaler. To configure it, save a manifest like the one below as karpenter.yml and apply it:

```shell
kubectl apply -f karpenter.yml
```

```yaml
apiVersion: karpenter.sh/v1beta1
kind: NodePool
metadata:
  name: default
spec:
  template:
    spec:
      requirements:
        - key: "karpenter.k8s.aws/instance-category"
          operator: In
          values: ["t", "c", "m"]
        - key: "kubernetes.io/arch"
          operator: In
          values: ["amd64"]
        - key: "karpenter.k8s.aws/instance-cpu"
          operator: In
          values: ["1", "2", "4", "8", "16"]
        - key: "karpenter.k8s.aws/instance-hypervisor"
          operator: In
          values: ["nitro"]
        - key: karpenter.sh/capacity-type
          operator: In
          values: ["spot", "on-demand"]
      nodeClassRef:
        name: default
  disruption:
    consolidationPolicy: WhenUnderutilized
    expireAfter: 2h # node lifetime; for a 30-day cycle use 720h (30 * 24h)
---
apiVersion: karpenter.k8s.aws/v1beta1
kind: EC2NodeClass
metadata:
  name: default
spec:
  amiFamily: Bottlerocket # or AL2 (Amazon Linux 2)
  role: "Karpenter-opszero" # Set to match the name of the cluster
  subnetSelectorTerms:
    - tags:
        Name: opszero-public
  securityGroupSelectorTerms:
    - tags:
        Name: eks-cluster-sg-opszero-1249901478
```

If the AWS account has never launched Spot instances, create the EC2 Spot service-linked role so the spot capacity type can be used:

```shell
aws iam create-service-linked-role --aws-service-name spot.amazonaws.com
```

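
Creating the Spot service-linked role fails if the role already exists, which is awkward in setup scripts that run more than once. A minimal sketch of an idempotent wrapper follows. It assumes the role created for spot.amazonaws.com is named AWSServiceRoleForEC2Spot; the function name is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: create the EC2 Spot service-linked role only if missing.
# Prints "exists" or "created" so callers can log what happened.
ensure_spot_role() {
  if aws iam get-role --role-name AWSServiceRoleForEC2Spot >/dev/null 2>&1; then
    echo "exists"
  else
    aws iam create-service-linked-role --aws-service-name spot.amazonaws.com >/dev/null \
      && echo "created"
  fi
}

# Usage (requires AWS credentials with IAM permissions):
#   ensure_spot_role
```
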
Note: PodSecurityPolicy (PSP) was deprecated in Kubernetes 1.21 and removed in 1.25; the Pod Security admission controller is its replacement. The CIS Benchmark still refers to PSP, so we have mapped the PSP controls to their equivalents under the new standard.
Control | Recommendation | Level | Status | Description |
---|---|---|---|---|
1 | Control Plane Components | |||
2 | Control Plane Configuration | |||
2.1 | Logging | |||
2.1.1 | Enable audit logs | L1 | Active | cluster_logging is configured |
3 | Worker Nodes | |||
3.1 | Worker Node Configuration Files | |||
3.1.1 | Ensure that the kubeconfig file permissions are set to 644 or more restrictive | L1 | Won't Fix | Use NodeGroups or Fargate |
3.1.2 | Ensure that the kubelet kubeconfig file ownership is set to root:root | L1 | Won't Fix | Use NodeGroups or Fargate |
3.1.3 | Ensure that the kubelet configuration file has permissions set to 644 or more restrictive | L1 | Won't Fix | Use NodeGroups or Fargate |
3.1.4 | Ensure that the kubelet configuration file ownership is set to root:root | L1 | Won't Fix | Use NodeGroups or Fargate |
3.2 | Kubelet | |||
3.2.1 | Ensure that the Anonymous Auth is Not Enabled | L1 | Won't Fix | Use NodeGroups or Fargate |
3.2.2 | Ensure that the --authorization-mode argument is not set to AlwaysAllow | L1 | Won't Fix | Use NodeGroups or Fargate |
3.2.3 | Ensure that a Client CA File is Configured | L1 | Won't Fix | Use NodeGroups or Fargate |
3.2.4 | Ensure that the --read-only-port is disabled | L1 | Won't Fix | Use NodeGroups or Fargate |
3.2.5 | Ensure that the --streaming-connection-idle-timeout argument is not set to 0 | L1 | Won't Fix | Use NodeGroups or Fargate |
3.2.6 | Ensure that the --protect-kernel-defaults argument is set to true | L1 | Won't Fix | Use NodeGroups or Fargate |
3.2.7 | Ensure that the --make-iptables-util-chains argument is set to true | L1 | Won't Fix | Use NodeGroups or Fargate |
3.2.8 | Ensure that the --hostname-override argument is not set | L1 | Won't Fix | Use NodeGroups or Fargate |
3.2.9 | Ensure that the --eventRecordQPS argument is set to 0 or a level which ensures appropriate event capture | L2 | Won't Fix | Use NodeGroups or Fargate |
3.2.10 | Ensure that the --rotate-certificates argument is not present or is set to true | L1 | Won't Fix | Use NodeGroups or Fargate |
3.2.11 | Ensure that the RotateKubeletServerCertificate argument is set to true | L1 | Won't Fix | Use NodeGroups or Fargate |
3.3 | Container Optimized OS | |||
3.3.1 | Prefer using a container-optimized OS when possible | L2 | Active | Bottlerocket ContainerOS is used. |
4 | Policies | |||
4.1 | RBAC and Service Accounts | |||
4.1.1 | Ensure that the cluster-admin role is only used where required | L1 | Active | Default Configuration |
4.1.2 | Minimize access to secrets | L1 | Active | iam_roles pass limited RBAC |
4.1.3 | Minimize wildcard use in Roles and ClusterRoles | L1 | Manual | terraform-kubernetes-rbac Set role |
4.1.4 | Minimize access to create pods | L1 | Manual | terraform-kubernetes-rbac Limit role with pod create |
4.1.5 | Ensure that default service accounts are not actively used | L1 | Manual | kubectl patch serviceaccount default -p $'automountServiceAccountToken: false' |
4.1.6 | Ensure that Service Account Tokens are only mounted where necessary | L1 | Active | tiphys Default set to false |
4.1.7 | Avoid use of system:masters group | L1 | Active | Must manually add users and roles to system:masters |
4.1.8 | Limit use of the Bind, Impersonate and Escalate permissions in the Kubernetes cluster | L1 | Manual | Limit users with system:masters role |
4.2 | Pod Security Policies | |||
4.2.1 | Minimize the admission of privileged containers | L1 | Active | tiphys defaultSecurityContext.allowPrivilegeEscalation=false |
4.2.2 | Minimize the admission of containers wishing to share the host process ID namespace | L1 | Active | tiphys hostPID defaults to false |
4.2.3 | Minimize the admission of containers wishing to share the host IPC namespace | L1 | Active | tiphys hostIPC defaults to false |
4.2.4 | Minimize the admission of containers wishing to share the host network namespace | L1 | Active | tiphys hostNetwork defaults to false |
4.2.5 | Minimize the admission of containers with allowPrivilegeEscalation | L1 | Active | tiphys defaultSecurityContext.allowPrivilegeEscalation=false |
4.2.6 | Minimize the admission of root containers | L2 | Active | tiphys defaultSecurityContext.[runAsNonRoot=true,runAsUser=1001] |
4.2.7 | Minimize the admission of containers with added capabilities | L1 | Active | tiphys defaultSecurityContext.allowPrivilegeEscalation=false |
4.2.8 | Minimize the admission of containers with capabilities assigned | L1 | Active | tiphys defaultSecurityContext.capabilities.drop: ALL |
4.3 | CNI Plugin | |||
4.3.1 | Ensure CNI plugin supports network policies. | L1 | Manual | calico_enabled=true |
4.3.2 | Ensure that all Namespaces have Network Policies defined | L1 | Manual | Add Network Policy manually |
4.4 | Secrets Management | |||
4.4.1 | Prefer using secrets as files over secrets as environment variables | L2 | Active | tiphys writes secrets to file |
4.4.2 | Consider external secret storage | L2 | Manual | Pull secrets using AWS Secrets Manager. |
4.5 | Extensible Admission Control | |||
4.6 | General Policies | |||
4.6.1 | Create administrative boundaries between resources using namespaces | L1 | Manual | tiphys deploy on different namespace |
4.6.2 | Apply Security Context to Your Pods and Containers | L2 | Active | tiphys defaultSecurityContext is set |
4.6.3 | The default namespace should not be used | L2 | Active | tiphys select namespace |
5 | Managed services | |||
5.1 | Image Registry and Image Scanning | |||
5.1.1 | Ensure Image Vulnerability Scanning using Amazon ECR image scanning or a third party provider | L1 | Active | Example |
5.1.2 | Minimize user access to Amazon ECR | L1 | Active | terraform-aws-mrmgr |
5.1.3 | Minimize cluster access to read-only for Amazon ECR | L1 | Active | terraform-aws-mrmgr with OIDC |
5.1.4 | Minimize Container Registries to only those approved | L2 | Active | terraform-aws-mrmgr |
5.2 | Identity and Access Management (IAM) | |||
5.2.1 | Prefer using dedicated EKS Service Accounts | L1 | Active | terraform-aws-mrmgr with OIDC |
5.3 | AWS EKS Key Management Service | |||
5.3.1 | Ensure Kubernetes Secrets are encrypted using Customer Master Keys (CMKs) managed in AWS KMS | L1 | Active | |
5.4 | Cluster Networking | |||
5.4.1 | Restrict Access to the Control Plane Endpoint | L1 | Active | Set cluster_public_access_cidrs |
5.4.2 | Ensure clusters are created with Private Endpoint Enabled and Public Access Disabled | L2 | Active | Set cluster_private_access = true and cluster_public_access = false |
5.4.3 | Ensure clusters are created with Private Nodes | L1 | Active | Set nat_enabled = true and set nodes_in_public_subnet = false |
5.4.4 | Ensure Network Policy is Enabled and set as appropriate | L1 | Manual | calico_enabled=true |
5.4.5 | Encrypt traffic to HTTPS load balancers with TLS certificates | L2 | Active | terraform-helm-kubespot |
5.5 | Authentication and Authorization | |||
5.5.1 | Manage Kubernetes RBAC users with AWS IAM Authenticator for Kubernetes | L2 | Active | iam_users use AWS IAM Authenticator |
5.6 | Other Cluster Configurations | |||
5.6.1 | Consider Fargate for running untrusted workloads | L1 | Active | Set the fargate_selector |
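
Control 4.1.5 above is marked Manual and gives a kubectl patch for a single default service account. The sketch below applies the same setting in every namespace. The function name is hypothetical, and it assumes your kubeconfig user is allowed to list namespaces and patch service accounts.

```shell
#!/usr/bin/env bash
# Hypothetical helper: disable token automounting on the default
# service account of every namespace (CIS 4.1.5).
disable_default_sa_automount() {
  local ns
  for ns in $(kubectl get namespaces -o jsonpath='{.items[*].metadata.name}'); do
    kubectl patch serviceaccount default -n "$ns" \
      -p '{"automountServiceAccountToken": false}'
  done
}

# Usage (requires cluster access):
#   disable_default_sa_automount
```

Namespaces created later are not covered, so the helper would need to be rerun, or the setting enforced by policy, as new namespaces appear.
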
Name | Version |
---|---|
aws | n/a |
helm | n/a |
http | n/a |
kubernetes | n/a |
null | n/a |
tls | n/a |
Name | Description | Type | Default | Required |
---|---|---|---|---|
access_policies | Access policies | list | [] | no |
alb_controller_version | The chart version of the ALB controller helm chart | string | "1.4.4" | no |
asg_nodes | Map of ASG node configurations | map(object({ | {} | no |
aws_load_balancer_controller_enabled | Enable ALB controller by default | bool | true | no |
calico_enabled | Whether calico add-on is installed | bool | false | no |
calico_version | The version of the calico helm chart | string | "v3.26.1" | no |
cidr_block | The CIDR block used by the VPC | string | "10.2.0.0/16" | no |
cidr_block_private_subnet | The CIDR block used by the private subnet | list | [ | no |
cidr_block_public_subnet | The CIDR block used by the public subnet | list | [ | no |
cloudwatch_pod_logs_enabled | Stream EKS pod logs to CloudWatch | bool | false | no |
cloudwatch_retention_in_days | How long to keep CloudWatch logs in days | number | 30 | no |
cluster_authentication_mode | Desired Kubernetes authentication. API or API_AND_CONFIG_MAP | string | "API" | no |
cluster_encryption_config | Cluster Encryption Config Resources to encrypt, e.g. ['secrets'] | list(any) | [ | no |
cluster_kms_policy | Cluster Encryption Config KMS Key Resource argument - key policy | string | null | no |
cluster_logging | List of the desired control plane logging to enable. https://docs.aws.amazon.com/eks/latest/userguide/control-plane-logs.html | list | [ | no |
cluster_private_access | Whether the Amazon EKS private API server endpoint is enabled | bool | true | no |
cluster_public_access | Whether the Amazon EKS public API server endpoint is enabled | bool | true | no |
cluster_public_access_cidrs | List of CIDR blocks. Indicates which CIDR blocks can access the Amazon EKS public API server endpoint when enabled | list | [ | no |
cluster_version | Desired Kubernetes master version | string | "1.30" | no |
csi_enabled_namespaces | n/a | list(string) | [] | no |
csi_secrets_store_enabled | Specify whether the CSI driver is enabled on the EKS cluster | bool | false | no |
csi_secrets_store_version | The version of the CSI store helm chart | string | "1.4.6" | no |
efs_enabled | Specify whether the EFS is enabled on the EKS cluster | bool | false | no |
eips | List of Elastic IPs | list | [] | no |
enable_egress_only_internet_gateway | Create an egress-only Internet gateway for your VPC | bool | false | no |
enable_ipv6 | Enable an Amazon-provided IPv6 CIDR block with a /56 prefix length for the VPC | bool | false | no |
environment_name | Name of the environment to create AWS resources | string | n/a | yes |
fargate_selector | Terraform object to create the EKS fargate profiles | map | { | no |
iam_roles | Terraform object of the IAM roles | map | {} | no |
iam_users | List of IAM users | list | [] | no |
karpenter_ami_family | AMI family to use for the EC2 Node Class. Possible values: AL2 or Bottlerocket | string | "Bottlerocket" | no |
karpenter_enabled | Specify whether Karpenter is enabled | bool | false | no |
karpenter_version | The version of the karpenter helm chart | string | "1.0.1" | no |
metrics_server_version | The version of the metric server helm chart | string | "3.11.0" | no |
nat_enabled | Whether the NAT gateway is enabled | bool | true | no |
node_group_cpu_threshold | The value of the CPU threshold | string | "70" | no |
node_groups | Terraform object to create the EKS node groups | map | {} | no |
node_role_policies | A list of the ARNs of the policies you want to attach | list | [] | no |
redis_enabled | Whether the redis cluster is enabled | bool | false | no |
redis_engine_version | Version number of the cache engine to be used for the cache clusters in this replication group | string | "7.1" | no |
redis_node_type | Instance class of the redis cluster to be used | string | "cache.t4g.micro" | no |
redis_num_nodes | Number of nodes for redis | number | 1 | no |
s3_csi_bucket_names | The name of the S3 bucket for the CSI driver | list(string) | [ | no |
s3_csi_driver_enabled | Enable or disable the S3 CSI driver | bool | false | no |
sql_cluster_enabled | Whether the sql cluster is enabled | bool | false | no |
sql_cluster_monitoring_interval | Monitoring Interval for SQL Cluster | any | null | no |
sql_cluster_monitoring_role_arn | The ARN for the IAM role that permits RDS to send enhanced monitoring metrics to CloudWatch Logs | any | null | no |
sql_database_name | The name of the database to create when the DB instance is created | string | "" | no |
sql_encrypted | Specify whether the DB instance is encrypted | bool | true | no |
sql_engine | The name of the database engine to be used for this DB cluster | string | "aurora-postgresql" | no |
sql_engine_mode | The database engine mode | string | "provisioned" | no |
sql_engine_version | The SQL engine version to use | string | "15.3" | no |
sql_iam_auth_enabled | Specifies whether or not mappings of IAM accounts to database accounts is enabled | bool | true | no |
sql_identifier | The name of the database | string | "" | no |
sql_instance_allocated_storage | The allocated storage in gibibytes | number | 20 | no |
sql_instance_class | The instance type of the RDS instance. | string | "db.t4g.micro" | no |
sql_instance_enabled | Whether the sql instance is enabled | bool | false | no |
sql_instance_engine | The database engine to use | string | "postgres" | no |
sql_instance_max_allocated_storage | The upper limit to which Amazon RDS can automatically scale the storage of the DB instance | number | 200 | no |
sql_master_password | Password for the master DB user | string | "" | no |
sql_master_username | Username for the master DB user | string | "" | no |
sql_node_count | The number of instances to be used for this DB cluster | number | 0 | no |
sql_parameter_group_name | Name of the DB parameter group to associate | string | "" | no |
sql_performance_insights_enabled | Specifies whether Performance Insights are enabled. Defaults to false | bool | false | no |
sql_rds_multi_az | Specify whether multi-AZ is enabled for the RDS instance | bool | false | no |
sql_serverless_seconds_until_auto_pause | The time, in seconds, before the DB cluster in serverless mode is paused | number | 300 | no |
sql_skip_final_snapshot | Determines whether a final DB snapshot is created before the DB instance is deleted. | bool | false | no |
sql_storage_type | The allocated storage type for DB Instance | string | "gp3" | no |
sql_subnet_group_include_public | Include public subnets as part of the cluster's subnet configuration. | bool | false | no |
tags | Terraform map to create custom tags for the AWS resources | map | {} | no |
vpc_flow_logs_enabled | Specify whether the vpc flow log is enabled | bool | false | no |
zones | AZs for the subnets | list | [ | no |
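
Several of the inputs above correspond to controls in the CIS table, for example cluster_private_access and cluster_public_access for 5.4.2 and calico_enabled for 4.3.1. As an illustration only, the sketch below writes a terraform.tfvars with those inputs set. It assumes your root module declares variables with the same names and forwards them to Kubespot; the function name and file layout are hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: write a tfvars file with compliance-related inputs.
# Arguments: environment name, output path (defaults to terraform.tfvars).
write_compliance_tfvars() {
  local env_name="$1" out="${2:-terraform.tfvars}"
  cat > "$out" <<EOF
environment_name       = "${env_name}"
cluster_private_access = true   # CIS 5.4.2
cluster_public_access  = false  # CIS 5.4.2
calico_enabled         = true   # CIS 4.3.1 and 5.4.4
karpenter_enabled      = true
vpc_flow_logs_enabled  = true
sql_encrypted          = true
EOF
}

# Usage:
#   write_compliance_tfvars <nameofenv>
#   terraform plan -var-file=terraform.tfvars
```

With cluster_public_access set to false the API server is reachable only from inside the VPC, so plan for VPN or bastion access before applying.
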
Name | Description |
---|---|
eks_cluster | n/a |
eks_cluster_oidc_provider_arn | n/a |
eks_cluster_token | n/a |
internet_gateway_id | n/a |
nat_gateway_ids | n/a |
node_role | n/a |
node_security_group_id | n/a |
private_route_table | n/a |
private_subnet_ids | n/a |
public_route_table | n/a |
public_subnet_ids | n/a |
vpc_id | n/a |
Since 2016, opsZero has been providing Kubernetes expertise to companies of all sizes on any cloud. With a focus on AI and compliance, we have seen it all: whether it is SOC2, HIPAA, PCI-DSS, ITAR, FedRAMP, or CMMC, we have you and your customers covered.
We provide support to organizations in the following ways:
- Modernize or Migrate to Kubernetes
- Cloud Infrastructure with Kubernetes on AWS, Azure, Google Cloud, or Bare Metal
- Building AI and Data Pipelines on Kubernetes
- Optimizing Existing Kubernetes Workloads
We do this with a high-touch support model where you:
- Get access to us on Slack, Microsoft Teams or Email
- Get 24/7 coverage of your infrastructure
- Get an accelerated migration to Kubernetes
Please schedule a call if you need support.