Terraform IaC to deploy an AWS EKS cluster with Karpenter autoscaling and ingress support.
- ✅ Multi-AZ VPC: Public, private, and intra subnets with NAT gateway and VPC Flow Logs
- ✅ EKS Cluster: Managed Kubernetes cluster with public and private endpoints
- ✅ Karpenter Autoscaling: Node autoscaler with automatic provisioning (Linux/amd64, Nitro instances)
- ✅ EKS Addons:
  - VPC CNI
  - CoreDNS
  - kube-proxy
  - EBS CSI Driver
  - Snapshot Controller
  - Pod Identity Agent
  - Secrets Manager
  - metrics-server
  - kube-state-metrics
- ✅ Ingress Support: AWS Load Balancer Controller (ALB/NLB) with optional Route53 and ACM certificate
- ✅ DNS Management: Optional External-DNS for automatic Route53 record management
- ✅ Security: IRSA, security groups, VPC Flow Logs, Bottlerocket AMI
- AWS CLI configured with appropriate permissions
- Terraform >= 1.13.0
- kubectl installed
- IAM permissions for EKS, VPC, IAM, CloudWatch
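
The tool prerequisites above can be checked before running Terraform. The following is a minimal sketch, not part of this repository: `check_prereqs` is a hypothetical helper name, and it only confirms that each binary is on `PATH`, not that versions or IAM permissions are sufficient.

```shell
# Hypothetical helper: report which required CLI tools are missing from PATH.
# Prints one missing tool name per line and returns non-zero if any are missing.
check_prereqs() {
  local missing=0 tool
  for tool in "$@"; do
    if ! command -v "$tool" >/dev/null 2>&1; then
      echo "$tool"
      missing=1
    fi
  done
  return "$missing"
}

# Example call (tool names taken from the prerequisites list):
#   check_prereqs aws terraform kubectl || echo "install the tools listed above"
```

Version constraints such as Terraform >= 1.13.0 still need a manual check, for example with `terraform version`.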
Edit terraform/terraform.prod.tfvars (or create an environment-specific file):
region = "eu-north-1"
environment = "production" # dev, staging, or production
kubernetes_version = "1.34"
instance_types = ["m5.large"]
primary_min_size = 2
primary_max_size = 3
primary_desired_size = 2
create_dns_zone = false # Optional
dns_zone_name = "<dns_zone>" # Optional
is_aws_registered_domain = false # Optional

cd terraform
# Initialize Terraform
terraform init
# Review changes
terraform plan -var-file=terraform.prod.tfvars
# Deploy (takes ~15-20 minutes)
terraform apply -var-file=terraform.prod.tfvars

Inspect outputs given by Terraform on successful apply:
cluster_certificate_authority_data = "..."
cluster_endpoint = "..."
cluster_name = "eks-task-production"
cluster_security_group_id = "..."
karpenter_queue_name = "..."
kubectl_config_command = "aws eks update-kubeconfig --region eu-north-1 --name eks-task-production" # <--
node_security_group_id = "..."
oidc_provider_arn = "..."
region = "eu-north-1"
vpc_id = "..."
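
The `kubectl` setup that follows refers to `$REGION` and `$CLUSTER_NAME`. One way to fill them in is to read the `region` and `cluster_name` outputs listed above with `terraform output -raw`. This is a sketch under that assumption; `load_cluster_env` is a hypothetical helper name, and it must run from the directory that holds the Terraform state.

```shell
# Hypothetical helper: populate REGION and CLUSTER_NAME from Terraform outputs.
# Fails (returns 1) if either output cannot be read.
load_cluster_env() {
  REGION="$(terraform output -raw region)" || return 1
  CLUSTER_NAME="$(terraform output -raw cluster_name)" || return 1
  export REGION CLUSTER_NAME
}

# Example call, run inside the terraform/ directory:
#   load_cluster_env && echo "cluster $CLUSTER_NAME in $REGION"
```

Alternatively, the `kubectl_config_command` output already contains the complete command with both values filled in.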
# Configure kubectl
aws eks update-kubeconfig --region $REGION --name $CLUSTER_NAME

# Check cluster is accessible
kubectl cluster-info
# Verify nodes
kubectl get nodes -o wide

# Check all addons are healthy
kubectl get pods -n kube-system
# Verify specific addons
kubectl get pods -n kube-system | grep -E "ebs-csi|coredns|aws-node|kube-proxy|pod-identity|aws-load-balancer|external-dns"
# Expected output should show:
# - ebs-csi-controller-* (2/2 ready)
# - coredns-* (2/2 ready)
# - aws-node-* (DaemonSet, 1 per node)
# - kube-proxy-* (DaemonSet, 1 per node)
# - eks-pod-identity-agent-* (DaemonSet, 1 per node)
# - aws-load-balancer-controller-* (1/1 ready)
# - external-dns-* (1/1 ready, if DNS zone created)
# Verify Secrets Manager addon
kubectl get pods -n aws-secrets-manager
# Expected output should show:
# aws-secrets-store-csi-driver-provider-*
# secrets-store-csi-driver-*

# Check Karpenter controller is running
kubectl get pods -n kube-system | grep karpenter
# Expected: karpenter-* pod should be Running
# Check Karpenter logs for errors
kubectl logs -n kube-system -l app.kubernetes.io/name=karpenter --tail=50
# Verify Karpenter resources
kubectl get nodepool default
kubectl get ec2nodeclass default
# Check NodePool status (should be Ready)
kubectl get nodepool default -o jsonpath='{.status.conditions[?(@.type=="Ready")].status}'
# Expected: "True"
# Check EC2NodeClass status
kubectl get ec2nodeclass default -o jsonpath='{.status.conditions[?(@.type=="Ready")].status}'
# Expected: "True"

# Create a test deployment that requires more resources than available
kubectl apply -f - <<EOF
apiVersion: apps/v1
kind: Deployment
metadata:
  name: test-karpenter
spec:
  replicas: 5
  selector:
    matchLabels:
      app: test-karpenter
  template:
    metadata:
      labels:
        app: test-karpenter
    spec:
      containers:
        - name: test
          image: nginx
          resources:
            requests:
              cpu: 2
              memory: 4Gi
EOF
# Watch Karpenter provision nodes
watch kubectl get nodes
# After a few minutes, you should see new nodes being provisioned
# Clean up test deployment
kubectl delete deployment test-karpenter

# Check security groups are tagged for Karpenter discovery
aws ec2 describe-security-groups \
--filters "Name=tag:karpenter.sh/discovery,Values=eks-task-production" \
--query 'SecurityGroups[*].[GroupId,GroupName]' \
--output table
# Should show at least the node security group

# Check AWS Load Balancer Controller is running
kubectl get pods -n kube-system | grep aws-load-balancer-controller
# Check External-DNS (if DNS zone created)
kubectl get pods -n kube-system | grep external-dns
# Verify ACM certificate (if DNS zone created)
terraform output acm_certificate_arn

# Check subnets are properly tagged
aws ec2 describe-subnets \
--filters "Name=tag:karpenter.sh/discovery,Values=eks-task-production" \
--query 'Subnets[*].[SubnetId,AvailabilityZone,Tags[?Key==`Name`].Value|[0]]' \
--output table
# Should show 3 private subnets (one per AZ)

| Variable | Description | Default |
|---|---|---|
| region | AWS region | eu-north-1 |
| environment | Environment (dev/staging/production) | production |
| kubernetes_version | Kubernetes version | 1.34 |
| instance_types | Managed node group instance types | ["t3.medium", "t3.large"] |
| primary_min_size | Minimum nodes in managed group | 1 |
| primary_max_size | Maximum nodes in managed group | 2 |
| primary_desired_size | Desired nodes in managed group | 1 |
| create_dns_zone | Creates Route53 hosted zone, ACM certificate, and External-DNS | false |
| is_aws_registered_domain | Confirm whether your domain is registered in Route 53 | false |
| dns_zone_name | Domain name for Route53 zone (e.g., example.com). Required if create_dns_zone = true | "" |

Cluster name is automatically generated as: eks-task-${environment}
Examples:
- Production: eks-task-production
- Staging: eks-task-staging
- Dev: eks-task-dev
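
The naming rule above can be mirrored in shell when scripts need the cluster name without calling Terraform. This is a minimal sketch: `cluster_name` is a hypothetical helper, and rejecting values other than dev, staging, and production is an assumption based on the documented environments, not confirmed Terraform validation.

```shell
# Hypothetical helper: derive the cluster name from the environment,
# following the documented pattern eks-task-${environment}.
cluster_name() {
  case "$1" in
    dev|staging|production)
      printf 'eks-task-%s\n' "$1"
      ;;
    *)
      # Assumption: only the three documented environments are valid.
      echo "unknown environment: $1" >&2
      return 1
      ;;
  esac
}

# Example call:
#   aws eks update-kubeconfig --region eu-north-1 --name "$(cluster_name staging)"
```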
Default NodePool: Linux/amd64, Nitro instances (c/m/r families), 4-32 CPU cores, Bottlerocket AMI, 30-day expiration, WhenEmpty consolidation.
When create_dns_zone = true, the following are provisioned:
- Route53 Hosted Zone: DNS zone for your domain
- ACM Certificate: Wildcard certificate (*.example.com and example.com) with automatic DNS validation
- External-DNS: Automatically manages Route53 records based on Kubernetes Ingress resources
- AWS Load Balancer Controller: Always deployed, manages ALB/NLB for Kubernetes Ingress
Use the certificate ARN in your Ingress annotations:
annotations:
  alb.ingress.kubernetes.io/certificate-arn: <output from terraform output acm_certificate_arn>

VPC (10.0.0.0/16)
├── Public Subnets (AZ-1, AZ-2, AZ-3)
│   └── NAT Gateway
├── Private Subnets (AZ-1, AZ-2, AZ-3)
│   ├── EKS Worker Nodes (Karpenter-managed)
│   └── Managed Node Group (Karpenter controller)
└── Intra Subnets (AZ-1, AZ-2, AZ-3)
    └── EKS Control Plane

# Check Karpenter logs
kubectl logs -n kube-system -l app.kubernetes.io/name=karpenter
# Verify NodePool is Ready
kubectl get nodepool default -o yaml | grep -A 5 "type: Ready"
# Verify EC2NodeClass can find resources
kubectl get ec2nodeclass default -o yaml | grep -A 10 "status:"
# Check for security group issues
kubectl get ec2nodeclass default -o jsonpath='{.status.conditions[?(@.type=="SecurityGroupsReady")]}'

# Check addon status
aws eks describe-addon \
--cluster-name eks-task-production \
--addon-name aws-ebs-csi-driver \
--query 'addon.status'
# Check pod events
kubectl describe pod -n kube-system <pod-name>
# Check if pods are unscheduled
kubectl get pods -n kube-system -o wide | grep -v Running

# Check node group status
aws eks describe-nodegroup \
--cluster-name eks-task-production \
--nodegroup-name <nodegroup-name>
# Check CloudWatch logs
aws logs tail /aws/eks/eks-task-production/cluster --follow

cd terraform
terraform destroy -var-file=terraform.prod.tfvars

Warning: This deletes the entire EKS cluster and all associated resources.
Key outputs available via terraform output:
- cluster_name: EKS cluster name
- cluster_endpoint: EKS API endpoint
- kubectl_config_command: Command to configure kubectl
- acm_certificate_arn: ACM certificate ARN (if DNS zone created)
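
As an illustration of consuming the `acm_certificate_arn` output, the sketch below prints an Ingress manifest that sets the certificate annotation. It is not taken from this repository: `render_ingress`, the hostname, the service name, and port 80 are hypothetical, and it assumes an IngressClass named `alb` exists in the cluster.

```shell
# Hypothetical helper: print an Ingress manifest for an HTTPS ALB.
# Args: <certificate-arn> <hostname> <service-name>
render_ingress() {
  local cert_arn="$1" host="$2" service="$3"
  cat <<EOF
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: ${service}
  annotations:
    alb.ingress.kubernetes.io/scheme: internet-facing
    alb.ingress.kubernetes.io/target-type: ip
    alb.ingress.kubernetes.io/listen-ports: '[{"HTTPS":443}]'
    alb.ingress.kubernetes.io/certificate-arn: ${cert_arn}
spec:
  ingressClassName: alb   # assumption: IngressClass "alb" is installed
  rules:
    - host: ${host}
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: ${service}
                port:
                  number: 80   # assumption: the Service listens on port 80
EOF
}

# Example call (hostname and service are placeholders):
#   render_ingress "$(terraform output -raw acm_certificate_arn)" app.example.com my-service | kubectl apply -f -
```

If External-DNS is deployed, it should pick up the `host` value from the Ingress rule and create the matching Route53 record.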
- terraform-aws-modules/vpc/aws (~> 6.0)
- terraform-aws-modules/eks/aws (~> 21.9.0)
- terraform-aws-modules/eks/aws//modules/karpenter
- terraform-aws-modules/iam/aws (~> 5.28)