EKS - Terraform
Although it's not difficult to create a cluster directly with the aws_eks_cluster resource, I believe the best way to create an EKS cluster is with the community-maintained module: https://registry.terraform.io/modules/terraform-aws-modules/eks/aws/latest. Studying a module like this is often the best way to learn Terraform. Don't use modules without actually understanding how they work; I only recommend official, well-documented modules like this one.
It's worth remembering that the EKS module does exactly what it's supposed to do, which is create the cluster, and for that it requires inputs such as the VPC ID and subnet IDs.
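As a sketch, the minimal shape of those inputs looks roughly like this (the name, versions, and the `module.vpc` references are illustrative assumptions, not the project's exact configuration):

```hcl
# Minimal sketch of the community EKS module wired to a VPC module's outputs.
module "eks" {
  source  = "terraform-aws-modules/eks/aws"
  version = "~> 19.0" # illustrative; pin the version you actually tested

  cluster_name    = "prod-cluster" # illustrative name
  cluster_version = "1.27"         # illustrative Kubernetes version

  # The network layer provides these values.
  vpc_id     = module.vpc.vpc_id
  subnet_ids = module.vpc.private_subnets
}
```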
The advantage of this module is that it already includes some add-ons that make life easier, such as external-dns.
Several examples are here.
The knowledge gained with eksctl will help us better understand the arguments passed.
A complete example would be this one, which includes creating the VPC, subnets, and other resources; however, I recommend separating the network project from the cluster project.
The ideal when working with Infrastructure as Code is to separate projects by layers. Each layer plays its well-defined role, with the lower layers serving as the foundation for the layers above. For example, a network project could be the foundation for the EKS project. This is a way to keep projects independent, making it easier to add and remove resources without affecting the entire environment, in addition to separating team responsibilities.
Recommendations
It's worth having good knowledge of Terraform in order to get better results when separating projects.
A Terraform feature I used for a long time is workspaces, but it's not the best path, even though Terraform offers it natively. Workspaces only work well for identical environments such as staging and production. For a development environment, where we usually provision fewer resources, you end up having to work around the differences with conditionals (count tricks and the like). In that case, it's better to have some duplicated code separated by folders than to use workspaces.
For identical environments, use input files such as tfvars.
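For example, the same configuration can be parameterized per environment with a separate tfvars file (file name and values below are hypothetical):

```hcl
# staging.tfvars — hypothetical per-environment input values
cluster_name  = "staging-cluster"
instance_type = "t3.medium"
desired_size  = 2
```

Applied with `terraform apply -var-file=staging.tfvars`, while production would use its own `prod.tfvars` against the same code.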
Always store the generated state remotely in the cloud, using the state-backend project. That project is the starting point for anything with Terraform.
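A minimal sketch of what a remote state configuration can look like with an S3 backend (the bucket, key, and lock-table names are assumptions; the state-backend project would create these resources first):

```hcl
# backend.tf — remote state in S3 with DynamoDB locking (sketch).
terraform {
  backend "s3" {
    bucket         = "my-terraform-state"         # hypothetical bucket
    key            = "prod/eks/terraform.tfstate" # hypothetical key
    region         = "us-east-1"
    dynamodb_table = "terraform-lock"             # hypothetical lock table
    encrypt        = true
  }
}
```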
At a later stage, look into GitOps for Terraform to track changes. Options include Atlantis and Terraform Cloud.
For git versioning, define a strategy that everyone understands. I personally use only the main branch. The method I adapted to is that what's on main is what exists: other branches should only be planned on top of main, and only main should run the final apply.
Always pay attention to the module version you're using and the Terraform version.
Network
The module to be used will be https://github.com/terraform-aws-modules/terraform-aws-vpc
This module will provision our VPC, subnets, and all the resources needed for everything to work well. If you already have a VPC and subnets in your infrastructure, note that EKS expects specific tags on the public and private subnets.
Why do these tags exist on the subnets? They let Kubernetes and the AWS load balancer integration discover which subnets to place public and internal load balancers in.
Using the complete EKS example, we'll create a project for the following module:
```hcl
module "vpc" {
  source  = "terraform-aws-modules/vpc/aws"
  version = "~> 3.0"

  name = local.name
  cidr = "10.0.0.0/16"

  azs             = ["${local.region}a", "${local.region}b", "${local.region}c"]
  private_subnets = ["10.0.1.0/24", "10.0.2.0/24", "10.0.3.0/24"]
  public_subnets  = ["10.0.4.0/24", "10.0.5.0/24", "10.0.6.0/24"]
  intra_subnets   = ["10.0.7.0/28", "10.0.7.16/28", "10.0.7.32/28"]

  enable_nat_gateway   = true
  single_nat_gateway   = true
  enable_dns_hostnames = true

  enable_flow_log                      = true
  create_flow_log_cloudwatch_iam_role  = true
  create_flow_log_cloudwatch_log_group = true

  public_subnet_tags = {
    "kubernetes.io/cluster/${local.name}" = "shared"
    "kubernetes.io/role/elb"              = 1
  }

  private_subnet_tags = {
    "kubernetes.io/cluster/${local.name}" = "shared"
    "kubernetes.io/role/internal-elb"     = 1
  }

  tags = local.tags
}
```
Analyze the network project and see how I organized the folder structure, as well as the generated outputs that the EKS module will consume later. Also note that the folder structure mirrors the path of the created tfstate. Since we'll only create one cluster later on, the project was created in the prod folder. If an exclusive VPC for a development environment were needed, we would start from the dev folder.
An important detail in this project is the single NAT gateway shared by all subnets (the default here) to reduce costs; single_nat_gateway can be set to false in production to have one NAT gateway per subnet.
EKS
In this project we'll focus only on creating EKS, but the network project must be created first. A question worth considering is whether to create one cluster for development and another for production; the cost would increase significantly. For a low budget, I believe the ideal is a single control plane managing several groups of worker nodes, and those groups can be divided into node groups for development and production. To reduce costs even further, we can use a single node group and separate projects by namespaces. The moment production requires a truly exclusive resource, a new node group is created for development. We'll continue with a single node group in the prod folder. If we were to have another cluster, we could start it in the dev folder.
It's possible to use a custom AMI if you want, but we'll keep the EKS default, which is the Amazon Linux operating system.
There is an open-source operating system called Bottlerocket that is purpose-built for running containers. We'll test it later, but for now let's stick with the default.
The project continues in the terraform folder
For SSH access to the machines, we'll use an SSH key pair in the files folder. If no key path is provided, the keys at ~/.ssh/id_rsa.pub and ~/.ssh/id_rsa on your local machine will be used.
As a best practice, create a key for accessing cluster hosts that is different from your personal key.
The variables used are in the terraform.tfvars file, which populates the variables declared in variables.tf.
The remote.tf file references the network project.
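A sketch of what remote.tf might contain, assuming the network project's state is stored in S3 (the bucket and key names below are assumptions):

```hcl
# remote.tf — read the network project's outputs from its remote state.
data "terraform_remote_state" "network" {
  backend = "s3"
  config = {
    bucket = "my-terraform-state"             # hypothetical bucket
    key    = "prod/network/terraform.tfstate" # hypothetical key
    region = "us-east-1"
  }
}

# Consumed later, e.g.:
# subnet_ids = data.terraform_remote_state.network.outputs.private_subnets
```

This is what keeps the layers independent: the EKS project only depends on the network project's published outputs, not on its code.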
Inside locals.tf we'll have local variables.
The CNI used is VPC CNI.
The main file is eks.tf, which defines the EKS cluster and node groups.
It's possible to separate node groups from this module if you want, but I preferred to keep everything together to make reading easier.
There are 3 concepts we need to understand about EKS:
- Managed node groups: EKS itself manages this worker group, and we can specify several parameters, as was done here. I personally prefer letting EKS do the management; this is the method implemented in the project proposed here. The node group declared in the project was named infrastructure-ng.
- Self-managed node groups: we notify EKS that a worker group managed by the user already exists, and that compute resource is handed to EKS to deploy its containers.
- Fargate profiles: pods matching the profile run on serverless compute provisioned and managed by AWS, so there are no nodes to administer and you pay per pod vCPU and memory. Good for development environments or even staging, depending on the case.
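As a sketch, a managed node group is declared inside the community EKS module roughly like this (instance type, sizes, and the `module.vpc` references are illustrative assumptions, not the project's exact values):

```hcl
module "eks" {
  source  = "terraform-aws-modules/eks/aws"
  version = "~> 19.0" # illustrative pin

  cluster_name = "prod-cluster" # illustrative
  vpc_id       = module.vpc.vpc_id
  subnet_ids   = module.vpc.private_subnets

  # One managed node group; EKS handles lifecycle and AMI updates.
  eks_managed_node_groups = {
    infrastructure-ng = {
      instance_types = ["t3.medium"] # illustrative
      min_size       = 1
      max_size       = 3
      desired_size   = 2
    }
  }
}
```

Adding a second entry to the `eks_managed_node_groups` map is how you would later split development and production workloads onto separate node groups within the same cluster.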
A personal curiosity: I consider Kubernetes an infrastructure layer over the cloud, which is why this name was given to the node group.
If you're using AWS, you'll probably use ECR to store your images, since it's very cheap. Therefore, an ECR access policy was created for use by all nodes and is declared in iam.tf.
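A sketch of such an attachment, assuming a node IAM role defined elsewhere in the project (the role reference is an assumption); AmazonEC2ContainerRegistryReadOnly is an AWS-managed policy that grants image-pull access:

```hcl
# iam.tf — attach the AWS-managed ECR read-only policy to the node role.
resource "aws_iam_role_policy_attachment" "nodes_ecr_read" {
  role       = aws_iam_role.nodes.name # hypothetical node role
  policy_arn = "arn:aws:iam::aws:policy/AmazonEC2ContainerRegistryReadOnly"
}
```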
It's worth comparing the resources created here with the resources created by eksctl using cloudformation, just for study purposes.
Access
https://docs.aws.amazon.com/eks/latest/userguide/getting-started-console.html#eks-configure-kubectl
You need the aws-cli installed on the system for the command below. Provided your user has admin permission, the command will add all the necessary configuration to your ~/.kube/config. Change the region and cluster name if necessary.
```shell
aws eks update-kubeconfig --region us-east-1 --name us-east-1-prod-cluster
```