EKS Best Practices
Architecture
- Think about multi-tenancy, isolation for different environments or different workloads
- Account-level isolation using AWS Organizations
- Network-layer isolation, i.e. different VPC and different cluster
- Use different node groups (node pools) for different purposes/categories; for example, create dedicated node groups for operational tools such as CI/CD, monitoring, and centralized logging systems.
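A dedicated node group can be kept clean with labels and taints so only the intended workloads land there. Below is a minimal eksctl sketch; the cluster name, node group name, and taint key/value are illustrative assumptions, not prescribed names:

```yaml
# eksctl ClusterConfig fragment (sketch; names are placeholders)
apiVersion: eksctl.io/v1alpha5
kind: ClusterConfig
metadata:
  name: demo-cluster          # hypothetical cluster name
  region: us-east-1
managedNodeGroups:
  - name: ops-tools           # pool reserved for CI/CD, monitoring, logging
    instanceType: m5.large
    labels:
      workload-class: ops     # target with nodeSelector/affinity
    taints:
      - key: dedicated
        value: ops
        effect: NoSchedule    # only pods tolerating this taint schedule here
```

Operational pods then carry a matching toleration plus a nodeSelector on `workload-class: ops`.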
- Separate namespace for different workloads
Reliability | Principles
- It is recommended to use a dedicated VPC for EKS
- Modular and Scalable Amazon EKS Architecture
- Plan your VPC and subnet CIDR, avoid the complexity of using multiple CIDRs in a VPC and CNI custom networking
- Understand and verify EKS/Fargate Service Quotas and other related services
- Implement the Cluster Autoscaler to automatically adjust the size of an EKS cluster up and down based on scheduling demands.
- Consider the number of worker nodes and service degradation if there is node/AZ failure.
- Take care of RTO.
- Consider having a buffer node.
- Consider not choosing a very large instance type to reduce the blast radius.
- Enable Horizontal Pod Autoscaler to use CPU utilization or custom metrics to scale pods.
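A Horizontal Pod Autoscaler targeting CPU utilization can be declared as below. This is a sketch: the Deployment name `web`, the replica bounds, and the 60% target are illustrative assumptions (the `autoscaling/v2` API is available on Kubernetes 1.23+; older clusters use `autoscaling/v2beta2`):

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: web-hpa               # illustrative name
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: web                 # hypothetical Deployment to scale
  minReplicas: 2
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 60   # scale out above ~60% average CPU
```

Pods must declare CPU requests for utilization-based scaling to work.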
- Use infrastructure as code (Kubernetes manifest files and templates to provision EKS clusters/nodes, etc.)
- Use multiple AZs. Distribute application replicas across different worker node availability zones for redundancy
- Be careful with persistent pods that use EBS as a PersistentVolume: an EBS volume is bound to a single AZ, so the pod must be scheduled into that same AZ. The scheduler uses the topology label, for example,
topology.kubernetes.io/zone=us-east-1c
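One common way to keep EBS-backed pods and their volumes in the same AZ is to delay volume provisioning until the pod is scheduled. A sketch using the EBS CSI driver (the StorageClass name and `gp3` type are assumptions):

```yaml
# Sketch: WaitForFirstConsumer makes the EBS volume get created in the
# AZ where the pod is actually scheduled, avoiding zone mismatches
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: ebs-gp3               # illustrative name
provisioner: ebs.csi.aws.com
parameters:
  type: gp3
volumeBindingMode: WaitForFirstConsumer
```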
- Highly available and scalable worker nodes using Auto Scaling groups, use node groups
- Consider using Managed Node Groups for easy setup and high availability of nodes during updates or termination
- Consider using Fargate to avoid having to manage worker nodes. But be aware of Fargate limitations.
- Consider separating node groups for your application and utility functions, for example: log database, service mesh control plane
- Deploy aws-node-termination-handler. It detects if the node will become unavailable/terminated, such as Spot Interruption, then ensures no new work is scheduled there and then drains it, removing any existing work. Tutorial | Announcement
- Configure Pod Disruption Budgets (PDBs) to limit the number of pods of a replicated application that are down simultaneously from voluntary disruptions, for example, during updates, continuous deployment, and other use cases.
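A PodDisruptionBudget for a replicated application can be sketched as follows; the name and label selector are illustrative assumptions (`policy/v1` is GA since Kubernetes 1.21; older clusters use `policy/v1beta1`):

```yaml
# Sketch: keep at least 2 replicas of "web" running during
# voluntary disruptions such as node drains and rolling updates
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: web-pdb               # illustrative name
spec:
  minAvailable: 2
  selector:
    matchLabels:
      app: web                # hypothetical app label
```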
- Use AWS Backup to back up EFS and EBS
- Use EFS for storage class: using EFS does not require pre-provisioning capacity and allows more efficient pod migrations between worker nodes (removing node-attached storage)
- Install Node Problem Detector to provide actionable data to heal clusters.
- Avoid configuration errors, such as overly strict (anti-)affinity rules that prevent a pod from being rescheduled after a node failure.
- Use liveness and readiness probes
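Probes can be declared per container, as in the sketch below. The image, ports, endpoint paths (`/healthz`, `/ready`), and timings are all assumptions about the application, not fixed values:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: web                   # illustrative name
spec:
  containers:
    - name: web
      image: nginx:1.25       # placeholder image
      livenessProbe:          # restart the container if this fails
        httpGet:
          path: /healthz      # assumed health endpoint
          port: 80
        initialDelaySeconds: 10
        periodSeconds: 15
      readinessProbe:         # remove pod from Service endpoints if this fails
        httpGet:
          path: /ready        # assumed readiness endpoint
          port: 80
        periodSeconds: 5
```

Liveness restarts an unhealthy container; readiness only gates traffic, so a temporarily overloaded pod is not killed.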
- Practice chaos engineering, use available tools to automate.
- Kill pods randomly during testing
- Implement failure management at the microservice level, for example, circuit breaker pattern, control and limit retry calls (exponential backoff), throttling, make services stateless whenever possible
- Practice how to upgrade the cluster and worker nodes to the new version.
- Practice how to drain worker nodes.
- Practice chaos engineering
- Use CI/CD tools, automate and have process flow (approval/review) for infrastructure changes. Consider implementing GitOps.
- Use a multi-AZ solution for persistent volumes, for example, Thanos + S3 for Prometheus
Performance Efficiency | Principles
- Notify AWS Support in advance if you expect a sudden load increase, so the control plane (API servers and etcd) can be pre-scaled
- Choose the correct EC2 instance type for your worker node.
- Understand the pros and cons of using many small node instances or few large node instances. Consider OS overhead, time required to pull the image on a new instance when it scales, kubelet overhead, system pod overhead, etc.
- Understand pod density limitation (maximum number of pods supported by each instance type)
- Use single-AZ node groups if necessary. Running a microservice across multiple AZs is normally a best practice for availability, but for some workloads (such as Spark) with latency-sensitive, high-volume network I/O between pods, it is recommended to use a single-AZ node group.
- Understand the Fargate performance limitation. Do load testing before going to production.
- Make sure your pods request the resources they need. Set resource request and limit values for CPU and memory
- Detect bottlenecks/latency in a microservice with X-Ray or other tracing/APM products
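Requests and limits are set per container; the sketch below uses illustrative values that should be tuned from observed consumption:

```yaml
# Container spec fragment (values are illustrative assumptions)
apiVersion: v1
kind: Pod
metadata:
  name: web                   # illustrative name
spec:
  containers:
    - name: app
      image: my-app:latest    # placeholder image
      resources:
        requests:             # used by the scheduler for placement
          cpu: "250m"
          memory: "256Mi"
        limits:               # hard caps; exceeding memory gets the pod OOM-killed
          cpu: "500m"
          memory: "512Mi"
```

Requests drive scheduling and autoscaling math; limits bound worst-case consumption on the node.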
- Choose the right storage backend. Use Amazon FSx for Lustre and its CSI Driver if your persistent container needs a high-performance file system
- Monitor pod and node resource consumption and technology bottleneck. You can use CloudWatch, CloudWatch Container Insight or other products
- If necessary, launch instances (worker nodes) in a cluster placement group to take advantage of low-latency, high-throughput networking. You can use this CloudFormation template to add new node groups with non-blocking, non-oversubscribed, full bisection-bandwidth connectivity.
- If necessary, configure the Kubernetes CPU management policy as 'static' for some pods that need exclusive CPUs
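The static policy is enabled at the kubelet level, and only Guaranteed-QoS pods with whole-number CPU requests receive exclusive cores. A sketch (pod name, image, and CPU counts are assumptions):

```yaml
# Kubelet config fragment: enable the static CPU manager policy
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
cpuManagerPolicy: static
---
# Pod eligible for CPU pinning: Guaranteed QoS with integer CPU counts
apiVersion: v1
kind: Pod
metadata:
  name: pinned                # illustrative name
spec:
  containers:
    - name: app
      image: my-app:latest    # placeholder image
      resources:
        requests:
          cpu: "2"            # whole-number CPUs required for exclusivity
          memory: "1Gi"
        limits:
          cpu: "2"            # limits must equal requests (Guaranteed QoS)
          memory: "1Gi"
```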
Cost Optimization
- Minimize wasted (unused) resources when using EC2 as worker node.
- Choose the correct EC2 instance type and use cluster autoscaling.
- Consider using Fargate
- Consider using a tool like kube-resource-report to visualize slack cost and properly size requests for containers in a pod.
- Use spot instances or mix on-demand and spot using Spot Fleet. Consider using spot instances for test/staging environment.
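Besides Spot Fleet, eksctl managed node groups can run on Spot capacity directly. A sketch (node group name, instance types, and sizes are illustrative; diversifying instance types reduces interruption risk):

```yaml
# eksctl ClusterConfig fragment (sketch)
managedNodeGroups:
  - name: spot-workers        # illustrative name
    spot: true                # request Spot capacity
    instanceTypes: ["m5.large", "m5a.large", "m4.large"]
    minSize: 1
    maxSize: 10
```

Pair Spot node groups with aws-node-termination-handler (see Reliability above) so interrupted nodes are drained gracefully.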
- Use reserved instance or savings plans
- Use single-AZ node groups for workload with high network I/O operations (for example, Spark) to reduce communication between AZ. But please validate that Single-AZ execution would not compromise your system's availability.
- Consider managed services for support tools, such as monitoring, service mesh, centralized logging, to reduce your team's effort and cost
- Tag all AWS resources when possible and use labels to tag Kubernetes resources so you can easily analyze cost.
- Consider using self-managed Kubernetes (not EKS) for clusters without HA requirements. You can use kops to configure a small k8s cluster.
- Use node affinity or nodeSelector for pods that require a specific EC2 instance type.
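The kubelet publishes the instance type as a well-known node label, which a nodeSelector can match directly. A sketch (pod name, image, and instance type are assumptions):

```yaml
# Sketch: pin a pod to nodes of a specific instance type
apiVersion: v1
kind: Pod
metadata:
  name: gpu-job               # illustrative name
spec:
  nodeSelector:
    node.kubernetes.io/instance-type: p3.2xlarge   # well-known node label
  containers:
    - name: app
      image: my-app:latest    # placeholder image
```

Older clusters expose the same information under the deprecated label `beta.kubernetes.io/instance-type`.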
Operation: Principles
- Use an IaC tool, such as eksctl, Terraform, or AWS CloudFormation, to provision the EKS cluster
- Consider using package manager like Helm to help you install and manage applications.
- Automate cluster management and application deployment using GitOps. You can use tools like Flux or others
- Use CI/CD tools
- Practice doing EKS upgrade (rolling update), create the runbook.
- Monitoring
- Understand the health of your workload. Define KPI/SLO and metrics/SLI and then monitor through your dashboard and configure alerts
- Understand your Operational Health. Define KPI and metrics such as mean time to detect an incident (MTTD) and mean time to recovery (MTTR) from an incident.
- Use Container Insights for EKS for detailed monitoring of service and pod performance. It also provides diagnostic information; consider viewing additional metrics and additional levels of granularity when a problem occurs.
- Monitor control plane metrics using Prometheus
- Monitoring using Prometheus & Grafana
- Logging
- Consider DaemonSet vs Sidecar mechanism. DaemonSet is preferable for EC2 worker nodes, but you need to use the Sidecar pattern for Fargate.
- Control plane logging
- You can use EFK stack or FluentBit, Kinesis Data Firehose, S3 and Athena
- Tracing
- Monitor fine-grained transactions using X-Ray (see eksworkshop.com). It is also useful for monitoring blue-green deployments. Other tools are available
- Practice Chaos Engineering, you can automate using some tools
- Configuration
- Appmesh + EKS demo / lab: GitHub - PaulMaddox/aws-appmesh-helm: AWS App Mesh ❤ K8s
- AWS Cloud Map: Easily create and maintain custom maps of your applications | AWS News Blog
- AWS CloudMap + Consul:
Security | Principles
- Understand the shared responsibility model for different EKS operating modes (self-managed nodes, managed node groups, Fargate)
- AWS security best practices for EKS
- Integrating security into your container pipeline | workshop
- Use CNI custom networking if your pods need a different security group from their nodes, or if pods must be placed in private subnets while the nodes are in public subnets.
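With CNI custom networking, the VPC CNI plugin reads a per-AZ `ENIConfig` resource that tells it which subnet and security groups to use for pod ENIs. A sketch (the subnet and security group IDs are placeholders):

```yaml
# Sketch: one ENIConfig per AZ, named after the AZ, directing pod ENIs
# into a private subnet with their own security group
apiVersion: crd.k8s.amazonaws.com/v1alpha1
kind: ENIConfig
metadata:
  name: us-east-1a                      # conventionally named after the AZ
spec:
  subnet: subnet-0123456789abcdef0      # placeholder private subnet ID
  securityGroups:
    - sg-0123456789abcdef0              # placeholder security group ID
```

Custom networking also requires setting `AWS_VPC_K8S_CNI_CUSTOM_NETWORK_CFG=true` on the aws-node DaemonSet.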
- EKS CloudTrail API Log
- Consider enabling continuous delivery of CloudTrail events to an Amazon S3 bucket
- Use network policy for East-West traffic: Calico
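Calico enforces standard Kubernetes NetworkPolicy objects; a common starting point is default-deny ingress per namespace, then explicit allow rules. A sketch (the namespace name is an assumption):

```yaml
# Sketch: deny all ingress traffic to pods in the "prod" namespace
# unless another NetworkPolicy explicitly allows it
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-ingress
  namespace: prod             # illustrative namespace
spec:
  podSelector: {}             # empty selector = all pods in the namespace
  policyTypes:
    - Ingress
```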
- Use security groups for pods (requires Kubernetes 1.17 or later on EKS). See some considerations
- Introducing fine-grained IAM roles for service accounts | AWS Open Source Blog
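With IAM roles for service accounts (IRSA), an IAM role is bound to a Kubernetes service account via an annotation, so only pods using that service account receive the role's permissions. A sketch (the account ID, role name, and service account name are placeholders):

```yaml
# Sketch: service account annotated with an IAM role ARN; pods using it
# get scoped AWS credentials via the EKS OIDC provider
apiVersion: v1
kind: ServiceAccount
metadata:
  name: s3-reader             # illustrative name
  namespace: default
  annotations:
    eks.amazonaws.com/role-arn: arn:aws:iam::111122223333:role/s3-reader-role  # placeholder ARN
```

This avoids granting broad permissions to the node instance role.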
Packer for AMI build: Packer configuration to build a custom EKS AMI