
EKS Best Practices

Architecture

  • Think about multi-tenancy: isolation for different environments or different workloads
    • Account-level isolation using AWS Organizations
    • Network-layer isolation, i.e. different VPC and different cluster
    • Use different node groups (node pools) for different purposes or categories; for example, create dedicated node groups for operational tools such as CI/CD, monitoring, and centralized logging systems.
    • Use separate namespaces for different workloads
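
The namespace-level isolation above can be sketched with a dedicated namespace plus a ResourceQuota, so one tenant's workloads cannot exhaust shared cluster capacity. All names and quota values below are illustrative:

```yaml
# Illustrative manifest: one namespace per team/workload, with a quota
# capping the aggregate resources its pods can request.
apiVersion: v1
kind: Namespace
metadata:
  name: team-a            # hypothetical tenant name
  labels:
    environment: staging
---
apiVersion: v1
kind: ResourceQuota
metadata:
  name: team-a-quota
  namespace: team-a
spec:
  hard:
    requests.cpu: "10"
    requests.memory: 20Gi
    limits.cpu: "20"
    limits.memory: 40Gi
    pods: "50"
```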

Reliability | Principles

  • Use a dedicated VPC for EKS (recommended)
  • Understand and verify EKS/Fargate Service Quotas and other related services
  • Implement the Cluster Autoscaler to automatically adjust the size of an EKS cluster up and down based on scheduling demands.
  • Consider the number of worker nodes and service degradation if there is node/AZ failure.
    • Take care of RTO.
    • Consider having a buffer node.
  • Avoid very large instance types to limit the blast radius of a single node failure.
  • Enable Horizontal Pod Autoscaler to use CPU utilization or custom metrics to scale pods.
  • Use infrastructure as code (Kubernetes manifest files and templates to provision EKS clusters/nodes, etc.)
  • Use multiple AZs. Distribute application replicas across different worker node availability zones for redundancy
    • Be careful with persistent pods that use EBS as a PersistentVolume: an EBS volume is bound to a single AZ, so the pod must be scheduled into that zone. Use the topology label (not an annotation), for example, topology.kubernetes.io/zone=us-east-1c
  • Make worker nodes highly available and scalable by using Auto Scaling groups / node groups
    • Consider using Managed Node Groups for easy setup and high availability of nodes during updates or termination
    • Consider using Fargate to avoid having to manage worker nodes. But be aware of Fargate limitations.
  • Consider separate node groups for your application and for utility functions, for example, a log database or the service mesh control plane
  • Deploy aws-node-termination-handler. It detects when a node will become unavailable (for example, a Spot interruption), cordons it so no new work is scheduled there, and then drains it, evicting existing pods.
  • Configure Pod Disruption Budgets (PDBs) to limit the number of pods of a replicated application that are down simultaneously from voluntary disruptions, for example, during updates, continuous deployment, and other use cases.
  • Use AWS Backup to back up EFS and EBS
  • Consider an EFS storage class: EFS does not require pre-provisioned capacity and allows pods to move between worker nodes more easily (no node-attached storage to detach and reattach)
  • Install Node Problem Detector to provide actionable data to heal clusters.
  • Avoid configuration errors, such as overly strict anti-affinity rules, which can leave a pod unschedulable after a node failure.
  • Use liveness and readiness probes
  • Practice chaos engineering, use available tools to automate.
    • Kill pods randomly during testing
  • Implement failure management at the microservice level, for example, circuit breaker pattern, control and limit retry calls (exponential backoff), throttling, make services stateless whenever possible
  • Practice how to upgrade the cluster and worker nodes to the new version.
    • Practice how to drain worker nodes.
  • Use CI/CD tools, automate and have process flow (approval/review) for infrastructure changes. Consider implementing GitOps.
  • Use a multi-AZ solution for persistent data, for example, Thanos + S3 for Prometheus
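
Several of the points above (spreading replicas across AZs, liveness/readiness probes, Pod Disruption Budgets) can be sketched in a pair of manifests. All names, images, and thresholds below are illustrative, not prescriptive:

```yaml
# Illustrative Deployment: replicas spread across availability zones,
# with liveness/readiness probes, plus a PodDisruptionBudget that keeps
# at least two pods running during voluntary disruptions (drains, upgrades).
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web                 # hypothetical app name
spec:
  replicas: 3
  selector:
    matchLabels:
      app: web
  template:
    metadata:
      labels:
        app: web
    spec:
      topologySpreadConstraints:
        - maxSkew: 1
          topologyKey: topology.kubernetes.io/zone
          whenUnsatisfiable: ScheduleAnyway
          labelSelector:
            matchLabels:
              app: web
      containers:
        - name: web
          image: public.ecr.aws/nginx/nginx:1.25   # example image
          readinessProbe:
            httpGet: { path: /healthz, port: 80 }
            initialDelaySeconds: 5
            periodSeconds: 10
          livenessProbe:
            httpGet: { path: /healthz, port: 80 }
            initialDelaySeconds: 15
            periodSeconds: 20
---
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: web-pdb
spec:
  minAvailable: 2
  selector:
    matchLabels:
      app: web
```

With `minAvailable: 2` and three replicas, a node drain can evict at most one pod at a time, which is what keeps rolling node upgrades from taking the service down.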

Performance Efficiency | Principles

  • Notify AWS Support if you need to pre-scale the control plane (API servers and etcd) ahead of a sudden load increase
  • Choose the correct EC2 instance type for your worker node.
    • Understand the pros and cons of using many small node instances or few large node instances. Consider OS overhead, time required to pull the image on a new instance when it scales, kubelet overhead, system pod overhead, etc.
    • Understand pod density limitation (maximum number of pods supported by each instance type)
  • Use single-AZ node groups if necessary. Running a microservice across multiple AZs is normally the best practice for availability, but some workloads (such as Spark) with heavy, latency-sensitive network I/O may benefit from running in a single AZ to avoid cross-AZ latency.
  • Understand the Fargate performance limitation. Do load testing before going to production.
  • Make sure your pods request the resources they need: set resource requests and limits for CPU and memory
  • Detect bottleneck/latency in a microservice with X-Ray or other tracing/APM products
  • Choose the right storage backend. Use Amazon FSx for Lustre and its CSI Driver if your persistent container needs a high-performance file system
  • Monitor pod and node resource consumption and identify bottlenecks. You can use CloudWatch, CloudWatch Container Insights, or other products
  • If necessary, launch worker nodes in a placement group to get low-latency, non-blocking, non-oversubscribed, full-bisection-bandwidth connectivity between them. A CloudFormation template can be used to add such a node group.
  • If necessary, configure the Kubernetes CPU management policy as 'static' for some pods that need exclusive CPUs
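
The resource-request point above can be sketched as follows. Note that the kubelet's 'static' CPU manager policy only grants exclusive CPUs to pods in the Guaranteed QoS class (requests equal to limits, integer CPU count); the values and names here are illustrative:

```yaml
# Illustrative pod with explicit requests/limits. Because requests equal
# limits and the CPU count is an integer, the pod lands in the Guaranteed
# QoS class, which the static CPU manager policy requires before it will
# pin exclusive CPUs to the container.
apiVersion: v1
kind: Pod
metadata:
  name: latency-sensitive     # hypothetical name
spec:
  containers:
    - name: app
      image: public.ecr.aws/docker/library/busybox:latest  # example image
      command: ["sleep", "infinity"]
      resources:
        requests:
          cpu: "2"
          memory: 4Gi
        limits:
          cpu: "2"
          memory: 4Gi
```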

Cost Optimization

  • Minimize wasted (unused) resources when using EC2 worker nodes.
    • Choose the correct EC2 instance type and use cluster autoscaling.
    • Consider using Fargate
    • Consider using a tool like kube-resource-report to visualize slack cost and properly size requests for containers in a pod.
  • Use spot instances or mix on-demand and spot using Spot Fleet. Consider using spot instances for test/staging environment.
  • Use reserved instance or savings plans
  • Use single-AZ node groups for workloads with high network I/O (for example, Spark) to reduce cross-AZ traffic. But validate that single-AZ execution would not compromise your system's availability.
  • Consider managed services for support tools, such as monitoring, service mesh, centralized logging, to reduce your team's effort and cost
  • Tag all AWS resources when possible and use labels to tag Kubernetes resources so you can easily analyze cost.
  • Consider self-managed Kubernetes (instead of EKS) for clusters that do not need a highly available control plane. You can use kOps to set up a small k8s cluster.
  • Use node affinity or a nodeSelector for pods that require a specific EC2 instance type.
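
For the last point, the well-known node label node.kubernetes.io/instance-type, which the kubelet sets automatically on EKS nodes, can be targeted with a nodeSelector (or a nodeAffinity rule for more flexible matching). The pod below is illustrative:

```yaml
# Illustrative pod pinned to a specific instance type via the
# well-known node label set automatically on each node.
apiVersion: v1
kind: Pod
metadata:
  name: gpu-job               # hypothetical name
spec:
  nodeSelector:
    node.kubernetes.io/instance-type: p3.2xlarge   # example instance type
  containers:
    - name: worker
      image: public.ecr.aws/docker/library/python:3.11  # example image
      command: ["python", "-c", "print('hello')"]
```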

Operation | Principles

Security | Principles

Packer for AMI build: Packer configuration to build a custom EKS AMI