EKS Best Practices
Architecture
- Think about multi-tenancy, isolation for different environments or different workloads
- Account-level isolation using AWS Organizations
- Network-layer isolation, i.e. different VPC and different cluster
- Use different node groups (node pools) for different purposes/categories; for example, create dedicated node groups for operational tools such as CI/CD, monitoring, and centralized logging systems.
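A dedicated node group can be kept clean with labels and taints so only the intended workloads land there. Below is a minimal eksctl sketch; the cluster name, node group name, and taint key/value are illustrative assumptions, not prescribed names:

```yaml
# eksctl ClusterConfig fragment (sketch; names are placeholders)
apiVersion: eksctl.io/v1alpha5
kind: ClusterConfig
metadata:
  name: demo-cluster          # hypothetical cluster name
  region: us-east-1
managedNodeGroups:
  - name: ops-tools           # pool reserved for CI/CD, monitoring, logging
    instanceType: m5.large
    labels:
      workload-class: ops     # target with nodeSelector/affinity
    taints:
      - key: dedicated
        value: ops
        effect: NoSchedule    # only pods tolerating this taint schedule here
```

Operational pods then carry a matching toleration plus a nodeSelector on `workload-class: ops`.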
- Separate namespace for different workloads
Reliability | Principles
- It is recommended to use a dedicated VPC for EKS
- Modular and Scalable Amazon EKS Architecture
- Plan your VPC and subnet CIDR, avoid the complexity of using multiple CIDRs in a VPC and CNI custom networking
- Understand and verify EKS/Fargate Service Quotas and other related services
- Implement the Cluster Autoscaler to automatically adjust the size of an EKS cluster up and down based on scheduling demands.
- Consider the number of worker nodes and service degradation if there is node/AZ failure.
- Take care of RTO.
- Consider having a buffer node.
- Consider not choosing a very large instance type to reduce the blast radius.
- Enable Horizontal Pod Autoscaler to use CPU utilization or custom metrics to scale pods.
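A Horizontal Pod Autoscaler targeting CPU utilization can be declared as below. This is a sketch: the Deployment name `web`, the replica bounds, and the 60% target are illustrative assumptions (the `autoscaling/v2` API is available on Kubernetes 1.23+; older clusters use `autoscaling/v2beta2`):

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: web-hpa               # illustrative name
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: web                 # hypothetical Deployment to scale
  minReplicas: 2
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 60   # scale out above ~60% average CPU
```

Pods must declare CPU requests for utilization-based scaling to work.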
- Use infrastructure as code (Kubernetes manifest files and templates to provision EKS clusters/nodes, etc.)
- Use multiple AZs. Distribute application replicas across different worker node availability zones for redundancy
- Be careful with persistent pods that use EBS as a PersistentVolume: an EBS volume is bound to a single AZ, so the pod must be scheduled into that same AZ. The scheduler uses the topology label, for example,
topology.kubernetes.io/zone=us-east-1c
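One common way to keep EBS-backed pods and their volumes in the same AZ is to delay volume provisioning until the pod is scheduled. A sketch using the EBS CSI driver (the StorageClass name and `gp3` type are assumptions):

```yaml
# Sketch: WaitForFirstConsumer makes the EBS volume get created in the
# AZ where the pod is actually scheduled, avoiding zone mismatches
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: ebs-gp3               # illustrative name
provisioner: ebs.csi.aws.com
parameters:
  type: gp3
volumeBindingMode: WaitForFirstConsumer
```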
- Highly available and scalable worker nodes using Auto Scaling groups, use node groups
- Consider using Managed Node Groups for easy setup and high availability of nodes during updates or termination
- Consider using Fargate to avoid having to manage worker nodes. But be aware of Fargate limitations.
- Consider separating node groups for your application and utility functions, for example: log database, service mesh control plane
- Deploy aws-node-termination-handler. It detects if the node will become unavailable/terminated, such as Spot Interruption, then ensures no new work is scheduled there and then drains it, removing any existing work. Tutorial | Announcement
- Configure Pod Disruption Budgets (PDBs) to limit the number of pods of a replicated application that are down simultaneously from voluntary disruptions, for example, during updates, continuous deployment, and other use cases.
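A PodDisruptionBudget for a replicated application can be sketched as follows; the name and label selector are illustrative assumptions (`policy/v1` is GA since Kubernetes 1.21; older clusters use `policy/v1beta1`):

```yaml
# Sketch: keep at least 2 replicas of "web" running during
# voluntary disruptions such as node drains and rolling updates
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: web-pdb               # illustrative name
spec:
  minAvailable: 2
  selector:
    matchLabels:
      app: web                # hypothetical app label
```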
- Use AWS Backup to back up EFS and EBS
- Use EFS for storage class: using EFS does not require pre-provisioning capacity and allows more efficient pod migrations between worker nodes (removing node-attached storage)
- Install Node Problem Detector to provide actionable data to heal clusters.
- Avoid configuration errors, such as overly strict (anti-)affinity rules that prevent a pod from being rescheduled after a node failure.
- Use liveness and readiness probes
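Probes can be declared per container, as in the sketch below. The image, ports, endpoint paths (`/healthz`, `/ready`), and timings are all assumptions about the application, not fixed values:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: web                   # illustrative name
spec:
  containers:
    - name: web
      image: nginx:1.25       # placeholder image
      livenessProbe:          # restart the container if this fails
        httpGet:
          path: /healthz      # assumed health endpoint
          port: 80
        initialDelaySeconds: 10
        periodSeconds: 15
      readinessProbe:         # remove pod from Service endpoints if this fails
        httpGet:
          path: /ready        # assumed readiness endpoint
          port: 80
        periodSeconds: 5
```

Liveness restarts an unhealthy container; readiness only gates traffic, so a temporarily overloaded pod is not killed.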
- Practice chaos engineering, use available tools to automate.
- Kill pods randomly during testing
- Implement failure management at the microservice level, for example, circuit breaker pattern, control and limit retry calls (exponential backoff), throttling, make services stateless whenever possible
- Practice how to upgrade the cluster and worker nodes to the new version.
- Practice how to drain worker nodes.
- Practice chaos engineering
- Use CI/CD tools, automate and have process flow (approval/review) for infrastructure changes. Consider implementing GitOps.
- Use a multi-AZ solution for persistent volumes, for example, Thanos + S3 for Prometheus
Performance Efficiency | Principles
- Notify AWS Support in advance if you expect a sudden load increase, so the control plane (API servers and etcd) can be pre-scaled
- Choose the correct EC2 instance type for your worker node.
- Understand the pros and cons of using many small node instances or few large node instances. Consider OS overhead, time required to pull the image on a new instance when it scales, kubelet overhead, system pod overhead, etc.
- Understand pod density limitation (maximum number of pods supported by each instance type)
- Use single-AZ node groups if necessary. Running a microservice across multiple AZs is normally a best practice for availability, but for some workloads (such as Spark) with latency-sensitive, high-volume network I/O between pods, it is recommended to use a single-AZ node group.
- Understand the Fargate performance limitation. Do load testing before going to production.
- Make sure your pods request the resources they need. Set resource request and limit values for CPU and memory
- Detect bottlenecks/latency in a microservice with X-Ray or other tracing/APM products
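Requests and limits are set per container; the sketch below uses illustrative values that should be tuned from observed consumption:

```yaml
# Container spec fragment (values are illustrative assumptions)
apiVersion: v1
kind: Pod
metadata:
  name: web                   # illustrative name
spec:
  containers:
    - name: app
      image: my-app:latest    # placeholder image
      resources:
        requests:             # used by the scheduler for placement
          cpu: "250m"
          memory: "256Mi"
        limits:               # hard caps; exceeding memory gets the pod OOM-killed
          cpu: "500m"
          memory: "512Mi"
```

Requests drive scheduling and autoscaling math; limits bound worst-case consumption on the node.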
- Choose the right storage backend. Use Amazon FSx for Lustre and its CSI Driver if your persistent container needs a high-performance file system
- Monitor pod and node resource consumption and technology bottleneck. You can use CloudWatch, CloudWatch Container Insight or other products
- If necessary, launch instances (worker nodes) in a cluster placement group to take advantage of low-latency, high-throughput networking. You can use this CloudFormation template to add new node groups with non-blocking, non-oversubscribed, full bisection-bandwidth connectivity.
- If necessary, configure the Kubernetes CPU management policy as 'static' for some pods that need exclusive CPUs
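The static policy is enabled at the kubelet level, and only Guaranteed-QoS pods with whole-number CPU requests receive exclusive cores. A sketch (pod name, image, and CPU counts are assumptions):

```yaml
# Kubelet config fragment: enable the static CPU manager policy
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
cpuManagerPolicy: static
---
# Pod eligible for CPU pinning: Guaranteed QoS with integer CPU counts
apiVersion: v1
kind: Pod
metadata:
  name: pinned                # illustrative name
spec:
  containers:
    - name: app
      image: my-app:latest    # placeholder image
      resources:
        requests:
          cpu: "2"            # whole-number CPUs required for exclusivity
          memory: "1Gi"
        limits:
          cpu: "2"            # limits must equal requests (Guaranteed QoS)
          memory: "1Gi"
```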
Cost Optimization
- Minimize wasted (unused) resources when using EC2 as worker node.
- Choose the correct EC2 instance type and use cluster autoscaling.
- Consider using Fargate
- Consider using a tool like kube-resource-report to visualize slack cost and properly size requests for containers in a pod.
- Use spot instances or mix on-demand and spot using Spot Fleet. Consider using spot instances for test/staging environment.
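Besides Spot Fleet, eksctl managed node groups can run on Spot capacity directly. A sketch (node group name, instance types, and sizes are illustrative; diversifying instance types reduces interruption risk):

```yaml
# eksctl ClusterConfig fragment (sketch)
managedNodeGroups:
  - name: spot-workers        # illustrative name
    spot: true                # request Spot capacity
    instanceTypes: ["m5.large", "m5a.large", "m4.large"]
    minSize: 1
    maxSize: 10
```

Pair Spot node groups with aws-node-termination-handler (see Reliability above) so interrupted nodes are drained gracefully.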
- Use reserved instance or savings plans
- Use single-AZ node groups for workload with high network I/O operations (for example, Spark) to reduce communication between AZ. But please validate that Single-AZ execution would not compromise your system's availability.
- Consider managed services for support tools, such as monitoring, service mesh, centralized logging, to reduce your team's effort and cost
- Tag all AWS resources when possible and use labels to tag Kubernetes resources so you can easily analyze cost.
- Consider using self-managed Kubernetes (not EKS) for clusters without HA requirements. You can use kops to configure a small k8s cluster.
- Use node affinity or nodeSelector for pods that require a specific EC2 instance type.
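The kubelet publishes the instance type as a well-known node label, which a nodeSelector can match directly. A sketch (pod name, image, and instance type are assumptions):

```yaml
# Sketch: pin a pod to nodes of a specific instance type
apiVersion: v1
kind: Pod
metadata:
  name: gpu-job               # illustrative name
spec:
  nodeSelector:
    node.kubernetes.io/instance-type: p3.2xlarge   # well-known node label
  containers:
    - name: app
      image: my-app:latest    # placeholder image
```

Older clusters expose the same information under the deprecated label `beta.kubernetes.io/instance-type`.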
Operation: Principles
- Use an IaC tool, such as eksctl, Terraform, or AWS CloudFormation, to provision the EKS cluster
- Consider using package manager like Helm to help you install and manage applications.
- Automate cluster management and application deployment using GitOps. You can use tools like Flux or others
- Use CI/CD tools
- Practice doing EKS upgrade (rolling update), create the runbook.
- Monitoring
- Understand the health of your workload. Define KPI/SLO and metrics/SLI and then monitor through your dashboard and configure alerts
- Understand your Operational Health. Define KPI and metrics such as mean time to detect an incident (MTTD) and mean time to recovery (MTTR) from an incident.
- Use Container Insights for EKS for detailed monitoring of service and pod performance. It also provides diagnostic information; consider viewing additional metrics and additional levels of granularity when a problem occurs.
- Monitor control plane metrics using Prometheus
- Monitoring using Prometheus & Grafana
- Logging
- Consider DaemonSet vs Sidecar mechanism. DaemonSet is preferable for EC2 worker nodes, but you need to use the Sidecar pattern for Fargate.
- Control plane logging
- You can use EFK stack or FluentBit, Kinesis Data Firehose, S3 and Athena
- Tracing
- Monitor fine-grained transactions using X-Ray (see eksworkshop.com). It is also useful for monitoring blue-green deployments. Other tools are available
- Practice Chaos Engineering, you can automate using some tools
- Configuration
- Appmesh + EKS demo / lab: GitHub - PaulMaddox/aws-appmesh-helm: AWS App Mesh ❤ K8s
- AWS Cloud Map: Easily create and maintain custom maps of your applications | AWS News Blog
- AWS CloudMap + Consul:
Security | Principles
- Understand the shared responsibility model for different EKS operating modes (self-managed nodes, managed node groups, Fargate)
- AWS security best practices for EKS
- Integrating security into your container pipeline | workshop
- Use CNI custom networking if your pods need a different security group from their nodes, or if pods must be placed in private subnets while the nodes are in public subnets.
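With CNI custom networking, the VPC CNI plugin reads a per-AZ `ENIConfig` resource that tells it which subnet and security groups to use for pod ENIs. A sketch (the subnet and security group IDs are placeholders):

```yaml
# Sketch: one ENIConfig per AZ, named after the AZ, directing pod ENIs
# into a private subnet with their own security group
apiVersion: crd.k8s.amazonaws.com/v1alpha1
kind: ENIConfig
metadata:
  name: us-east-1a                      # conventionally named after the AZ
spec:
  subnet: subnet-0123456789abcdef0      # placeholder private subnet ID
  securityGroups:
    - sg-0123456789abcdef0              # placeholder security group ID
```

Custom networking also requires setting `AWS_VPC_K8S_CNI_CUSTOM_NETWORK_CFG=true` on the aws-node DaemonSet.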
- EKS CloudTrail API Log
- Consider enabling continuous delivery of CloudTrail events to an Amazon S3 bucket
- Use network policy for East-West traffic: Calico
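Calico enforces standard Kubernetes NetworkPolicy objects; a common starting point is default-deny ingress per namespace, then explicit allow rules. A sketch (the namespace name is an assumption):

```yaml
# Sketch: deny all ingress traffic to pods in the "prod" namespace
# unless another NetworkPolicy explicitly allows it
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-ingress
  namespace: prod             # illustrative namespace
spec:
  podSelector: {}             # empty selector = all pods in the namespace
  policyTypes:
    - Ingress
```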
- Use security groups for pods (requires Kubernetes 1.17 or later on EKS). See some considerations
- Introducing fine-grained IAM roles for service accounts | AWS Open Source Blog
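With IAM roles for service accounts (IRSA), an IAM role is bound to a Kubernetes service account via an annotation, so only pods using that service account receive the role's permissions. A sketch (the account ID, role name, and service account name are placeholders):

```yaml
# Sketch: service account annotated with an IAM role ARN; pods using it
# get scoped AWS credentials via the EKS OIDC provider
apiVersion: v1
kind: ServiceAccount
metadata:
  name: s3-reader             # illustrative name
  namespace: default
  annotations:
    eks.amazonaws.com/role-arn: arn:aws:iam::111122223333:role/s3-reader-role  # placeholder ARN
```

This avoids granting broad permissions to the node instance role.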
Packer for AMI build: Packer configuration to build a custom EKS AMI