Control Plane Failure Verification Sequence

  1. Check the state of the nodes in the cluster and confirm that all of them are Ready by running kubectl get nodes. If the command doesn't respond, the problem is most likely with kube-apiserver. In that case, you'll need to SSH into one of the cluster's master nodes to investigate.

    1.1. Check the state of the pods in the kube-system namespace with the command kubectl get pods -n kube-system. This shows whether the control plane pods are running. If those pods aren't present, the cluster was likely not deployed with kubeadm.

    1.2. If the cluster wasn't deployed with kubeadm, the control plane components were probably configured as services on the operating system. Operating systems manage processes differently; most use systemd nowadays, and you can confirm which init system is running as PID 1 with the command ps -p 1. On master nodes, check the status with systemctl status kube-apiserver, systemctl status kube-controller-manager, and systemctl status kube-scheduler. On worker nodes, use systemctl status kubelet and systemctl status kube-proxy.

    1.3. Check the logs of the control plane components. If they're running as pods, use kubectl logs pod-name -n kube-system on each of them. If they're services, use journalctl -u service-name.

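    1.4. Check that the certificates are still valid and that the paths configured for each component point to them. If the components run as pods, also check the volume mount points in their manifests.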
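
The sequence above can be sketched as a small set of shell helpers. This is only an illustration under a few assumptions: the function names (units_for_role, check_api, check_units, cert_end_date) are made up for this example, the unit names are the ones listed in step 1.2 and may differ on your cluster, and the 10-second timeout and the 20 log lines are arbitrary choices.

```shell
#!/usr/bin/env bash
# Illustrative helpers for the verification sequence; all function names
# here are hypothetical and not part of kubectl, systemd, or kubeadm.

# Map a node role to the service units to inspect (step 1.2).
# Unit names are assumptions; adjust them to match your installation.
units_for_role() {
  case "$1" in
    control-plane) echo "kube-apiserver kube-controller-manager kube-scheduler" ;;
    worker)        echo "kubelet kube-proxy" ;;
    *)             echo "unknown role: $1" >&2; return 1 ;;
  esac
}

# Steps 1 and 1.1: query the API server, giving up after a timeout
# so that an unresponsive kube-apiserver doesn't hang the check.
check_api() {
  kubectl get nodes --request-timeout=10s || return 1
  kubectl get pods -n kube-system --request-timeout=10s
}

# Steps 1.2 and 1.3: run on the node itself. Confirms PID 1 is systemd,
# then reports each unit's state and prints recent logs for inactive ones.
check_units() {
  local role="$1" units unit
  units="$(units_for_role "$role")" || return 1
  if [ "$(ps -p 1 -o comm=)" != "systemd" ]; then
    echo "PID 1 is not systemd; use this system's own service manager" >&2
    return 1
  fi
  for unit in $units; do
    if systemctl is-active --quiet "$unit"; then
      echo "$unit: active"
    else
      echo "$unit: NOT active"
      journalctl -u "$unit" --no-pager -n 20
    fi
  done
}

# Step 1.4: print the expiry date of a certificate file given its path.
cert_end_date() {
  openssl x509 -in "$1" -noout -enddate
}
```

After sourcing the file, you might run check_api from a workstation, then check_units control-plane on a master node or check_units worker on a worker node. cert_end_date takes whatever certificate path your components are configured with; no default path is assumed here.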