Kubernetes Upgrade v1.23 to v1.24 – Common Errors and Solutions

Contrary to the common notion of “if it ain’t broke, don’t fix it”, I decided to upgrade my lab Kubernetes cluster from v1.23 to v1.24, just for fun. Since it was a lab, it was easy to decide and simply go ahead, but for production you of course need to decide whether the upgrade is really required, and if yes, what your backup plans are. You need to strategize: do you have a standby cluster, have all applications and microservices been backed up, and can your control plane be rolled back? For the record, all my pods restarted during the upgrade and came back up with different IPs, but I didn’t touch anything in my cluster other than checking continuously that all pods were up. (Use the “watch” command for that; please stop banging Up Arrow and Enter.)
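For that continuous check, something like the following saves the Up-Arrow-and-Enter routine (assuming kubectl is configured on the node you are watching from):

```shell
# Refresh the pod list across all namespaces every 2 seconds (watch's default interval)
watch kubectl get pods --all-namespaces -o wide
```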

Has Docker’s support in Kubernetes really ended now?

One of the common doubts with the Kubernetes v1.24 release is: is Docker really not supported anymore? Well, it depends on what you mean when you say “Docker”. For noobs like me: do you mean Docker containers, because Docker means containers, right? Then NO. For amateur engineers, again like me: the Docker runtime engine? Then a partial YES. Kubernetes v1.24 has stopped shipping dockershim, the adapter that let the kubelet talk to the Docker Engine as if it were a CRI runtime. So now, for the kubelet to create a container for your Pod on request from the control plane, it must be able to communicate with containerd directly over the CRI, or Container Runtime Interface.
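You can see which runtime each node is actually using before and after the switch; the CONTAINER-RUNTIME column makes it obvious whether the kubelet is going through dockershim or talking to containerd directly (a quick check, assuming kubectl access to the cluster):

```shell
# CONTAINER-RUNTIME shows e.g. docker://20.10.x (dockershim) or containerd://1.6.x (direct CRI)
kubectl get nodes -o wide
```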

It is worth noting that containerd was part of the Docker installation anyway, and was the actual behind-the-scenes service managing the lifecycle of the containers. Remember the steps of Docker installation in the official guide:

sudo apt-get install docker-ce docker-ce-cli containerd.io docker-compose-plugin

Hence, in essence, we are effectively removing a middleman between the kubelet and containerd. Now, let me share my upgrade steps and the errors I encountered, in the order they happened.

Do note that these steps, if followed, will avoid re-initializing or resetting your cluster, which many solutions you find online will tell you to do.

Error # 1

I encountered the error below when I upgraded my lab without considering the existing dockershim sitting between the kubelet and containerd. Or, should I say, I really wanted to see what was going to happen, so I upgraded my cluster using the steps in the official Kubernetes documentation, with Docker still intact on my master and worker nodes.

[ERROR CRI]: container runtime is not running: output: time="2022-08-22T02:06:28Z" level=fatal msg="getting status of runtime: rpc error: code = Unimplemented desc = unknown service runtime.v1alpha2.RuntimeService"

So the error clearly shows a container runtime is needed, yet containerd is active; the likely scenario is that the kubelet cannot see containerd, because containerd’s CRI plugin is disabled. Hence, we need to enable the container runtime interface (CRI) plugin for containerd, reload and restart that service, and then restart the kubelet so it comes up and connects to containerd. This is covered in Step 1 of my upgrade plan below.

Upgrade Steps

Here are the upgrade steps for the master node that I ran in my cluster. Step 1 is the solution for Error #1 mentioned above.

Step: 1 Enable the CRI plugin, to make containerd communicate with the kubelet directly.

root@ubmaster:# vi /etc/containerd/config.toml
# Removed "cri" from disabled_plugins
#disabled_plugins = ["cri"]
disabled_plugins = [""]
root@ubmaster:# systemctl daemon-reload
root@ubmaster:# systemctl restart containerd
root@ubmaster:# systemctl status containerd

root@ubmaster:# systemctl restart kubelet
root@ubmaster:# systemctl status kubelet

The above step will restart your containers and needs to be performed on all the nodes in the cluster.
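One way to confirm the CRI plugin is actually reachable after the restart is crictl, which ships with the cri-tools package (a sketch assuming containerd’s default socket path):

```shell
# Ask containerd for its runtime status over the CRI socket;
# output containing "RuntimeReady" with status true means the kubelet will be able to connect
crictl --runtime-endpoint unix:///run/containerd/containerd.sock info
```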

Step: 2 Kubeadm Upgrade (Non-Service-Affecting for Pods)

root@ubmaster:# apt-mark unhold kubeadm && apt-get update
root@ubmaster:# apt-cache madison kubeadm
root@ubmaster:# apt-get install -y kubeadm=1.24.4-00
root@ubmaster:# apt-mark hold kubeadm 
root@ubmaster:# kubeadm upgrade plan
root@ubmaster:# kubeadm upgrade apply v1.24.4
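After the apply finishes, it is worth confirming the control plane is actually reporting the new version before touching the kubelet (a quick sanity check, assuming the usual admin kubeconfig):

```shell
# Node VERSION still shows the old kubelet at this point; the server version should be v1.24.4
kubectl version --short
kubectl get nodes
```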

Step: 3 Kubelet and Kubectl Upgrade (Service-Affecting for Pods)

root@ubmaster:# kubectl drain ubmaster  --ignore-daemonsets
root@ubmaster:# apt-mark unhold kubelet kubectl
root@ubmaster:# apt-get update && apt-get install -y kubelet=1.24.4-00 kubectl=1.24.4-00 && apt-mark hold kubelet kubectl
root@ubmaster:# sudo systemctl daemon-reload
root@ubmaster:# sudo systemctl restart kubelet
root@ubmaster:# sudo systemctl status kubelet
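The node was drained in the first command of this step, so once the kubelet is healthy again, remember to make the node schedulable once more; this is easy to miss (node name as in the example above):

```shell
# Allow pods to be scheduled on the node again after the upgrade
kubectl uncordon ubmaster
kubectl get nodes   # STATUS should return to Ready, without SchedulingDisabled
```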

Error # 2

It is not a highway ride once you upgrade your cluster, as the kubelet service will not come up right away. Below is the error you will encounter, because of which your node will be in the NotReady state even though all packages are upgraded. Step 4 is the solution for this error.

root@ubmaster:# sudo systemctl status kubelet
● kubelet.service - kubelet: The Kubernetes Node Agent
     Loaded: loaded (/lib/systemd/system/kubelet.service; enabled; vendor preset: enabled)
    Drop-In: /etc/systemd/system/kubelet.service.d
     Active: activating (auto-restart) (Result: exit-code) since Sat 2022-08-20 14:04:48 UTC; 2s ago

Aug 20 14:04:48 ubmaster kubelet[152392]: Error: failed to parse kubelet flag: unknown flag: --network-plugin

Error # 3

There are a few blogs that suggest removing only “--network-plugin=cni” from the kubeadm-flags.env file in the kubelet directory. But that alone is not enough: you will get the error below and the kubelet will remain down, because it still does not know which runtime endpoint to talk to.

root@ubmaster:# journalctl -xeu kubelet

W0820 14:15:31.786693  154117 clientconn.go:1331] [core] grpc: addrConn.createTransport failed to connect to {  <nil> 0 <nil>}. Err: connection error: desc = "transport: Error while dialing dial unix: missing address". Reconnecting...
Error: failed to run Kubelet: unable to determine runtime API version: rpc error: code = Unavailable desc = connection error: desc = "transport: Error while dialing dial unix: missing address"

Step: 4 Starting Kubelet service after failure

To resolve both Error 2 and Error 3, we need to comment out the existing environment variable used by the kubelet service and add the updated runtime parameters.

root@ubmaster:# vi /var/lib/kubelet/kubeadm-flags.env
#KUBELET_KUBEADM_ARGS="--network-plugin=cni --pod-infra-container-image=k8s.gcr.io/pause:3.6"
KUBELET_KUBEADM_ARGS="--container-runtime=remote --container-runtime-endpoint=unix:///var/run/containerd/containerd.sock --pod-infra-container-image=k8s.gcr.io/pause:3.6"
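After editing the env file, the kubelet needs a restart to pick up the new flags; roughly:

```shell
sudo systemctl daemon-reload
sudo systemctl restart kubelet
systemctl is-active kubelet   # should print "active" once it settles
```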

All services should be up now; wait for the pods to come up as well. Repeat the same steps on all nodes, masters and workers alike. That’s all, folks. Enjoy.

The session logs for both nodes are in my GitHub repo: https://github.com/dmagnate/k8upgrade

