If you are running a bare-metal cluster, you probably run Kubernetes on top of some Linux OS, and these systems have to be updated regularly. An update sometimes means that you have to reboot your servers, and during a reboot that particular node is not available to schedule workloads.
More importantly, you should cordon a node before rebooting it: cordoning marks the node unschedulable and thus ensures that workloads are scheduled on the remaining un-cordoned nodes.
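As a quick reminder, this is how a single node would be cordoned and un-cordoned with plain `kubectl` (the node name is just an example):

```sh
# Mark the node unschedulable before the reboot
kubectl cordon worker01.dev.example.com

# ...update and reboot the node...

# Make the node schedulable again once it is back
kubectl uncordon worker01.dev.example.com
```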
Current state
Even though we use [Saltstack] to manage our nodes and [Rancher] for the cluster, we still have manual steps. For example, we still manually cordon and un-cordon nodes; with 7 clusters and dozens of nodes that is cumbersome. As we cannot update (or rather reboot) all servers at the same time, we have added a custom grain called `orch_seq` which allows us to run the upgrade in a certain sequence, ensuring that enough worker nodes stay available. The `orch_seq` grain is a number from 1 to 7, and the following sequences can be done together:
- 1 (`x`) and 4 (`y`)
- 2 (`x`) and 5 (`y`)
- 3 (`x`) and 6 (`y`)
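As a sketch, assigning and checking such a grain could look like this (this assumes the standard Salt `grains` execution module; the minion name is hypothetical):

```sh
# Assign the orchestration sequence grain to one node (minion name is hypothetical)
sudo salt 'worker01.dev.example.com' grains.setval orch_seq 1

# Verify the grain across all minions
sudo salt '*' grains.get orch_seq
```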
Then we do this cluster by cluster using the grain named `context`. Usually we start with the `dev` cluster, which is the least sensitive one, and then work our way up to the production clusters:
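For illustration, a minimal sketch of selecting only the nodes of one cluster via that grain (the `dev` value follows the naming above; the exact grain values in our setup may differ):

```sh
# List only the minions belonging to the dev cluster
sudo salt -C 'G@context:dev' test.ping
```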
- Drain the nodes of the sequence `x` and `y` which are `k8s_node` and `rancher`:
  - Check which roles the nodes have: `sudo salt -C '( G@orch_seq:x or G@orch_seq:y )' grains.get roles`
  - Drain all nodes listed which are `k8s_node` and `rancher` (see the drain sketch after this list)
Upgrade the nodes of the sequence
x
andy
using saltsudo salt -C '( ( [email protected]_seq:x or [email protected]_seq:y ) [email protected]:xxxxx)' pkg.upgrade
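A minimal sketch of what the drain step could look like per node, assuming plain `kubectl` (the node name and the flags are assumptions, not part of the original commands):

```sh
# Evict workloads from one node of the current sequence (node name is hypothetical)
kubectl drain worker01.dev.example.com --ignore-daemonsets --delete-emptydir-data
```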
We could run `pkg.upgrade` for all nodes in parallel, but we deliberately do this in sequence, because if something goes wrong with the package upgrade, we still have enough nodes available.
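Putting the pieces together, a hedged sketch of how the sequence pairs could be iterated for one cluster (the script, its variable names, and the reboot handling are assumptions; only the grain names and `pkg.upgrade` come from the steps above):

```sh
#!/usr/bin/env bash
# Hypothetical wrapper: upgrade one sequence pair at a time.
# Grain names (orch_seq, context) are from this post; everything
# else in this script is a sketch, not our actual tooling.
CONTEXT="dev"  # start with the least sensitive cluster

for PAIR in "1 4" "2 5" "3 6"; do
  read -r X Y <<< "$PAIR"
  TARGET="( ( G@orch_seq:${X} or G@orch_seq:${Y} ) and G@context:${CONTEXT} )"

  # Upgrade only this pair, so the other sequences keep serving workloads
  sudo salt -C "${TARGET}" pkg.upgrade

  # ...cordon/drain beforehand and reboot/un-cordon afterwards would go here...
done
```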