Tanzu Kubernetes Grid – Upgrading a Management and Workload Cluster deployed to vSphere

In this blog post, I am going to walk through how to upgrade both your Tanzu Kubernetes Grid Management and Workload clusters. I’ll cover the Tanzu CLI options, as well as how you can use Tanzu Mission Control for upgrades.

For my example use cases, I’ll be upgrading from TKG 1.4.2 to 1.5.4. Although the process should be similar for other upgrade paths, I do recommend you consult the official documentation before attempting any upgrade in case there are any changes.

Caution: VMware recommends not installing or upgrading to Tanzu Kubernetes Grid v1.5.0-v1.5.3, due to a bug in the versions of etcd in the versions of Kubernetes used by Tanzu Kubernetes Grid v1.5.0-v1.5.3. Tanzu Kubernetes Grid v1.5.4 resolves this problem by incorporating a fixed version of etcd. For more information, see Resolved Issues in the TKG v1.5 Release Notes.

Pre-requisites

To upgrade Tanzu Kubernetes Grid (TKG), you download and install the new version of the Tanzu CLI on the machine that you use as the bootstrap machine. You must also download and install base image templates and VMs, depending on whether you are upgrading clusters that you previously deployed to vSphere, Amazon EC2, or Azure.

Download the Tanzu CLI and Kubernetes OVAs

On the VMware Customer Portal, download the Tanzu CLI and the OVA files you need.

As highlighted in the screenshot below, your Management Cluster will always need to run the latest Kubernetes version.

Tanzu Kubernetes Grid - Upgrade - Download Product files - Tanzu CLI - Kubernetes OVAS

Upload Kubernetes OVAs to vCenter

Now upload your Kubernetes OVAs to vCenter and convert them to templates. These steps should be well known, so I’ll leave just the screenshots for you to follow. (If you need the steps written out, drop me a comment on this post.)

Tanzu Kubernetes Grid - Upgrade - Deploy OVF Template to vCenter

Tanzu Kubernetes Grid - Upgrade - Deploy OVF Template to vCenter - Select an OVF Template

Tanzu Kubernetes Grid - Upgrade - Deploy OVF Template to vCenter - Review

Tanzu Kubernetes Grid - Upgrade - Deploy OVF Template to vCenter - Ready to complete

Tanzu Kubernetes Grid - Upgrade - Deploy OVF Template to vCenter - Convert to Template

Update the Tanzu CLI

Now we will upgrade our Tanzu CLI (or TKG CLI if you are on version 1.3 or lower). In this blog post, I am using an Ubuntu jump-host. If you need to use Mac OS X or Windows, you can follow the steps in the official documentation.

First we need to remove the “tkg-compatibility.yaml” file. If this is not done, the manifest will keep your versions at the same levels when you try to upgrade.

rm ~/.config/tanzu/tkg/compatibility/tkg-compatibility.yaml

Tanzu Kubernetes Grid - Upgrade - update the Tanzu CLI - remove tkg-compatibility.yaml

Unpack your Tanzu CLI tar file.

tar -zxvf tanzu-cli-bundle-linux-amd64.tar.gz

Tanzu Kubernetes Grid - Upgrade - update the Tanzu CLI - tar -zxvf tanzu-cli-bundle-linux-amd64.tar.gz

Move into the CLI folder and install the Tanzu CLI tool.

cd cli

sudo install core/v0.11.6/tanzu-core-linux_amd64 /usr/local/bin/tanzu

Tanzu Kubernetes Grid - Upgrade - update the Tanzu CLI - core/v0.11.6/tanzu-core-linux_amd64 /usr/local/bin/tanzu

Now run the initialization command which will install the necessary plugins.

tanzu init

Tanzu Kubernetes Grid - Upgrade - update the Tanzu CLI - Tanzu init

Check that your version shows “v0.11.6”:

tanzu version

Tanzu Kubernetes Grid - Upgrade - update the Tanzu CLI - Tanzu version
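If you’d rather script this check than eyeball it, here is a minimal sketch. It assumes the output of tanzu version contains a line of the form “version: v0.11.6”; the function name is my own and is not part of the Tanzu CLI.

```shell
# Hypothetical helper (not part of the Tanzu CLI): reads `tanzu version`
# output on stdin and succeeds only if the reported version equals $1.
# Assumes the output contains a line of the form "version: v0.11.6".
tanzu_version_matches() {
  expected="$1"
  actual=$(awk '$1 == "version:" { print $2; exit }')
  [ "$actual" = "$expected" ]
}

# Usage:
#   tanzu version | tanzu_version_matches v0.11.6 && echo "CLI is at v0.11.6"
```

If the output format differs in your CLI release, adjust the awk match accordingly.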

For kubectl I’m not showing the install, but if you are grabbing the version from the VMware Download pages, you can follow the steps here.

In my environment, I use a later version of kubectl, as I manage lots of different environments. Therefore, I follow the official installation guide from Kubernetes.io.

Installing the Carvel Tools

Next up we need to install/update the Carvel Tools. Once again, if you are doing this on an OS other than Linux, you can use the official docs link above for the steps.

  • Carvel provides a set of reliable, single purpose, composable tools that aid in your application building, configuration, and deployment to Kubernetes.

If you are not in your CLI folder already, move into this folder, then run all the commands below.

cd cli

gunzip ytt-linux-amd64-v0.37.0+vmware.1.gz
sudo chmod ugo+x ytt-linux-amd64-v0.37.0+vmware.1
sudo mv ./ytt-linux-amd64-v0.37.0+vmware.1 /usr/local/bin/ytt

gunzip kapp-linux-amd64-v0.42.0+vmware.2.gz
sudo chmod ugo+x kapp-linux-amd64-v0.42.0+vmware.2
sudo mv ./kapp-linux-amd64-v0.42.0+vmware.2 /usr/local/bin/kapp

gunzip kbld-linux-amd64-v0.31.0+vmware.1.gz
sudo chmod ugo+x kbld-linux-amd64-v0.31.0+vmware.1
sudo mv ./kbld-linux-amd64-v0.31.0+vmware.1 /usr/local/bin/kbld

gunzip imgpkg-linux-amd64-v0.22.0+vmware.1.gz
sudo chmod ugo+x imgpkg-linux-amd64-v0.22.0+vmware.1
sudo mv ./imgpkg-linux-amd64-v0.22.0+vmware.1 /usr/local/bin/imgpkg

Tanzu Kubernetes Grid - Upgrade - update the Tanzu CLI - Install Carvel Tools

In the above screenshot, I also checked the versions of the kapp and imgpkg tools as a sanity check.
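The four install blocks above all follow the same pattern, so they can be collapsed into a loop. This is only a sketch: the function name and the optional destination argument are my own additions, and it assumes the archive names in your bundle start with the tool name followed by “-linux-amd64-”.

```shell
# Sketch of the same unpack-and-install steps as a loop.
# Run from inside the cli folder. The function name and the optional
# destination argument are illustrative, not part of the TKG bundle.
install_carvel_tools() {
  dest="${1:-/usr/local/bin}"
  # Only use sudo when the destination is not writable by the current user
  if [ -w "$dest" ]; then sudo_cmd=""; else sudo_cmd="sudo"; fi
  for archive in ytt-linux-amd64-*.gz kapp-linux-amd64-*.gz \
                 kbld-linux-amd64-*.gz imgpkg-linux-amd64-*.gz; do
    [ -f "$archive" ] || continue   # skip tools that are not in the folder
    tool="${archive%%-*}"           # ytt-linux-amd64-v0.37.0+vmware.1.gz -> ytt
    binary="${archive%.gz}"
    gunzip -c "$archive" > "$binary"                    # keep the original .gz
    $sudo_cmd install -m 0755 "$binary" "$dest/$tool"   # copy and mark executable
    rm -f "$binary"
  done
}

# Usage:
#   cd cli
#   install_carvel_tools
```

Because the loop matches on the file name prefix rather than a fixed version, it should keep working when the bundled tool versions change.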

Upgrading the Tanzu Kubernetes Grid Management Cluster

We are now ready to upgrade our Management Cluster.

Caution: VMware recommends not installing or upgrading to Tanzu Kubernetes Grid v1.5.0-v1.5.3, due to a bug in the versions of etcd in the versions of Kubernetes used by Tanzu Kubernetes Grid v1.5.0-v1.5.3. Tanzu Kubernetes Grid v1.5.4 resolves this problem by incorporating a fixed version of etcd. For more information, see Resolved Issues in the TKG v1.5 Release Notes.

You need to connect to your management cluster in the Tanzu CLI using the “tanzu login” command. However, other commands, such as “tanzu management-cluster get”, will fail until the upgrade is completed. The only supported commands until then are:

  • tanzu management-cluster upgrade
  • tanzu management-cluster create

If you use the unsupported commands, you will hit errors like the one in the screenshot below. This issue is also detailed in the official documentation.

Note: After you have installed the v1.5 CLI but before a management cluster has been deployed or upgraded, all context-specific CLI command groups (tanzu cluster, tanzu kubernetes-release) plus all the management-cluster plugin commands except for tanzu mc upgrade and tanzu mc create are unavailable and not included in Tanzu CLI --help output.

Tanzu Kubernetes Grid - Upgrade the Management Cluster - Tanzu CLI commands unavailable

Once you have logged in, to upgrade the Management cluster, run:

tanzu management-cluster upgrade

# Short command
tanzu mc upgrade

This will output logging information to your console up until completion.

Sometimes the upgrade may take longer than the default timeout of 30 minutes to complete. If this is the case, you can re-run the command with the --timeout argument added.

tanzu mc upgrade --timeout 45m0s
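Since the upgrade can run for a long time, it can be handy to keep a copy of the console output. Below is a small sketch using tee; the wrapper name and the log file name are my own choices, not something the Tanzu CLI provides.

```shell
# Hypothetical wrapper (my own naming): run any command, show its output on
# the console, and keep a copy in a log file for later review.
# Note: the exit status is that of tee, not of the wrapped command.
run_logged() {
  logfile="$1"
  shift
  "$@" 2>&1 | tee "$logfile"
}

# Usage:
#   run_logged mc-upgrade.log tanzu mc upgrade --timeout 45m0s
```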

Below I’ve also included some screenshots: my terminal, where I’ve run the upgrade command; my vSphere environment, where you can see an additional control plane node being deployed (this is a dev instance with a single node); the tasks in vSphere; and the Web Console of the new Control Plane node as it spins up and bootstraps itself to the cluster.

You may note my terminal window is different. I had to move to another jump host, due to an issue with my original one. Everything else is the same as above.

Tanzu Kubernetes Grid - Upgrade the Management Cluster - Tanzu CLI commands - tanzu mc upgrade

Tanzu Kubernetes Grid - Upgrade the Management Cluster - Hosts and Clusters View - New Control Plane VM

Tanzu Kubernetes Grid - Upgrade the Management Cluster - vSphere Web Console

As I was moving to TKG 1.5.4, I hit the known issue named below and resolved it by following its documented steps (apart from step 6, which isn’t needed; I’ve submitted feedback on the page). You can see my terminal commands and outputs in the screenshot that follows.

After upgrading a cluster with fewer than 3 control plane nodes, such as dev plan clusters, kapp-controller fails to reconcile its CSI package

Tanzu Kubernetes Grid - Upgrade the Management Cluster - After upgrading a cluster with fewer than 3 control plane nodes, such as dev plan clusters, kapp-controller fails to reconcile its CSI package

Upgrading the Tanzu Kubernetes Grid Workload Cluster

via CLI

Now that our management cluster is upgraded, we can upgrade our workload clusters.

Upgrading a cluster takes a single command, which upgrades it to the latest version available. I’ve also included the command to choose your version as an example.

tanzu cluster upgrade {cluster_name} --timeout 45m0s

# To specify the Kubernetes version to upgrade to
tanzu cluster upgrade {cluster_name} --tkr {tanzu_kubernetes_release_version}

Tanzu Kubernetes Grid - Upgrade the Workload Cluster - Tanzu CLI commands - tanzu cluster upgrade cluster_name --timeout 45m0s
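If you have several workload clusters, the same command can be wrapped in a loop. This is a sketch under my own assumptions: the function name, the TANZU override (handy for a dry run with echo), and the TKR variable are all hypothetical conveniences, and the cluster names in the usage lines are placeholders.

```shell
# Hypothetical helper (my own naming): upgrade each named cluster in turn,
# stopping at the first failure.
# TANZU - command to run; defaults to tanzu (set TANZU=echo for a dry run)
# TKR   - optional Tanzu Kubernetes release to pass via --tkr
upgrade_clusters() {
  for cluster in "$@"; do
    if [ -n "${TKR:-}" ]; then
      ${TANZU:-tanzu} cluster upgrade "$cluster" --tkr "$TKR" --timeout 45m0s || return 1
    else
      ${TANZU:-tanzu} cluster upgrade "$cluster" --timeout 45m0s || return 1
    fi
  done
}

# Usage (placeholder cluster names):
#   TANZU=echo upgrade_clusters dev-cluster-1 dev-cluster-2   # dry run, prints the arguments
#   upgrade_clusters dev-cluster-1 dev-cluster-2              # real run
```

Upgrading one cluster at a time keeps the load on vSphere predictable and makes it obvious which cluster failed if something goes wrong.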

If I run a second terminal window, I can get details about the cluster during the upgrade. In the below section I also show some additional commands.

tanzu cluster get {cluster_name}

Tanzu Kubernetes Grid - Upgrade the Workload Cluster - Tanzu CLI commands - tanzu cluster get cluster_name
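Rather than re-running the command by hand, you could poll it. The sketch below is a generic helper of my own (not a Tanzu CLI feature) that re-runs a command until its output contains a given string. The usage line assumes that tanzu cluster get reports a status of “running” once the upgrade has finished, so check what your own output looks like first.

```shell
# Hypothetical helper (my own naming): re-run a command every INTERVAL seconds
# until its output contains PATTERN, giving up after MAX_TRIES attempts.
# Returns 0 if the pattern was seen, 1 otherwise.
wait_for_output() {
  pattern="$1"
  interval="$2"
  max_tries="$3"
  shift 3
  try=1
  while [ "$try" -le "$max_tries" ]; do
    if "$@" 2>&1 | grep -q -- "$pattern"; then
      return 0
    fi
    try=$((try + 1))
    sleep "$interval"
  done
  return 1
}

# Usage (assumes the status reads "running" when the upgrade is done):
#   wait_for_output running 30 90 tanzu cluster get {cluster_name}
```

Bear in mind the cluster also reports as healthy before the upgrade starts, so only begin polling once the upgrade is actually under way.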

via Tanzu Mission Control

An alternative way to upgrade your workload clusters is to use Tanzu Mission Control, the fleet management SaaS service from VMware.

In the below screenshot, we can see that my cluster has a flag to identify it can be upgraded.

Tanzu Kubernetes Grid - Upgrade the Workload Cluster Using Tanzu Mission Control - TMC - Cluster Upgrade Available

Clicking into the cluster, I can again validate the current version, and have the “Upgrade” button, which will open a dialog window when clicked.

Tanzu Kubernetes Grid - Upgrade the Workload Cluster Using Tanzu Mission Control - TMC - Cluster - Upgrade

In the dialog window, you’ll be able to select from the available Kubernetes versions in your vCenter. Remember, these OVAs need to be uploaded and marked as templates to show in the drop-down.

Select as necessary and click upgrade.

Tanzu Kubernetes Grid - Upgrade the Workload Cluster Using Tanzu Mission Control - TMC - Cluster - Upgrade Cluster - Select Version

After a minute or so, your Tanzu Cluster page will show an upgrading progress bar (you might need to refresh the page).

Clicking on the events tab, you’ll see progress updates. You will need to refresh the page to see the updated events list.

Tanzu Kubernetes Grid - Upgrade the Workload Cluster Using Tanzu Mission Control - TMC - Cluster - Upgrade Cluster - Events

In your terminal window, you can further check a cluster upgrade and get more information by running the following commands:

tanzu cluster get {cluster_name} --show-all-conditions all

Tanzu Kubernetes Grid - Upgrade the Workload Cluster Using Tanzu Mission Control - TMC - Upgrade Cluster - tanzu get cluster cluster_name --show-all-conditions all

tanzu cluster get {cluster_name} --show-group-members

Tanzu Kubernetes Grid - Upgrade the Workload Cluster Using Tanzu Mission Control - TMC - Upgrade Cluster - tanzu get cluster cluster_name --show-group-members

The final screenshot is once the upgrade process is completed, so you can see all the events that are typically listed, as well as my overview page now showing the updated Kubernetes version.

Tanzu Kubernetes Grid - Upgrade the Workload Cluster Using Tanzu Mission Control - TMC - Cluster - Upgrade Cluster - Events - Success

Troubleshooting an upgrade

There are three areas to check during the upgrade process, each of which provides logs to help you understand what is happening:

  • Tanzu Kubernetes Grid Management Cluster
    • CAPI
    • CAPA/V/Z
      • Depending on platform:
        • CAPA – Cluster API Provider for AWS
        • CAPV – Cluster API Provider for vSphere
        • CAPZ – Cluster API Provider for Azure
  • Kubernetes node itself

To get the logs from the management cluster:

# Find the pod name in the CAPI and CAPV namespaces
kubectl get pods -n capi-system
kubectl get pods -n capv-system

# Retrieve the logs and output to a file for CAPI
kubectl logs capi-controller-manager-{id} -n capi-system > capi.log

## For example

kubectl logs capi-controller-manager-65c5769c4c-8bgkd -n capi-system > capi.log

# Retrieve the logs and output to a file for CAPV
kubectl logs capv-controller-manager-{id} -n capv-system > capv.log

## For example

kubectl logs capv-controller-manager-75bdbfb7dc-bbkdh -n capv-system > capv.log
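To skip the pod name lookup, kubectl can also fetch logs by deployment. The sketch below assumes the deployments are named capi-controller-manager and capv-controller-manager, which matches the pod name prefixes above; the function name and the KUBECTL override are my own. If the pods run more than one container, you may need to add -c with the container name.

```shell
# Sketch: fetch controller logs by deployment name instead of pod ID.
# Assumes deployments named capi-controller-manager / capv-controller-manager
# in the capi-system / capv-system namespaces (vSphere provider).
# KUBECTL is a hypothetical override so the function can be dry-run with echo.
collect_controller_logs() {
  outdir="${1:-.}"
  for provider in capi capv; do
    ${KUBECTL:-kubectl} logs "deployment/${provider}-controller-manager" \
      -n "${provider}-system" > "${outdir}/${provider}.log" || return 1
  done
}

# Usage:
#   collect_controller_logs            # writes capi.log and capv.log here
#   collect_controller_logs /tmp/logs  # writes into an existing folder
```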

Now for the Kubernetes node. Typically, you need to investigate the new nodes created by the Infrastructure Provider of CAPI (CAPV, CAPA, CAPZ). Once they are built, you can SSH onto the node and view the cloud-init-output.log file, which shows you the commands run to bootstrap the node to the cluster.

# SSH into the node
ssh capv@{ip_address}

# Change to root
sudo -i

# View the cloud-init-output.log file
cat /var/log/cloud-init-output.log

Tanzu Kubernetes Grid - Troubleshooting an upgrade - ssh to node - cat var log cloud-init-output.log
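The cloud-init output can be long, so a quick filter helps narrow things down. This is a minimal sketch: the function name is mine, and the keyword list is just my guess at what is worth searching for, not an official set of error strings.

```shell
# Hypothetical helper (my own naming): print lines that mention an error or
# failure, prefixed with their line numbers, from a log file such as
# cloud-init-output.log. The keyword list is a guess, not an official one.
scan_log_for_errors() {
  grep -n -i -E 'error|fail|unable to' "$1"
}

# Usage on the node:
#   scan_log_for_errors /var/log/cloud-init-output.log
```

The line numbers make it easy to jump to the surrounding context in the full log afterwards.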

Regards

Dean Lewis
