Tag Archives: TKG

VMC Tanzu Header

VMware Cloud on AWS Deep Dive – Activating, Deploying and Using the managed Tanzu Kubernetes Grid Service

In this blog post I’m going to deep dive into the end-to-end activation, deployment and consumption of the managed Tanzu service (Tanzu Kubernetes Grid Service, or TKGS) within a VMware Cloud on AWS SDDC. I’ll deploy a Tanzu cluster inside a vSphere Namespace, then deploy my trusty Pac-Man application and make it publicly accessible.

Prior to this capability, you would deploy Tanzu Kubernetes Grid to VMC yourself, as a management cluster plus additional Tanzu clusters for your workloads (see terminology explanations here). That was a fully supported option, but it did not provide all the integrated features you get by using TKGS as part of your on-premises vSphere environment.

What is Tanzu Services on VMC?

Tanzu Kubernetes Grid Service is a managed service built into the VMware Cloud on AWS vSphere environment.

This feature brings the integrated Tanzu Kubernetes Grid Service inside vSphere itself. By coupling the platforms together, you can easily deploy new Tanzu clusters, use vCenter’s administration and authentication, and apply governance and policies from vCenter.

Note: VMware Cloud on AWS does not enable activation of Tanzu Kubernetes Grid by default. Contact your account team for more information. 

Note 2: In VMware Cloud on AWS, the Tanzu workload control plane can be activated only through the VMC Console.

But wait, couldn’t I already install a Tanzu Kubernetes Grid cluster onto VMC anyway?

Tanzu Kubernetes Grid is a multi-cloud solution that deploys and manages Kubernetes clusters on your selected cloud provider. Prior to the vSphere-integrated Tanzu offering for VMC that we are discussing today, you would deploy the general TKG option to your SDDC vCenter.

What differences should I know about this Tanzu Services offering in VMC versus the other Tanzu Kubernetes offering?
  • When activated, Tanzu Kubernetes Grid for VMware Cloud on AWS is pre-provisioned with a VMC-specific content library that you cannot modify (see the example after this list).
  • Tanzu Kubernetes Grid for VMware Cloud on AWS does not support vSphere Pods.
  • Creation of Tanzu Supervisor Namespace templates is not supported by VMware Cloud on AWS.
  • vSphere namespaces for Kubernetes releases are configured automatically during Tanzu Kubernetes Grid activation.
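
As a quick illustration of that read-only content library, once Tanzu is activated you can ask the Supervisor cluster which Kubernetes releases it has been pre-provisioned with. A minimal sketch, assuming you are logged in via the vSphere kubectl plugin (the server address is a placeholder; cloudadmin@vmc.local is the usual VMC administrator account):

kubectl vsphere login --server=<supervisor-ip> --vsphere-username cloudadmin@vmc.local
kubectl get tanzukubernetesreleases
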
Activating Tanzu Kubernetes Grid Service in a VMC SDDC

Reminder: Tanzu Services activation capabilities are not enabled by default. Contact your account team for more information.

Within your VMC Console, you can either go via the Launchpad method or via the SDDC inventory item. I’ll cover both:

  • Click on Launchpad
  • Open the Kubernetes Tab
  • Click Learn More

VMC - Launchpad - Kubernetes

  • Select the Journey Tab
  • Under Stage 2 – Activate > Click Get Started

VMC - Launchpad - Kubernetes - Journey - Get started

Alternatively, from the SDDC object in the Inventory view:

  • Click Actions
  • Click “Activate Tanzu Kubernetes Grid”

VMC - Inventory - SDDC - Activate Tanzu Kubernetes Grid

You will now be shown a status dialog, as VMC checks to ensure that Tanzu Kubernetes Grid Service can be activated in your cluster.

This checks that you have the correct configuration and sufficient compute resources available.

VMC - Inventory - SDDC - Activate Tanzu Kubernetes Grid - Checking cluster resources

If the check is successful, you will be presented with the configuration wizard. Essentially, all you must provide is the configuration for four networks. Continue reading VMware Cloud on AWS Deep Dive – Activating, Deploying and Using the managed Tanzu Kubernetes Grid Service
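
For illustration, the four networks in question are the Service CIDR, Namespace Network CIDR, Ingress CIDR and Egress CIDR (names as I recall them from the wizard). The ranges below are placeholders of my own; yours must not overlap each other or any existing SDDC segment:

Service CIDR:           10.96.0.0/23
Namespace Network CIDR: 172.16.0.0/21
Ingress CIDR:           192.168.100.0/26
Egress CIDR:            192.168.101.0/26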

Tanzu Nvidia Header

Deploying Nvidia GPU enabled Tanzu Kubernetes Clusters

In this blog post I’m going to detail how to deploy and configure an Nvidia GPU enabled Tanzu Kubernetes Grid cluster in AWS. The method is similar for Azure; for vSphere, there are a number of additional steps to prepare the system. I’m going to essentially follow the official documentation, then run some of the Nvidia tests. As always, it’s good to have a visual reference for these kinds of deployments.

Pre-Reqs
  • Nvidia currently supports only Ubuntu-based images for TKG deployments
  • For this blog I’ve already deployed my TKG Management cluster in AWS
Deploy a GPU enabled workload cluster

It’s simple: just deploy a workload cluster that uses a GPU enabled instance type for its compute plane (worker) nodes.

You can create a new cluster YAML file from scratch, or clone one of your existing files located in:

~/.config/tanzu/tkg/clusterconfigs

Below are the main values you will need to set. As mentioned above, you need a GPU enabled instance type, and the OS must be Ubuntu. If not set, the OS version defaults to 20.04.

CONTROL_PLANE_MACHINE_TYPE: t3.large
NODE_MACHINE_TYPE: g4dn.xlarge
OS_ARCH: amd64
OS_NAME: ubuntu
OS_VERSION: "20.04"

The rest of the file you configure as you would for any workload cluster deployment. Continue reading Deploying Nvidia GPU enabled Tanzu Kubernetes Clusters
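
Once the file is ready, creation follows the standard workflow. A minimal sketch, where gpu-cluster and the file name are placeholders of my own:

tanzu cluster create gpu-cluster --file ~/.config/tanzu/tkg/clusterconfigs/gpu-cluster.yaml
tanzu cluster kubeconfig get gpu-cluster --admin
kubectl get nodes -o wide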

Tanzu Blog Logo Header

Tanzu Kubernetes Grid – How to edit Node resources and Scale a Cluster Vertically With kubectl

In this blog post I am going to walk you through how to edit the Machine Resource configurations for nodes deployed by Tanzu Kubernetes Grid.

Example Issue – Disk Pressure

In my environment, I found I needed to alter my node resources, as several pods in my cluster were showing the Evicted status.

By running a describe on the pod, I could see the failure message was due to the node condition DiskPressure.

  • If you need to clean up a high number of pods across namespaces in your environment, see this blog post.
kubectl describe pod {name}

TKG - kubectl describe pod - failed - evicted - pod the node had condition disk pressure

I then looked at the node that the pod was scheduled to (you can see this in the above screenshot, fourth line, “node”).

Below we can see that the kubelet has tainted the node to stop further pods from being scheduled to it.

In the events we see the message “Attempting to reclaim ephemeral-storage”
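
For reference, the command behind the screenshot below is the node-level equivalent of the earlier describe (substitute your node name):

kubectl describe node {name}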

TKG - kubectl describe node - disk pressure

Configuring resources for Tanzu Kubernetes Grid nodes

First you will need to log in to the Tanzu Kubernetes Grid management cluster that was used to deploy the workload (guest) cluster, as this controls cluster deployments and holds the necessary bootstrap and machine creation configuration.

Once logged in, locate the existing VsphereMachineTemplate for your chosen cluster. Each cluster will have two configurations (one for Control Plane nodes, one for Compute plane/worker nodes).

If you have deployed TKG into a public cloud, then you can use the following types instead, and continue to follow this article as the theory is the same regardless of where you have deployed to:

  • AWSMachineTemplate on Amazon EC2
  • AzureMachineTemplate on Azure
kubectl get VsphereMachineTemplate

TKG - kubectl get VsphereMachineTemplate

You can attempt to alter this object directly; however, when you try to save the edited file, you will be presented with the following error message:

kubectl edit VsphereMachineTemplate tkg-wld-01-worker

error: vspheremachinetemplates.infrastructure.cluster.x-k8s.io "tkg-wld-01-worker" could not be patched: admission webhook "validation.vspheremachinetemplate.infrastructure.x-k8s.io" denied the request: spec: Forbidden: VSphereMachineTemplateSpec is immutable

TKG - kubectl edit VsphereMachineTemplate - Forbidden- VSphereMachineTemplateSpec is immutable

Instead, you must output the configuration to a local file and edit it. You will also need to remove the following fields if you are using the method below. Continue reading Tanzu Kubernetes Grid – How to edit Node resources and Scale a Cluster Vertically With kubectl
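
For context, here is a minimal sketch of the general Cluster API pattern being described: export the template, edit it under a new name, then point the cluster’s MachineDeployment at the new object. The names below are placeholders of my own, and the exact fields to strip are covered in the full post (typically server-generated metadata such as resourceVersion and uid):

# Export the immutable template to a local file for editing
kubectl get vspheremachinetemplate tkg-wld-01-worker -o yaml > tkg-wld-01-worker-v2.yaml
# Edit the file: set a new metadata.name and adjust spec.template.spec values
# such as numCPUs, memoryMiB and diskGiB, then create it as a new object
kubectl apply -f tkg-wld-01-worker-v2.yaml
# Point the workers' MachineDeployment infrastructureRef at the new template
kubectl edit machinedeployment tkg-wld-01-md-0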

Tanzu Mission Control Header

Tanzu Mission Control – TKG Management support and provisioning new clusters

In this blog post, I am going to cover the new support for Tanzu Kubernetes Grid Management clusters on both VMware Cloud on AWS (VMC) and Azure VMware Solution (AVS). This functionality also allows the provisioning of new Tanzu Kubernetes workload clusters (TKC) to the relevant platform, provisioned by the lifecycle management controls within Tanzu Mission Control.

Below are the other blog posts I’ve written covering Tanzu Mission Control.

Tanzu Mission Control 
- Getting Started Tanzu Mission Control 
- Cluster Inspections 
- Workspaces and Policies  
- Data Protection 
- Deploying TKG clusters to AWS 
- Upgrading a provisioned cluster 
- Delete a provisioned cluster 
- TKG Management support and provisioning new clusters
- TMC REST API - Postman Collection
- Using custom policies to ensure Kasten protects a deployed application
Release Notes

Below are the relevant release notes for the features I’ll cover. In this blog post I’ll just be showing screenshots from a VMC environment; however, the same applies to AVS as well.

What's New May 26, 2021

New Features and Improvements

    (New Feature update): Tanzu Mission Control now supports the ability to register Tanzu Kubernetes Grid (1.3 & later) management clusters running in vSphere on Azure VMware Solution.

What's New April 30, 2021

New Features and Improvements

    (New Feature update): Tanzu Mission Control now supports the ability to register Tanzu Kubernetes Grid (1.2 & later) management clusters running in vSphere on VMware Cloud on AWS. For a list of supported environments, see Requirements for Registering a Tanzu Kubernetes Cluster with Tanzu Mission Control in VMware Tanzu Mission Control Concepts.
Prerequisites

Note that deploying this first management cluster is not something TMC itself can do, nor is it supported for a management cluster to deploy workload clusters across platforms. For example, a management cluster running in AWS cannot deploy workload clusters to VMC, AVS or Azure.

The following requirements are from the product documentation.

  • The management cluster must be deployed as a production cluster with multiple control plane nodes
    • However, in my demo lab I was able to successfully run this using a development deployment.
  • Tanzu Kubernetes Grid workload clusters need at least 4 CPUs and 8 GB of memory
    • Again, I deployed a small instance type (2 vCPU, 4GB RAM) and this didn’t seem to be an issue.
  • Tanzu Kubernetes Grid management clusters (version 1.3 or later) running in vSphere on Azure VMware Solution (AVS).
  • Tanzu Kubernetes Grid management clusters (version 1.2 or later) running in vSphere, including vSphere on VMware Cloud on AWS (version 1.12 or 1.14).
  • Do not attempt to register any other kind of management cluster with Tanzu Mission Control.
  • Tanzu Mission Control does not support the registration of Tanzu Kubernetes Grid management clusters prior to version 1.2.
Registering our Tanzu Kubernetes Grid Management Cluster
  • Go to Administration > Management Clusters > Register Management Cluster > Tanzu Kubernetes Grid

Tanzu Mission Control - Administration - Register Management Cluster - Tanzu Kubernetes Grid

Continue reading Tanzu Mission Control – TKG Management support and provisioning new clusters
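
For context, the wizard finishes by generating a registration manifest that you apply from the management cluster’s own context to install the TMC agents. A hedged sketch, where the context name is hypothetical and the link stands in for the URL TMC generates:

kubectl config use-context tkg-mgmt-admin@tkg-mgmt
kubectl apply -f "<registration link generated by Tanzu Mission Control>"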

VMware Tanzu Header

Deploying Tanzu Kubernetes Grid to AWS fails with ‘InstanceProvisionFailed’

The issue

When deploying Tanzu Kubernetes Grid to AWS, the deployment was failing with the following output:

unable to set up management cluster, : unable to wait for cluster and get the cluster kubeconfig: error waiting for cluster to be provisioned (this may take a few minutes): cluster creation failed, reason:'InstanceProvisionFailed @ Machine/tkg-aws-mgmt-control-plane-dqb4v', message:'1 of 2 completed'
The Cause

When we reviewed the CAPA (Cluster API Provider AWS) logs, we found the following errors logged: Continue reading Deploying Tanzu Kubernetes Grid to AWS fails with ‘InstanceProvisionFailed’
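
If you want to pull those logs yourself while a deployment is stuck, note that during management cluster creation the CAPA controller runs on the temporary kind bootstrap cluster. A minimal sketch, where the kind context suffix is a placeholder (TKG generates a random one) and the container name assumes a recent CAPA release:

kubectl --context kind-tkg-kind-<suffix> logs -n capa-system deployment/capa-controller-manager -c manager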