Category Archives: Kubernetes

How to Install and configure vSphere CSI Driver on OpenShift 4.x

July 27, 2020Kubernetes, VMwareCSI, openshift, vSphereDean

Note2: December 2021 VMware released the Red Hat Certified Operator "vSphere Kubernetes Driver Operator", which is now the preferred and recommended way to install CPI and CSI in your OpenShift environment.
- Using the new vSphere Kubernetes Driver Operator with Red Hat OpenShift via Operator Hub

Note: This blog post was updated in February 2021 to use the new driver manifests from the Official VMware CSI Driver repository, which now provides support for OpenShift

Introduction

In this post I am going to install the vSphere CSI Driver version 2.1.0 with OpenShift 4.x, in my demo environment I’m connecting to a VMware Cloud on AWS SDDC and vCenter, however the steps are the same for an on-prem deployment.

We will be using the vSphere CSI Driver which now supports OpenShift.

My GitHub repo has been updated to reflect the changes of using the official manifests – vSphere-CSI-Driver-2.0-OpenShift-4

- Pre-Reqs
- - vCenter Server Role
- - Download the deployment files
- - Create the vSphere CSI secret in OpenShift
- - Create Roles, ServiceAccount and ClusterRoleBinding for vSphere CSI Driver
- Installation
- - Install vSphere CSI driver
- - Verify Deployment
- Create a persistent volume claim
- Using Labels
- Troubleshooting

In your environment, cluster VMs will need “disk.enableUUID” and VM hardware version 15 or higher.

Pre-Reqs

vCenter Server Role

In my environment I will use the default administrator account, however in production environments I recommend you follow a strict RBAC procedure and configure the necessary roles and use a dedicated account for the CSI driver to connect to your vCenter.

To make life easier I have created a PowerCLI script to create the necessary roles in vCenter based on the vSphere CSI documentation;

Create_CSI_Driver_vCenter_Roles.ps1

Download the deployment files

Run the following;

git clone https://github.com/saintdle/vSphere-CSI-Driver-2.0-OpenShift-4.git

Create the vSphere CSI Secret + CPI ConfigMap in OpenShift

Edit the two files “csi-vsphere.conf” + “vsphere.conf” with your vCenter infrastructure details. These two files may have the same information in them, but in the example of using VSAN File Services, then you may include further configuration in your CSI conf file, as an example.

[Global]
 
# run the following on your OCP cluster to get the ID
# oc get clusterversion -o jsonpath='{.items[].spec.clusterID}{"\n"}'
#Your OCP cluster name provided below can just be a human readable name but needs to be unique when running different OCP clusters on the same vSphere environment.
cluster-id = "OCP_CLUSTER_ID"

[VirtualCenter "VC_FQDN"]
insecure-flag = "true"
user = "USER"
password = "PASSWORD"
port = "443"
datacenters = "VC_DATACENTER"

Create the CSI secret + CPI configmap;

oc create secret generic vsphere-config-secret --from-file=csi-vsphere.conf --namespace=kube-system

oc create configmap cloud-config --from-file=vsphere.conf --namespace=kube-system

To validate:
oc get secret vsphere-config-secret --namespace=kube-system
oc get configmap cloud-config --namespace=kube-syste

This configuration is for block volumes, it is also supported to configure access to VSAN File volumes, and you can see an example of the configuration here;

vSphere CSI Configuration file for File Volumes

Remove the two local .conf files form your machine once the secret is created, as it contains your password in clear text for vCenter.

Installation

Install the vSphere CPI

Taint all OpenShift Nodes.

kubectl taint nodes --all 'node.cloudprovider.kubernetes.io/uninitialized=true:NoSchedule'

Install the vSphere CPI (RBAC, Bindings, DaemonSet)

oc apply -f https://raw.githubusercontent.com/kubernetes/cloud-provider-vsphere/master/manifests/controller-manager/cloud-controller-manager-roles.yaml

oc apply -f https://raw.githubusercontent.com/kubernetes/cloud-provider-vsphere/master/manifests/controller-manager/cloud-controller-manager-role-bindings.yaml

oc apply -f https://github.com/kubernetes/cloud-provider-vsphere/raw/master/manifests/controller-manager/vsphere-cloud-controller-manager-ds.yaml

You can verify the installation by viewing the providerID for the nodes, which must reference “vSphere”.

oc describe nodes | grep "ProviderID"

Install vSphere CSI driver

The driver is made up of the following components

CSI Controller runs as a Kubernetes deployment, with a replica count of 1.
For version v2.1.0, the vsphere-csi-controller Pod consists of 6 containers
- CSI controller, External Provisioner, External Attacher, External Resizer, Liveness probe and vSphere Syncer.

Note: This example shows the newer driver manifests for vSphere 7.0 U1. 
Use the correct vSphere version manifests as per this link.

Create the CSI artifacts.

oc apply -f https://raw.githubusercontent.com/kubernetes-sigs/vsphere-csi-driver/master/manifests/v2.1.0/vsphere-7.0u1/rbac/vsphere-csi-controller-rbac.yaml

oc apply -f https://raw.githubusercontent.com/kubernetes-sigs/vsphere-csi-driver/master/manifests/v2.1.0/vsphere-7.0u1/deploy/vsphere-csi-node-ds.yaml

oc apply -f https://raw.githubusercontent.com/kubernetes-sigs/vsphere-csi-driver/master/manifests/v2.1.0/vsphere-7.0u1/deploy/vsphere-csi-controller-deployment.yaml

Verify the deployment

You can verify the deployment with the two below commands

oc get deployments --namespace=kube-system

oc get CSINode

Creating a Storage Class that uses the CSI-Driver

Create a storage class to test the deployment. As I am using VMC as my test environment, I must use some additional optional parameters to ensure that I use the correct VSAN datastore (WorkloadDatastore). You can visit the references below for more information.

In the VMC vCenter UI, you can get this by going to the Datastore summary page.

To get my datastore URL I need to reference, I will use PowerCLI

get-datastore work* | Select -ExpandProperty ExtensionData | select -ExpandProperty Info

I’m going to create my StorageClass on the fly, but you can find my example YAMLs here;

https://github.com/saintdle/vSphere-CSI-Driver-2.0-OpenShift-4/tree/master/Example-SC+PVC

cat << EOF | oc apply -f -
kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: csi-sc-vmc
  annotations:
    storageclass.kubernetes.io/is-default-class: "false"
provisioner: csi.vsphere.vmware.com
parameters:
  StoragePolicyName: "vSAN Default Storage Policy"
  datastoreURL: "ds:///vmfs/volumes/vsan:3672d400f5fa4515-8a8cb78f6b972f74/"
EOF

Create a Persistent Volume Claim

Finally, we are going to create a PVC. You can find my example PVC files at the same link above.

cat << EOF | oc apply -f -
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: example-openshift-vmc-block-pvc
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 5Gi
  storageClassName: csi-sc-vmc
EOF

You can see the PVC created under my cluster > Monitor Tab > Cloud Native Storage in vCenter.

Using Labels

Thanks to one of my colleagues (Jason Monger), who asked me if we could use labels with this integration. And the answer is yes you can.

When creating your PVC, under metadata including your labels such as the able below. These will be pulled into your vCenter UI making it easier to associate your volumes.

kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: csi-pvc-test
  annotations:
    volume.beta.kubernetes.io/storage-class: csi-sc-vmc
  labels:
    appname: veducate
spec:
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 30Gi

Troubleshooting

For troubleshooting, you need to be aware of the four main containers that run in the vSphere CSI Controller pod and you should investigate the logs from these when you run into issues;

CSI-Attacher
CSI-Provisoner
vSphere-CSI-Controller
vSphere-Syncer

Below I have uploaded some of the logs from a successful setup and creation of a persistent volume.

https://github.com/saintdle/vSphere-CSI-Driver-2.0-OpenShift-4/tree/master/Example-Logs

Resources

Regards

Follow @Saintdle

Dean Lewis

How to deploy OpenShift 4.3 on VMware vSphere with Static IP addresses using Terraform

July 22, 2020Kubernetes, VMwareopenshift, Static IP, Terraform, VMware, vSphereDean

Install OpenShift 4.x on vSphere 6.x/7.x

The following procedure is intended to create VM’s from an OVA template booting with static IP’s when the DHCP server can not reserve the IP addresses.

The Problem

OCP requires that all DNS configurations be in place. VMware requires that the DHCP assign the correct IPs to the VM. Since many real installations require the coordination with different teams in an organization, many times we don’t have control of DNS, DHCP or Load balancer configurations.

The CoreOS documentation explain how to create configurations using ignition files. I created a python script to put the network configuration using the ignition files created by the openshift-install program.

Reference Architecture

For this guide, we are going to deploy 3 master nodes (control-plane) and 2 worker nodes (compute This guide uses RHEL CoreOS 4.3 as the virtual machine image, deploying Red Hat OCP 4.3, as per the support of N-1 from Red Hat.

We will use a centralised Linux server (Ubuntu) that will perform the following functions;

Load Balancer – HAProxy
Web Server – Apache2
Terraform automation host – version 0.11.14
- The deployment will be semi-automated using Terraform, so that we can easily build configuration files used by the CoreOS VM’s that have Static IP settings.
- Using a later version of Terraform will cause failures.
Client Tools for OpenShift deployment
- OC
- Kubectl
- Openshift-install

DNS will be provided by a Windows Server.

The installation will use a Bootstrap server to bring the cluster online, which will be removed at the end of the build process.

Deployment Steps

In this guide we will deploy our environment in the following order;

Configure DNS
Import Red Hat Core OS image into vCenter
Deploy Ubuntu Host
- Configure Apache
- Configure HAProxy
- Install Client-Tools
- Install Terraform
Build OpenShift Cluster configuration
Configuring the Terraform deployment
Running the Terraform deployment

DNS

Openshift uses a “clusterName.BaseDomain” format.

For example; I want to call my Openshift cluster Demo. And my DNS Domain is Simon.local, then my full format used by Openshift is “demo.simon.local”

Below is a table plan of the IP addresses you will use to build the environment.

The last three addresses are cluster level resources that are available on each control-plane node, accessible via the load balancer.

To configure the DNS records in Windows, you can use the Script and CSV file here

https://github.com/saintdle/OCP-4.3-vSphere-Static-IP/tree/master/DNS

In the below screenshot, the script has created the “demo” domain folder and entered my records. It is important that you have PTR records setup for everything apart from the “etcd-X” records.

Import Red Hat CoreOS Image into vCenter

Continue reading How to deploy OpenShift 4.3 on VMware vSphere with Static IP addresses using Terraform →

Kubernetes command line: tips and tricks

June 22, 2020Kubernetescritctl, default storage class, kubectl, Kubelet, Kubernetes, troubleshootDean

In this blog post, I have collected together a number of tips, tricks and snippets I’ve learned along the away whilst learning Kubernetes.

- Configure tab completion
- Selecting all namespaces in commands
- Restarting nodes
- Setting default storage class
- Resource usage
- Delete pods that are stuck terminating
- Using the watch command
- Troubleshooting
- - Run an interactive pod for debugging issues
- - - Alpine & BusyBox
- - Check etcd is running on master nodes
- - Get deployed pod image
- - Get Kubelet Service Logs
- - Get events from all namespaces, sorted by creation time
- Other Resources
- - Visual guide on troubleshooting Kubernetes deployments
- - Tool: Stern for tailing multiple Kubernetes objects logs
- - Useful Aliases to create for managing Kubernetes

I would also highly recommend the awesome Kubectl Cheat Sheet to be one of your go to references.

Configure Tab completion

source <(kubectl completion bash)

Selecting all name spaces in commands

rather than using “–all-namespaces” you can use “-A”

kubectl get pods --all-namespaces

kubectl get pods -A

Restarting Nodes

SSH to problematic node and run

/etc/init.d/kubelet restart

Source

Setting default storage class

Remove default storage class setting

kubectl patch storageclass {SC_NAME} -p '{"metadata": {"annotations":{"storageclass.kubernetes.io/is-default-class":"false"}}}'

Configure storage class as default

kubectl patch storageclass {SC_NAME} -p '{"metadata": {"annotations":{"storageclass.kubernetes.io/is-default-class":"true"}}}'

Source

Resource Usage

Requires metrics-server to be installed and running (github)

Pods;

#Check what pods are using the most memory in the cluster:
kubectl top pod --all-namespaces  | sort -rnk4 | head -40
 
#Check what pods are using the most CPU in the cluster:
kubectl top pod --all-namespaces  | sort -rnk3 | head -80

Nodes;

#Check which nodes are using the most memory in the cluster:
kubectl top nodes --all-namespaces  | sort -rnk4 | head -40
 
#Check which nodes are using the most CPU in the cluster:
kubectl top nodes --all-namespaces  | sort -rnk3 | head -80

Verify Kubelet is exposing Node metrics;

kubectl get --raw /api/v1/nodes/{Node_Name}/proxy/stats/summary

To get kube-metrics working I had to add the following to the deployment. (Highlighted in bold).

kubectl edit deployment metrics-server -n kube-system
#############
name: metrics-server
spec:
containers:
- args:
 - --kubelet-preferred-address-types=InternalIP
 - --kubelet-insecure-tls

Delete pods that are stuck terminating

kubectl get pods --all-namespaces | grep Terminating | while read line; do pod_name=$(echo $line | awk '{print $2}') && name_space=$(echo $line | awk '{print $1}' ); kubectl delete pods $pod_name -n $name_space --grace-period=0 --force ; done

Using the Watch command

Really simple one, but when deploying things, sometimes you don’t the feedback you need from the system. However using the Linux watch command infront of your Kubernetes commands, you can;

watch -n 2 kubectl get pods -n {namespace}

In the above example, this command will refresh your page every 2 seconds and list out the available pods and status.

Troubleshooting:

Run an interactive pod for debugging

This will create a pod of one of the below images, which will be removed when you exit out of the session.

Apline;

kubectl run -i --rm -t alpine-$USER --image=alpine --restart=Never -- /bin/sh

Press enter

BusyBox

kubectl run -i --tty --rm debug --image=busybox --restart=Never -- sh

Press enter

Source

Check etcd is running on master nodes

Check etcd pods have been created by Kubelet

sudo crictl pods --name=etcd-member

or 

sudo crictl ps -A

Check etcd logs on master nodes

sudo crictl logs $(sudo crictl ps --pod=$(sudo crictl pods --name=etcd-member --quiet) --quiet)

Source

Get pod deployed image

Kubectl get pod {name} -n {namespace} -o "jsonpath={range .status.containerStatuses[*]}{.name}{'\t'}{.state}{'\t'}{.image}{'\n'}{end}"

Example: 

root@k8s-master# kubectl get pods nginx -o "jsonpath={range .status.containerStatuses[*]}{.name}{'\t'}{.state}{'\t'}{.image}{'\n'}{end}"

nginx map[running:map[startedAt:2020-06-10T15:44:40Z]] nginx:latest

Get Kubelet Service logs

SSH to your node and run the following

journalctl -f -u kubelet.service

Get events from all namespaces, sorted by creation time

kubectl get events -A  --sort-by='.metadata.creationTimestamp'

Other Resources

A visual guide on troubleshooting Kubernetes deployments

https://learnk8s.io/troubleshooting-deployments

Tool: Stern allows you to tail multiple pods on Kubernetes and multiple containers within the pod. Each result is colour coded for quicker debugging.

This can be more useful than the Kubectl logs command, which you need to know your individual pods name.

https://github.com/wercker/stern

Tail logs of all pods of the deployment/service
 CMD: stern -n {Namespace} {deployment}
 
Same as above but starting with logs in the last minute
 CMD: stern -n {Namespace} {deployment} -s 1m

Useful Alias, can be used without ZSH

https://github.com/ohmyzsh/ohmyzsh/blob/master/plugins/kubectl/kubectl.plugin.zsh

Regards

Follow @Saintdle

Dean Lewis

VMware Tanzu Mission Control – Getting started with your first cluster

May 26, 2020VMware, KubernetesAttach, Create, Kubernetes, Mission Control, monitoring, TanzuDean

In this blog post we will cover the following topics

- What is Tanzu Mission Control?
- So, this isn't just for VMware environments?
- Getting Started Tanzu Mission Control
- - TMC Resource Hierarchy
- - Creating a Cluster Group
- - Attaching a cluster to Tanzu Mission Control
- - Viewing your Cluster Objects
- - - Overview
- - - Nodes
- - - Namespaces
- - - Workloads
- Where can I demo/test/trial this myself?

The follow up blog posts are;

Tanzu Mission Control 
- Getting Started Tanzu Mission Control 
- Cluster Inspections 
- Workspaces and Policies  
- Data Protection 
- Deploying TKG clusters to AWS 
- Upgrading a provisioned cluster 
- Delete a provisioned cluster 
- TKG Management support and provisioning new clusters
- TMC REST API - Postman Collection
- Using custom policies to ensure Kasten protects a deployed application

What is Tanzu Mission Control?

Tanzu Mission control is a cloud offering, which gives you a single point of control, monitoring and management, regardless of the Kubernetes deployment and their location (e.g Tanzu Kubernetes Grid, OpenShift Container Platform, Azure Kubernetes to name but a few).

Key Capabilities;

Manage Kubernetes Cluster Lifecycle through the deployment and day 2 operations
Attach Clusters for centralized operations and management
Centralized policy management
- Apply access, network and container registry policies consistently across your Kubernetes clusters and namespaces
Global visibility for diagnosing and troubleshooting issues with your Kubernetes clusters
Inspection runbooks to validate the configuration of your clusters
- Current offerings are;
  - Conformance; validating binaries running in your cluster to ensure proper configuration and running.
  - CIS benchmark; evaluation against the CIS Benchmark for Kubernetes published by the Center for Internet Security.
  - Lite; node conformance test to validate your nodes meet the Kubernetes requirements.

So, this isn’t just for VMware environments?

Nope, this is a cloud and Kubernetes neutral offering. You can attach CNCF conformant Kubernetes clusters to Tanzu Mission Control no matter where they are running: on vSphere, in any public clouds, or through other Kubernetes vendors.

Getting Started Tanzu Mission Control

TMC Resource Hierarchy

In the Tanzu Mission Control resource hierarchy, there are three levels at which you can specify policies.

Organization
Object groups (Cluster groups and Workspaces)
Kubernetes objects (Clusters and Namespaces)

You can set direct policies for a given object, but each object can also inherit based on the parent objects. So pretty much what you’ve been used to in the past with policies and hierarchies.

Creating a Cluster Group

A Cluster Group is a logical object to bring together multiple Kubernetes clusters. You can set user access policies to be able to view/edit/control cluster group objects and their child objects (clusters).

Cluster groups provide an infrastructure view, and all clusters must be attached to a group.

To create a Cluster Group;

Select the Cluster Group from the navigation
Click New Cluster Group
Supply a name, description and labels are optional and can be edited after creation

Continue reading VMware Tanzu Mission Control – Getting started with your first cluster →

VMware Tanzu Mission Control – Workspaces and Policies

May 26, 2020VMware, KubernetesAccess, Image Registry, Kubernetes, Mission Control, Network, Policies, Security, Tanzu, WorkspaceDean

In this blog post we will cover the following topics

- Tanzu Mission Control 
- - Workspaces 
- - - Creating a workspace
- - - Creating a managed Namespace
- - - Viewing a managed Namespace
- - Policy Driven Cluster Management
- - - Creating a Image Registry Policy
- - - Creating a Network Policy

The follow up blog posts are;

Tanzu Mission Control 
- Getting Started Tanzu Mission Control 
- Cluster Inspections 
- Workspaces and Policies  
- Data Protection 
- Deploying TKG clusters to AWS 
- Upgrading a provisioned cluster 
- Delete a provisioned cluster 
- TKG Management support and provisioning new clusters
- TMC REST API - Postman Collection
- Using custom policies to ensure Kasten protects a deployed application

Workspaces

Workspaces provide an application view, where you logically group Kubernetes Namespaces together, regardless of the cluster to which they are attached.

This is in contrast to Cluster Groups, which are focused on the infrastructure view.

These Workspaces can be created to align to your projects or applications, from a hierarchy point of view, you would then authorize your users to these Workspaces, so that they can monitor and manage the namespaces related to their function.

Creating a Workspace

Click the Workspace navigation view on the left-hand side, and then New Workspace.

Specify your Workspace name, and provide the optional description and labels, these can be added after creation if needed.

Now you have a Workspace, it’s no good without any associated Namespaces, so let’s continue.

Creating a managed Namespace

All Namespaces attached to a Workspace will be managed Namespaces under TMC.

To create a managed Namespace, you can do this in one of four places;

Within the Workspace Navigation view
Inside the Workspace Object itself
On the Namespace Navigation view
On the Cluster Object > Navigation Tab

Continue reading VMware Tanzu Mission Control – Workspaces and Policies →