OpenShift

Using the vSphere CSI Driver with OpenShift 4.x and VSAN File Services

You may have seen my blog post “How to Install and configure vSphere CSI Driver on OpenShift 4.x“.

In that post, I updated the vSphere CSI driver deployment to work with the additional security constraints that are baked into OpenShift 4.x.

Since then, one of the things on my list to test has been file volumes backed by vSAN File shares, a feature available in vSphere 7.0.

Well, I’m glad to report it does in fact work. Using my CSI driver deployment (see the above blog or my GitHub), you can simply deploy and consume vSAN File Services, as per the documentation here.

I’ve updated my examples in my github repository to get this working.

OK just tell me what to do…

First and foremost, you need to add additional configuration to the csi conf file (csi-vsphere-for-ocp.conf).

If you do not, the defaults will be assumed, which grant full read-write access from any IP to the file shares created.

[Global]

# run the following on your OCP cluster to get the ID 
# oc get clusterversion -o jsonpath='{.items[].spec.clusterID}{"\n"}'
cluster-id = c6d41ba1-3b67-4ae4-ab1e-3cd2e730e1f2

[NetPermissions "A"]
ips = "*"
permissions = "READ_WRITE"
rootsquash = false

[VirtualCenter "10.198.17.253"]
insecure-flag = "true"
user = "[email protected]"
password = "Admin!23"
port = "443"
datacenters = "vSAN-DC"
targetvSANFileShareDatastoreURLs = "ds:///vmfs/volumes/vsan:52c229eaf3afcda6-7c4116754aded2de/"

Next, create a storage class which is configured to consume VSAN File services.

kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: file-services-sc
  annotations:
    storageclass.kubernetes.io/is-default-class: "false"
provisioner: csi.vsphere.vmware.com
parameters:
  storagepolicyname: "vSAN Default Storage Policy" # Optional Parameter
  csi.storage.k8s.io/fstype: "nfs4" # Optional Parameter

Then create a PVC to prove it works. For file volumes, the claim can request the ReadWriteMany access mode; a minimal sketch follows.
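
A minimal sketch, assuming the StorageClass above (the claim name is just an example):

kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: file-services-pvc
spec:
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: 2Gi
  storageClassName: file-services-sc

Continue reading Using the vSphere CSI Driver with OpenShift 4.x and VSAN File Services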

Veeam vRA Header

How to backup vRealize Automation 8.x using Veeam

In this blog post I am going to dissect backing up vRealize Automation 8.x using Veeam Backup and Replication.

- Understanding the backup methods
- Performing an online backup
- Performing an offline backup

Understanding the Backup Methods

Reading the VMware documentation on this subject can be somewhat confusing at times, and if you pay attention, there are subtle differences between the documents as well. Let's break this down.

  • vRealize Automation 8.0
    • As part of the backup job, you need to run a script to stop the services
    • This is known as an offline backup
    • Depending on your backup software, you can either do this by running a script located on the vRealize Automation appliance, or by triggering it via the pre-freeze/post-freeze scripts when a snapshot is taken of the VM.
    • The snapshot must not include the virtual machine's memory.
    • If your environment is a cluster, you only need to run the script on a single node.
    • All nodes in the cluster must be backed up at the same time.
  • vRealize Automation 8.0.1 and 8.1 (and higher)
    • It is supported to run an online backup
      • No script is needed to shut down the services
    • The snapshot taken as part of the backup must quiesce the virtual machine.
    • The snapshot must not include the virtual machine's memory.
    • It is recommended to run the script to stop all services and perform an offline backup (a skeleton pre-freeze script is sketched after this list).
      • You may also find your backup runs faster, as the virtual machine will become less busy.
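
For reference, here is a purely illustrative sketch of a pre-freeze wrapper script; the script name and the placeholder stop command are assumptions, so check the VMware documentation for the exact service-stop commands for your vRA version.

#!/bin/bash
# stop-vra.sh - illustrative pre-freeze wrapper only.
# Replace the placeholder below with the service-stop command documented
# for your vRealize Automation version before using it in a backup job.
set -e
echo "$(date) - stopping vRA services ahead of the snapshot" >> /var/log/pre-freeze.log
# /opt/scripts/<vra-service-stop-script>   <-- placeholder, see the VMware docs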

Performing an Online Backup

Let’s start with the easier of the two options. Again, this will be supported for vRealize Automation 8.0.1 and higher. Continue reading How to backup vRealize Automation 8.x using Veeam

Veeam new logo featured image

Veeam – Script finished execution with unexpected exit code: 126

This Issue

When using Veeam, if you use pre/post-freeze scripts for application-aware processing, these scripts must reside on the VBR server, which uploads them to the virtual machine you are protecting.

I hit an issue when running a job, where I received the error;

Error: Script finished execution with unexpected exit code: 126

Veeam Script finished execution with unexpected exit code 126

The Cause

I reviewed the task logs on the VBR server;

  • C:\ProgramData\Veeam\Backup\{Job_Name}\task_{VMName}.txt

For troubleshooting, you are looking for the [ScriptInvoker] sections. I found that my script was uploaded successfully; however, when it was run, the guest OS threw an error;

[ScriptInvoker] Failed to execute script in lin guest machine over SSH. Script path: C:\Backup\Scripts\stop-vra.sh.

bash: /tmp/d6791b89-e0b8-4cce-acec-45d682ce1f2c_stop-vra.sh: /bin/bash^M: bad interpreter: No such file or directory

I also connected to the Linux machine and ran “journalctl -xe -f” whilst running the backup job, and saw the same error flash up there too.

The script indicates that it must be executed by a shell located at /bin/bash^M. 
There is no such file: it's called /bin/bash. The ^M is a carriage return character. 
Linux uses the line feed character to mark the end of a line, whereas Windows uses the two-character sequence CR LF. Your file has Windows line endings, which is confusing Linux. 
(Source)

You can see the full part of the task logs relating to this below.

[ScriptInvoker] Scripting mode is FailJobOnError.
[ScriptInvoker] Script enabled
[ScriptInvoker] Creating Linux invoker.
[ScriptInvoker] Starting pre-freeze script execution 
[ScriptInvoker] Running Linux script (SSH) 'C:\Backup\Scripts\stop-vra.sh'
[Ssh] SSH connection ae2f5df0-1d28-4dfc-a8a5-7ceb952af2a9 to server 192.168.200.39 created successfully
...
[ScriptInvoker] SSH connection is established (192.168.200.39).
[ScriptInvoker] Exception thrown during script execution (SSH).
[Ssh] Connection ae2f5df0-1d28-4dfc-a8a5-7ceb952af2a9 - [host: '192.168.200.39', port: 22, elevation to root: 'no', autoSudo: no, use su if sudo fails: no, host name: sc-dc1-vra001.simon.local, IPs: [192.168.200.39], AuthenticationData: [UserName: root, AuthTypes: [KeyboardInteractive, Password]]] is disposing.
[ScriptInvoker] Failed to execute script in lin guest machine over SSH. Script path: C:\Backup\Scripts\stop-vra.sh.
bash: /tmp/d6791b89-e0b8-4cce-acec-45d682ce1f2c_stop-vra.sh: /bin/bash^M: bad interpreter: No such file or directory
(System.Exception)
at Veeam.Backup.SSH.CSshCommandResult.GetAnswer(Boolean trimAnswer, String failText, Boolean checkStdErr)
at Veeam.Backup.Core.CSshScriptInvoker.RunScript(CSshScriptFile scriptFile, TimeSpan timeout, Boolean collectLogs, String stdOutFilePath, String stdErrFilePath, Boolean checkStdErr, Int32& exitCode)
at Veeam.Backup.Core.CSshScriptInvoker.ExecScriptInner(String localPath, TimeSpan timeout, Boolean collectLogs, String stdOutFilePath, String stdErrFilePath, Boolean checkStdErr, Int32& exitCode)

[ScriptInvoker] Failed to execute script over SSH, failing over to VIX.
bash: /tmp/d6791b89-e0b8-4cce-acec-45d682ce1f2c_stop-vra.sh: /bin/bash^M: bad interpreter: No such file or directory

...
[ScriptInvoker] Running Linux script (VIX) 'C:\Backup\Scripts\stop-vra.sh'
[ScriptInvoker] Linux script exit code = '126'

The Fix

The fix is quite easy: either change your text editor to save the script with the correct (Unix) line endings.

Or create the script on a Linux machine and copy it to your VBR Server.
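
You can also strip the carriage returns from an existing script on any Linux machine; a quick sketch, assuming the script name used earlier:

file stop-vra.sh                 # CRLF files are reported as "with CRLF line terminators"
dos2unix stop-vra.sh             # convert in place, if dos2unix is installed
sed -i 's/\r$//' stop-vra.sh     # alternative: strip the trailing carriage returns with sed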

To fix this issue in your text editor

Notepad++ 
Edit --> EOL Conversion --> UNIX/OSX Format

Eclipse  
File > Convert Line Delimiters To > Unix (LF, \n, 0A, ¶)

Or change the "New text file line delimiter" setting to "Other: Unix" under Window > Preferences > General > Workspace

Sublime Text Editor
View > Line Endings > Unix 

Atom
See this guide. 

(Source)

 

Regards

OpenShift

How to Install and configure vSphere CSI Driver on OpenShift 4.x

Note 2: In December 2021, VMware released the Red Hat Certified Operator "vSphere Kubernetes Driver Operator", which is now the preferred and recommended way to install CPI and CSI in your OpenShift environment.
- Using the new vSphere Kubernetes Driver Operator with Red Hat OpenShift via Operator Hub

Note: This blog post was updated in February 2021 to use the new driver manifests from the Official VMware CSI Driver repository, which now provides support for OpenShift

Introduction

In this post I am going to install the vSphere CSI Driver version 2.1.0 on OpenShift 4.x. In my demo environment I’m connecting to a VMware Cloud on AWS SDDC and vCenter; however, the steps are the same for an on-premises deployment.

We will be using the vSphere CSI Driver which now supports OpenShift.

  • Pre-Reqs
    • vCenter Server Role
    • Download the deployment files
    • Create the vSphere CSI secret in OpenShift
    • Create Roles, ServiceAccount and ClusterRoleBinding for vSphere CSI Driver
  • Installation
    • Install vSphere CSI driver
    • Verify Deployment
  • Create a persistent volume claim
  • Using Labels
  • Troubleshooting

In your environment, the cluster VMs will need the “disk.enableUUID” advanced setting enabled (set to TRUE) and VM hardware version 15 or higher.
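
As a rough sketch, you can set this with PowerCLI; the VM name filter here is just an example for your environment:

# Enable disk.enableUUID on each cluster VM (adjust the name filter to match your nodes)
Get-VM -Name "ocp-*" | New-AdvancedSetting -Name "disk.enableUUID" -Value "TRUE" -Force -Confirm:$false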

Pre-Reqs
vCenter Server Role

In my environment I will use the default administrator account, however in production environments I recommend you follow a strict RBAC procedure and configure the necessary roles and use a dedicated account for the CSI driver to connect to your vCenter.

To make life easier I have created a PowerCLI script to create the necessary roles in vCenter based on the vSphere CSI documentation;
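
As a rough illustration of what such a script does (the role name is arbitrary and only one privilege is shown here; the full privilege list is in the vSphere CSI documentation):

# Example: create a role holding one of the privileges the CSI driver needs
# ("Datastore.FileManagement" = Datastore > Low level file operations)
New-VIRole -Name "CNS-DATASTORE" -Privilege (Get-VIPrivilege -Id "Datastore.FileManagement")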

Download the deployment files

Run the following;

git clone https://github.com/saintdle/vSphere-CSI-Driver-2.0-OpenShift-4.git

vSphere CSI OpenShift git clone

Create the vSphere CSI Secret + CPI ConfigMap in OpenShift

Edit the two files “csi-vsphere.conf” and “vsphere.conf” with your vCenter infrastructure details. These two files may contain the same information, but if you are using vSAN File Services, for example, you may include further configuration in your CSI conf file.

[Global]
 
# run the following on your OCP cluster to get the ID
# oc get clusterversion -o jsonpath='{.items[].spec.clusterID}{"\n"}'
#Your OCP cluster name provided below can just be a human readable name but needs to be unique when running different OCP clusters on the same vSphere environment.
cluster-id = "OCP_CLUSTER_ID"

[VirtualCenter "VC_FQDN"]
insecure-flag = "true"
user = "USER"
password = "PASSWORD"
port = "443"
datacenters = "VC_DATACENTER"

vSphere CSI with Openshift configure vSphere Secret in OpenShift

Create the CSI secret + CPI configmap;

oc create secret generic vsphere-config-secret --from-file=csi-vsphere.conf --namespace=kube-system

oc create configmap cloud-config --from-file=vsphere.conf --namespace=kube-system

To validate:
oc get secret vsphere-config-secret --namespace=kube-system
oc get configmap cloud-config --namespace=kube-system

This configuration is for block volumes; it is also supported to configure access to vSAN File volumes, and you can see an example of that configuration here;

Remove the two local .conf files from your machine once the secret is created, as they contain your vCenter password in clear text.

Installation
Install the vSphere CPI

Taint all OpenShift Nodes.

kubectl taint nodes --all 'node.cloudprovider.kubernetes.io/uninitialized=true:NoSchedule'

Install the vSphere CPI (RBAC, Bindings, DaemonSet)

oc apply -f https://raw.githubusercontent.com/kubernetes/cloud-provider-vsphere/master/manifests/controller-manager/cloud-controller-manager-roles.yaml

oc apply -f https://raw.githubusercontent.com/kubernetes/cloud-provider-vsphere/master/manifests/controller-manager/cloud-controller-manager-role-bindings.yaml

oc apply -f https://github.com/kubernetes/cloud-provider-vsphere/raw/master/manifests/controller-manager/vsphere-cloud-controller-manager-ds.yaml

You can verify the installation by viewing the providerID for the nodes, which must reference “vSphere”.

oc describe nodes | grep "ProviderID"

vSphere CSI CPI OpenShift ProviderID

Install vSphere CSI driver

The driver is made up of the following components

  • CSI Controller runs as a Kubernetes deployment, with a replica count of 1.
  • For version v2.1.0, the vsphere-csi-controller Pod consists of 6 containers
    • CSI controller, External Provisioner, External Attacher, External Resizer, Liveness probe and vSphere Syncer.
Note: This example shows the newer driver manifests for vSphere 7.0 U1. 
Use the correct vSphere version manifests as per this link.

Create the CSI artifacts.

oc apply -f https://raw.githubusercontent.com/kubernetes-sigs/vsphere-csi-driver/master/manifests/v2.1.0/vsphere-7.0u1/rbac/vsphere-csi-controller-rbac.yaml

oc apply -f https://raw.githubusercontent.com/kubernetes-sigs/vsphere-csi-driver/master/manifests/v2.1.0/vsphere-7.0u1/deploy/vsphere-csi-node-ds.yaml

oc apply -f https://raw.githubusercontent.com/kubernetes-sigs/vsphere-csi-driver/master/manifests/v2.1.0/vsphere-7.0u1/deploy/vsphere-csi-controller-deployment.yaml
Verify the deployment

You can verify the deployment with the two below commands

oc get deployments --namespace=kube-system

oc get CSINode
vSphere CSI oc get deployments oc get CSInode
Creating a Storage Class that uses the CSI-Driver

Create a storage class to test the deployment. As I am using VMC as my test environment, I must use some additional optional parameters to ensure that I use the correct VSAN datastore (WorkloadDatastore). You can visit the references below for more information.

In the VMC vCenter UI, you can get this by going to the Datastore summary page.

VMC get WorkloadDatastore VSAN URL

To get the datastore URL I need to reference, I will use PowerCLI;

get-datastore work* | Select -ExpandProperty ExtensionData | select -ExpandProperty Info
vSphere CSI with Openshift Get VMC Datastore URL

I’m going to create my StorageClass on the fly, but you can find my example YAMLs here;

cat << EOF | oc apply -f -
kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: csi-sc-vmc
  annotations:
    storageclass.kubernetes.io/is-default-class: "false"
provisioner: csi.vsphere.vmware.com
parameters:
  StoragePolicyName: "vSAN Default Storage Policy"
  datastoreURL: "ds:///vmfs/volumes/vsan:3672d400f5fa4515-8a8cb78f6b972f74/"
EOF
vSphere CSI with Openshift Create StorageClass
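
You can quickly confirm the StorageClass has been created with:

oc get storageclass csi-sc-vmc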
Create a Persistent Volume Claim

Finally, we are going to create a PVC. You can find my example PVC files at the same link above.

cat << EOF | oc apply -f -
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: example-openshift-vmc-block-pvc
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 5Gi
  storageClassName: csi-sc-vmc
EOF
vSphere CSI with Openshift PVC Created
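
To confirm the claim has been provisioned and bound, a quick check:

oc get pvc example-openshift-vmc-block-pvc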

You can see the PVC created under my cluster > Monitor Tab > Cloud Native Storage in vCenter.

vSphere CSI with Openshift PVC in vCenter Console

Using Labels

Thanks to one of my colleagues (Jason Monger), who asked me if we could use labels with this integration. The answer is yes, you can.

When creating your PVC, include your labels under metadata, as in the example below. These will be pulled into your vCenter UI, making it easier to associate your volumes.

kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: csi-pvc-test
  annotations:
    volume.beta.kubernetes.io/storage-class: csi-sc-vmc
  labels:
    appname: veducate
spec:
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 30Gi
vSphere CSI with Openshift PVC Labels
Troubleshooting

For troubleshooting, you need to be aware of the four main containers that run in the vSphere CSI Controller pod, and you should investigate the logs from these when you run into issues (an example of pulling these logs follows the list);

  • CSI-Attacher
  • CSI-Provisioner
  • vSphere-CSI-Controller
  • vSphere-Syncer
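
A sketch of how you might pull those logs with oc (the container names below are based on the v2.1.0 manifest and may differ slightly in your deployment):

# Find the controller pod
oc get pods --namespace=kube-system | grep vsphere-csi-controller
# Tail the logs of individual containers within the controller deployment
oc logs deployment/vsphere-csi-controller --namespace=kube-system -c vsphere-csi-controller
oc logs deployment/vsphere-csi-controller --namespace=kube-system -c csi-provisioner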

Below I have uploaded some of the logs from a successful setup and creation of a persistent volume.

Resources

Regards

OpenShift

How to deploy OpenShift 4.3 on VMware vSphere with Static IP addresses using Terraform

Install OpenShift 4.x on vSphere 6.x/7.x

The following procedure is intended to create VMs from an OVA template, booting with static IPs, when the DHCP server cannot reserve the IP addresses.

The Problem

OCP requires that all DNS configurations be in place. VMware requires that DHCP assigns the correct IPs to the VMs. Since many real installations require coordination with different teams in an organization, often we don’t have control of the DNS, DHCP or load balancer configurations.

The CoreOS documentation explains how to create configurations using Ignition files. I created a Python script to inject the network configuration into the Ignition files created by the openshift-install program.

Reference Architecture

For this guide, we are going to deploy 3 master nodes (control plane) and 2 worker nodes (compute). This guide uses RHEL CoreOS 4.3 as the virtual machine image, deploying Red Hat OCP 4.3, as per the N-1 support from Red Hat.

We will use a centralised Linux server (Ubuntu) that will perform the following functions;

  • Load Balancer – HAProxy
  • Web Server – Apache2
  • Terraform automation host – version 0.11.14
    • The deployment will be semi-automated using Terraform, so that we can easily build the configuration files used by the CoreOS VMs that have static IP settings.
    • Using a later version of Terraform will cause failures.
  • Client Tools for OpenShift deployment
    • OC
    • Kubectl
    • Openshift-install

DNS will be provided by a Windows Server.

The installation will use a Bootstrap server to bring the cluster online, which will be removed at the end of the build process.

OpenShift Deployment Arch Diagram

Deployment Steps

In this guide we will deploy our environment in the following order;

  • Configure DNS
  • Import Red Hat Core OS image into vCenter
  • Deploy Ubuntu Host
    • Configure Apache
    • Configure HAProxy
    • Install Client-Tools
    • Install Terraform
  • Build OpenShift Cluster configuration
  • Configuring the Terraform deployment
  • Running the Terraform deployment
DNS

Openshift uses a “clusterName.BaseDomain” format.

For example, I want to call my OpenShift cluster “demo”, and my DNS domain is simon.local, so the full format used by OpenShift is “demo.simon.local”. The DNS records this translates into are sketched below.
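
These are the records the installer expects for a “demo” cluster under simon.local, based on the standard OpenShift 4.3 UPI DNS requirements (adjust to your own design):

api.demo.simon.local       -> load balancer (the Ubuntu host)
api-int.demo.simon.local   -> load balancer (the Ubuntu host)
*.apps.demo.simon.local    -> load balancer (the Ubuntu host)
etcd-0.demo.simon.local    -> first control-plane node (plus etcd-1 and etcd-2)
_etcd-server-ssl._tcp.demo.simon.local -> SRV records pointing at each etcd-X name
Plus A records for the bootstrap, control-plane and worker VMs themselves.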

Below is a table plan of the IP addresses you will use to build the environment.

The last three addresses are cluster level resources that are available on each control-plane node, accessible via the load balancer.

To configure the DNS records in Windows, you can use the Script and CSV file here

Deploy OpenShift VMware Static IP PowerShell Configure DNS Records

In the below screenshot, the script has created the “demo” domain folder and entered my records. It is important that you have PTR records set up for everything apart from the “etcd-X” records.

Deploy OpenShift VMware Static IP DNS Records (screenshots 1-3)
Deploy OpenShift VMware Static IP Configure Reverse DNS Records

Import Red Hat CoreOS Image into vCenter

Continue reading How to deploy OpenShift 4.3 on VMware vSphere with Static IP addresses using Terraform