Red Hat OpenShift on VMware vSphere – How to Scale and Edit your cluster deployments

Working with Red Hat OpenShift on vSphere, I’m really starting to understand the main infrastructure components and how everything fits together.

Next up was understanding how to control the cluster size after the initial deployment. There are some basic Red Hat OpenShift concepts to understand first, before we jump into the technical how-tos below.

In this blog I will cover the following:

- Understanding the concepts behind controlling Machines in OpenShift
- Editing your MachineSet to control your Virtual Machine Resources
- Editing your MachineSet to scale your cluster manually
- Deleting a node
- Configuring ClusterAutoscaler to automatically scale your environment

Machine API

The Machine API is a combination of primary resources that are based on the upstream Cluster API project and custom OpenShift Container Platform resources.

The Machine API performs all node host provisioning management actions after the cluster installation finishes, giving you dynamic provisioning on top of your VMware vSphere platform (and other public/private cloud platforms).

The two primary resources are:

Machines
An object that describes the host for a Node. A machine has a providerSpec, which describes the types of compute nodes that are offered for different cloud platforms. For example, a machine type for a worker node on Amazon Web Services (AWS) might define a specific machine type and required metadata.
MachineSets
Groups of machines. MachineSets are to machines as ReplicaSets are to Pods. If you need more machines or must scale them down, you change the replicas field on the MachineSet to meet your compute need.

These custom resources add capabilities to your OpenShift cluster:

MachineAutoscaler
This resource automatically scales machines in a cloud. You can set the minimum and maximum scaling boundaries for nodes in a specified MachineSet, and the MachineAutoscaler maintains that range of nodes. The MachineAutoscaler object takes effect after a ClusterAutoscaler object exists. Both ClusterAutoscaler and MachineAutoscaler resources are made available by the ClusterAutoscalerOperator.
ClusterAutoscaler
This resource is based on the upstream ClusterAutoscaler project. In the OpenShift Container Platform implementation, it is integrated with the Machine API by extending the MachineSet API. You can set cluster-wide scaling limits for resources such as cores, nodes, memory, and GPU. You can set a pod priority threshold so that new nodes are not brought online for less important pods. You can also set the ScalingPolicy so that, for example, the cluster can scale up nodes but not scale them down.
MachineHealthCheck
This resource detects when a machine is unhealthy, deletes it, and, on supported platforms, creates a new machine. You can read more here about this technology preview feature in OCP 4.6.
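
To see how these resources relate on a running cluster, it can help to list them next to each other. The following is a minimal sketch rather than an official procedure: the function name `show_machine_api_objects` is made up for this example, and it assumes you are already logged in with `oc` and can read the `openshift-machine-api` namespace.

```shell
# Hypothetical helper: list the Machine API objects and the nodes they back.
show_machine_api_objects() {
  # MachineSets: desired versus current replica counts.
  oc get machinesets -n openshift-machine-api
  # Machines: one per virtual machine; -o wide adds extra columns.
  oc get machines -n openshift-machine-api -o wide
  # Nodes: the Kubernetes view of the same hosts.
  oc get nodes
  # Autoscaler resources, if any have been created yet.
  oc get clusterautoscaler
  oc get machineautoscaler -n openshift-machine-api
}
```

Calling `show_machine_api_objects` right after installation should show the MachineSet created during cluster bring-up and no autoscaler resources.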

Editing your MachineSet to control your Virtual Machine Resources

To view the current MachineSet objects available, run:

oc get MachineSet -n openshift-machine-api

You can view your MachineSet configuration with the following command:

oc describe machineset {machineset_name} -n openshift-machine-api

This shows the current MachineSet, which was created during cluster bring-up. You can edit this existing set or create a new MachineSet.

To edit the MachineSet, use the command below, then the usual vi commands for editing and saving:

oc edit machineset {machineset_name} -n openshift-machine-api

Within your MachineSet, I want to call out a few items you will be interested in. These are under the “Spec” section:

1. Disk size: this is for the standard VMDK of the node; the minimum recommended is 120GB. Remember, you'll set up your environment to use different storage/VMDKs for your containers.
2. Memory of your node; the default is 8GB for worker nodes.
3. The network label your virtual machines will be attached to.
4. Number of CPUs and cores per socket. The default is 2 CPUs, 1 per socket.
5. The virtual machine in your environment used as the template for the clone operation in vSphere when scaling up your cluster.
6. The details of the vCenter environment you are deploying to. The default is the details provided when building your cluster.
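
For orientation, here is roughly where those six items sit in the MachineSet, under `spec.template.spec.providerSpec.value`. This is an illustrative fragment written from memory of the vSphere provider spec, not a copy of a real cluster: every `<...>` value is a placeholder, the numbers are the defaults mentioned above, and the function name `print_vsphere_provider_spec` is made up. Check the key names against your own `oc get machineset -o yaml` output before relying on them.

```shell
# Hypothetical helper: print an illustrative vSphere providerSpec fragment.
# All <...> values are placeholders; the numbers are the defaults from this post.
print_vsphere_provider_spec() {
  cat <<'EOF'
providerSpec:
  value:
    diskGiB: 120                   # 1. size of the node's standard VMDK
    memoryMiB: 8192                # 2. memory in MiB (8GB)
    network:
      devices:
      - networkName: <port_group>  # 3. network label
    numCPUs: 2                     # 4. number of CPUs
    numCoresPerSocket: 1           # 4. cores per socket
    template: <vm_template_name>   # 5. template cloned on scale-up
    workspace:                     # 6. vCenter details
      server: <vcenter_server>
      datacenter: <datacenter>
      datastore: <datastore>
      folder: <vm_folder_path>
EOF
}
```

A real MachineSet carries further keys in this block (credentials and user data secrets, for example), which are left out here to keep the mapping to the list readable.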

One possibility I’d like to call out here is the ability to scale up rather than out: that is, to increase the resources per node rather than add extra nodes. To do this, edit the MachineSet with your preferred resource settings, then delete the machines that are part of the MachineSet one by one (see below). Existing machines are not resized in place, so the system replaces each deleted machine with one built from the updated settings.
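
If you would rather script the resource change than use the editor, a JSON patch can do it. This is a sketch under assumptions: `resize_machineset` is a hypothetical helper name, the patch paths assume the vSphere provider spec keys `memoryMiB` and `numCPUs`, and the label selector assumes machines carry the `machine.openshift.io/cluster-api-machineset` label.

```shell
# Hypothetical helper: change memory and CPU count in a MachineSet template.
# Usage: resize_machineset <machineset_name> <memory_mib> <num_cpus>
resize_machineset() {
  ms_name="$1"; mem_mib="$2"; cpus="$3"
  base="/spec/template/spec/providerSpec/value"
  # JSON patch with two "replace" operations against the provider spec.
  patch="[{\"op\":\"replace\",\"path\":\"${base}/memoryMiB\",\"value\":${mem_mib}},{\"op\":\"replace\",\"path\":\"${base}/numCPUs\",\"value\":${cpus}}]"
  oc patch machineset "${ms_name}" -n openshift-machine-api --type json -p "${patch}"
  # List the machines in this set so they can be replaced one at a time.
  oc get machines -n openshift-machine-api \
    -l "machine.openshift.io/cluster-api-machineset=${ms_name}"
}
```

For example, `resize_machineset {machineset_name} 16384 4` would set 16GB and 4 CPUs in the template; the listed machines still need deleting one by one to pick up the new size.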

Editing your MachineSet to scale up your cluster size

Ok, so you have two ways to scale. The first is to edit the replicas number within the MachineSet, using the oc edit machineset command shown earlier.

The easier way (if you ask me) is the “Scale” command, which scales the workers in the cluster up or down.

  • Scaling up creates new virtual machines by cloning the template listed in the MachineSet and configuring them with the compute and storage resources specified.
  • Scaling down deletes a random worker virtual machine (the default behavior; you can learn how to alter it here). The node is drained of running pods before the virtual machine is destroyed.

oc scale --replicas={No.} machineset {MachineSet_Name} -n openshift-machine-api
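
Both approaches can also be scripted with `oc patch`, which avoids opening an editor. The sketch below is one way to do that, not the only one: the helper names `set_replicas` and `set_delete_policy` are made up, and the second helper assumes the MachineSet `spec.deletePolicy` field with its documented values `Random`, `Newest` and `Oldest`, which is how the scale-down choice can be changed from the random default.

```shell
# Hypothetical helper: set the replica count without opening an editor.
# Usage: set_replicas <machineset_name> <count>
set_replicas() {
  oc patch machineset "$1" -n openshift-machine-api \
    --type merge -p "{\"spec\":{\"replicas\":$2}}"
}

# Hypothetical helper: choose which machine is removed on scale-down.
# Usage: set_delete_policy <machineset_name> <Random|Newest|Oldest>
set_delete_policy() {
  case "$2" in
    Random|Newest|Oldest) ;;
    *) echo "unsupported delete policy: $2" >&2; return 1 ;;
  esac
  oc patch machineset "$1" -n openshift-machine-api \
    --type merge -p "{\"spec\":{\"deletePolicy\":\"$2\"}}"
}
```

For example, `set_delete_policy {MachineSet_Name} Oldest` followed by `set_replicas {MachineSet_Name} 2` would shrink the set by removing the oldest machines first.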

You can follow the tasks that create the new machine in the vSphere Console.

The virtual machine that has been created has the resources configured in the MachineSet.

Note: if you are using the vSphere CSI Driver with OpenShift, you need VMX-15 or above. Set this on the virtual machine template used in the MachineSets.

Deleting a node

You can remove a node by running the command below. OpenShift will attempt to drain the node of running pods first. If the machine is part of a MachineSet (as per the above), a replacement machine is created to maintain the replica count.

oc delete machine <machine> -n openshift-machine-api
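
The command takes a machine name, not a node name, so you may need to look up which machine backs a given node. A minimal sketch follows, assuming nodes carry the `machine.openshift.io/machine` annotation in the form `<namespace>/<machine_name>`; the helper name `delete_machine_for_node` is made up, and `oc get machines -n openshift-machine-api -o wide` is a manual alternative if the annotation is missing.

```shell
# Hypothetical helper: delete the Machine that backs a given node.
# Usage: delete_machine_for_node <node_name>
delete_machine_for_node() {
  node="$1"
  # Assumed annotation format: "<namespace>/<machine_name>".
  ref=$(oc get node "${node}" \
    -o jsonpath='{.metadata.annotations.machine\.openshift\.io/machine}')
  if [ -z "${ref}" ]; then
    echo "no machine annotation found on node ${node}" >&2
    return 1
  fi
  machine="${ref#*/}"   # strip the namespace prefix
  oc delete machine "${machine}" -n openshift-machine-api
}
```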

Configuring AutoScaling to automatically scale your environment

The final piece of all this is the autoscaling feature, which supports vSphere as of OpenShift 4.6.

As the name suggests, this feature automatically scales your MachineSets up and down based on the parameters you provide. To enable it, you create two resource definitions:

  • ClusterAutoscaler: this adjusts the size of an OpenShift Container Platform cluster to meet its current deployment needs. The ClusterAutoscaler has a cluster scope and is not associated with a particular namespace. It increases the size of the cluster when there are Pods that failed to schedule on any of the current nodes due to insufficient resources, or when another node is necessary to meet deployment needs. The ClusterAutoscaler does not increase the cluster resources beyond the limits that you specify.
  • MachineAutoscaler: adjusts the number of Machines in the MachineSets attached to your cluster. You can scale both the default worker MachineSet and any other MachineSets that you create. The MachineAutoscaler creates more Machines when the cluster runs out of resources to support more deployments. Any changes to the values in MachineAutoscaler resources, such as the minimum or maximum number of instances, are immediately applied to the MachineSet they target.

To deploy a ClusterAutoscaler, create a YAML file with the following settings:

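apiVersion: "autoscaling.openshift.io/v1"
kind: "ClusterAutoscaler"
metadata:
  name: "default"
spec:
  podPriorityThreshold: -10   # pods with a lower priority do not trigger scale-up
  resourceLimits:
    maxNodesTotal: 24         # total nodes in the cluster, not only autoscaled ones
    cores:
      min: 8                  # minimum cores across the cluster
      max: 128                # maximum cores across the cluster
    memory:
      min: 4                  # minimum memory across the cluster, in GiB
      max: 256                # maximum memory across the cluster, in GiB
  scaleDown:
    enabled: true             # allow the autoscaler to remove unneeded nodes
    delayAfterAdd: 10m        # wait after a node is added before considering scale-down
    delayAfterDelete: 5m      # wait after a node is deleted
    delayAfterFailure: 30s    # wait after a failed scale-down
    unneededTime: 60s         # how long a node must be unneeded before removal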

The settings are pretty self-explanatory, but you can find a full breakdown here. They set the values for your whole cluster. Create the resource with the following:

oc create -f ClusterAutoscaler.yaml

Secondly, you need to create the MachineAutoscaler YAML file, which defines which MachineSet to scale and its limits.

apiVersion: "autoscaling.openshift.io/v1beta1"
kind: "MachineAutoscaler"
metadata:
  name: "{Name_of_MachineAutoscaler}"
  namespace: "openshift-machine-api"
spec:
  minReplicas: 1
  maxReplicas: 12
  scaleTargetRef:
    apiVersion: machine.openshift.io/v1beta1
    kind: MachineSet
    name: "{MachineSet_Name}"

Then create the resource from the YAML file.

oc create -f MachineAutoscaler.yaml
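
If you have several MachineSets, generating the manifest can save some copy and paste. The following sketch prints the same MachineAutoscaler definition for a given MachineSet and refuses a minimum larger than the maximum. The helper name `render_machine_autoscaler` is made up, and naming the autoscaler after its MachineSet is just a convention chosen for this example.

```shell
# Hypothetical helper: print a MachineAutoscaler manifest for one MachineSet.
# Usage: render_machine_autoscaler <machineset_name> <min> <max>
render_machine_autoscaler() {
  ms_name="$1"; min="$2"; max="$3"
  # Basic sanity check before anything is sent to the cluster.
  if [ "${min}" -gt "${max}" ]; then
    echo "minReplicas (${min}) must not exceed maxReplicas (${max})" >&2
    return 1
  fi
  cat <<EOF
apiVersion: autoscaling.openshift.io/v1beta1
kind: MachineAutoscaler
metadata:
  name: ${ms_name}
  namespace: openshift-machine-api
spec:
  minReplicas: ${min}
  maxReplicas: ${max}
  scaleTargetRef:
    apiVersion: machine.openshift.io/v1beta1
    kind: MachineSet
    name: ${ms_name}
EOF
}
```

The output can be piped straight to the cluster, for example `render_machine_autoscaler {MachineSet_Name} 1 12 | oc apply -f -`, and checked afterwards with `oc get machineautoscaler -n openshift-machine-api`.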

You can extend this even further in terms of control using pod priority to control scheduling decisions, which you can read up on here.

Regards