Working with Red Hat OpenShift on vSphere, I’m really starting to understand the main infrastructure components and how everything fits together.
Next up was understanding how to control the cluster size after initial deployment. So, with Red Hat OpenShift, there are some basic concepts we need to understand first, before we jump into the technical how-to’s below in this blog.
In this blog I will cover the following:
- Understanding the concepts behind controlling Machines in OpenShift
- Editing your MachineSet to control your Virtual Machine Resources
- Editing your MachineSet to scale your cluster manually
- Deleting a node
- Configuring ClusterAutoscaler to automatically scale your environment
Machine API
The Machine API is a combination of primary resources that are based on the upstream Cluster API project and custom OpenShift Container Platform resources.
The Machine API handles all node host provisioning once the cluster has been installed, giving you dynamic provisioning on top of your VMware vSphere platform (and other public/private cloud platforms).
The two primary resources are:
- Machines
- An object that describes the host for a Node. A machine has a providerSpec, which describes the types of compute nodes that are offered for different cloud platforms. For example, a machine type for a worker node on Amazon Web Services (AWS) might define a specific machine type and required metadata.
- MachineSets
- Groups of machines. MachineSets are to machines as ReplicaSets are to Pods. If you need more machines or must scale them down, you change the replicas field on the MachineSet to meet your compute need.
These custom resources add capabilities to your OpenShift cluster:
- MachineAutoscaler
- This resource automatically scales machines in a cloud. You can set the minimum and maximum scaling boundaries for nodes in a specified MachineSet, and the MachineAutoscaler maintains that range of nodes. The MachineAutoscaler object takes effect after a ClusterAutoscaler object exists. Both ClusterAutoscaler and MachineAutoscaler resources are made available by the ClusterAutoscalerOperator.
- ClusterAutoscaler
- This resource is based on the upstream ClusterAutoscaler project. In the OpenShift Container Platform implementation, it is integrated with the Machine API by extending the MachineSet API. You can set cluster-wide scaling limits for resources such as cores, nodes, memory and GPU. You can set a pod priority threshold so that new nodes are not brought online for less important pods. You can also set the ScalingPolicy so that, for example, the cluster can scale up the node count but not scale it down.
- MachineHealthCheck
- This resource detects when a machine is unhealthy, deletes it, and, on supported platforms, creates a new machine. You can read more here about this technology preview feature in OCP 4.6.
Editing your MachineSet to control your Virtual Machine Resources
To view the MachineSet objects currently available, run:
oc get MachineSet -n openshift-machine-api
You can view your MachineSet configuration with the following command:
oc describe machineset {machineset_name} -n openshift-machine-api
This gives us a view of our current MachineSet, which was created during the cluster bring-up. You can edit this existing set, or you can create a new MachineSet.
To edit the MachineSet, use the command below, and then the typical vi commands for editing and saving:
oc edit machineset {machineset_name} -n openshift-machine-api
Within your MachineSet, I just want to call out a few items you will be interested in. These are under the “Spec” section:
1. Disk Size - this is for the standard VMDK of the Node, the minimum recommended is 120GB. Remember you'll set up your environment to use different storage/VMDKs for your containers.
2. Memory - of your node, the default is 8GB for worker nodes.
3. The network label your virtual machines will be attached to.
4. Number of CPUs and Cores Per Socket. Default is 2 CPUs, 1 Per Socket.
5. Which virtual machine in your environment is the template to use in a clone operation in vSphere when scaling up your cluster.
6. The details of your vCenter environment that you are deploying to. The default will be the details provided when building your cluster.
One of the possibilities I’d like to call out here is the ability to scale up rather than out. That is, increase the resources per node rather than add extra nodes. To do this, you would edit the MachineSet with your preferred resource settings, then delete the nodes (see below) that are part of the MachineSet one by one. Editing the MachineSet does not change the virtual machines that already exist, so the new settings only take effect as the system replaces each deleted node with a freshly built one.
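If you would rather script the scale-up change than go through the editor, here is a rough sketch using oc patch with a JSON merge patch. The helper name build_resize_patch and the example values are my own placeholders, and I am assuming the vSphere providerSpec field names numCPUs and memoryMiB, so check them against the output of oc get machineset {machineset_name} -o yaml before relying on this.

```shell
#!/bin/sh
# Sketch: build a JSON merge patch that changes CPU and memory in a
# vSphere MachineSet. The field names (numCPUs, memoryMiB) are assumed from
# the vSphere providerSpec; confirm them in your own MachineSet first.
build_resize_patch() {
  cpus="$1"        # number of vCPUs per node
  memory_mib="$2"  # memory per node, in MiB
  printf '{"spec":{"template":{"spec":{"providerSpec":{"value":{"numCPUs":%d,"memoryMiB":%d}}}}}}\n' \
    "$cpus" "$memory_mib"
}

# Usage (needs a logged-in oc session; the MachineSet name is a placeholder):
#   oc patch machineset {machineset_name} -n openshift-machine-api \
#     --type merge -p "$(build_resize_patch 4 16384)"
```

As with a manual edit, the patch only affects machines created afterwards, so the existing nodes still need to be replaced one at a time.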
Editing your MachineSet to scale up your cluster size
OK, so you have two ways to scale up. The first is to edit the replicas number within the MachineSet.
The second, and the easier way (if you ask me), is to use the “scale” command, which scales the workers in the cluster up or down.
- Scaling Up will create new virtual machines by cloning the template listed in the MachineSet and configuring it with the compute and storage resources specified.
- Scaling Down will delete a random worker virtual machine (this is the default behavior; you can learn how to alter it here). This action will drain your node of running pods before destroying the virtual machine.
oc scale --replicas={No.} machineset {MachineSet_Name} -n openshift-machine-api
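To make the scale command a little safer to reuse, here is a minimal sketch of a wrapper that rejects anything that is not a non-negative whole number before calling oc. The function name scale_machineset is hypothetical; the oc scale invocation itself is the same one shown above.

```shell
#!/bin/sh
# Sketch: scale a MachineSet after checking the replica count is a
# non-negative integer. scale_machineset is a hypothetical helper name.
scale_machineset() {
  name="$1"      # MachineSet to scale
  replicas="$2"  # desired number of machines
  case "$replicas" in
    ''|*[!0-9]*)
      echo "replicas must be a non-negative integer, got: '$replicas'" >&2
      return 1
      ;;
  esac
  oc scale --replicas="$replicas" machineset "$name" -n openshift-machine-api
}

# Usage: scale_machineset {MachineSet_Name} 3
# Then watch the machines being created or removed:
#   oc get machines -n openshift-machine-api -w
```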
Below you can see the tasks for creating the new machine in the vSphere Console.
Looking at the virtual machine that has been created, this has the resources configured in the machine set.
Note: if you are using the vSphere CSI Driver with OpenShift, you need VMX-15 or above. Set this on the virtual machine template used in the machine sets.
Deleting a node
You can remove a node by running the command below. OpenShift will attempt to drain the node of running pods first. If the machine is part of a MachineSet (as per the above), then a replacement machine will be created to bring the MachineSet back up to its replica count.
oc delete machine <machine> -n openshift-machine-api
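If you need to replace every machine in a MachineSet one by one, for example after changing its resource settings, the loop below is a naive sketch of how that could be scripted. The function name roll_machineset is hypothetical. I am assuming machines carry the machine.openshift.io/cluster-api-machineset label and that comparing .status.readyReplicas with .spec.replicas is a good enough readiness signal; there is no timeout or error handling, so treat it as a starting point rather than something to run blindly in production.

```shell
#!/bin/sh
# Sketch: delete the machines of one MachineSet one at a time, waiting for
# the MachineSet to report all replicas ready before moving on.
# Assumptions: the machineset label below is present on each machine, and
# readyReplicas == replicas means the replacement node has joined.
roll_machineset() {
  ms="$1"
  ns="openshift-machine-api"
  for machine in $(oc get machines -n "$ns" \
      -l "machine.openshift.io/cluster-api-machineset=$ms" -o name); do
    echo "Deleting $machine"
    oc delete "$machine" -n "$ns"
    # Poll until the ready count matches the desired count again.
    until [ "$(oc get machineset "$ms" -n "$ns" -o jsonpath='{.status.readyReplicas}')" = \
            "$(oc get machineset "$ms" -n "$ns" -o jsonpath='{.spec.replicas}')" ]; do
      sleep 30
    done
  done
}

# Usage: roll_machineset {MachineSet_Name}
```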
Configuring AutoScaling to automatically scale your environment
The final piece of all this is the autoscaling feature, which supports vSphere as of OpenShift 4.6.
As per the name, this feature automatically scales your MachineSets up and down based on the parameters you provide. To enable this, you create two resource definitions:
- ClusterAutoscaler: this adjusts the size of an OpenShift Container Platform cluster to meet its current deployment needs. The ClusterAutoscaler has a cluster scope and is not associated with a particular namespace. It increases the size of the cluster when there are Pods that failed to schedule on any of the current nodes due to insufficient resources, or when another node is necessary to meet deployment needs. The ClusterAutoscaler does not increase the cluster resources beyond the limits that you specify.
- MachineAutoscaler: adjusts the number of Machines in the MachineSets attached to your cluster. You can scale both the default worker MachineSet and any other MachineSets that you create. The MachineAutoscaler creates more Machines when the cluster runs out of resources to support more deployments. Any changes to the values in MachineAutoscaler resources, such as the minimum or maximum number of instances, are immediately applied to the MachineSet they target.
To deploy a ClusterAutoscaler, create a YAML file with the following settings:
apiVersion: "autoscaling.openshift.io/v1"
kind: "ClusterAutoscaler"
metadata:
  name: "default"
spec:
  podPriorityThreshold: -10
  resourceLimits:
    maxNodesTotal: 24
    cores:
      min: 8
      max: 128
    memory:
      min: 4
      max: 256
  scaleDown:
    enabled: true
    delayAfterAdd: 10m
    delayAfterDelete: 5m
    delayAfterFailure: 30s
    unneededTime: 60s
The settings are pretty self-explanatory, but you can find a full breakdown here. These values apply to your whole cluster. Create the resource with the following:
oc create -f ClusterAutoscaler.yaml
Secondly, you need to create the MachineAutoscaler YAML file, which defines which MachineSet to scale, and the limits.
apiVersion: "autoscaling.openshift.io/v1beta1"
kind: "MachineAutoscaler"
metadata:
  name: "{Name_of_MachineAutoscaler}"
  namespace: "openshift-machine-api"
spec:
  minReplicas: 1
  maxReplicas: 12
  scaleTargetRef:
    apiVersion: machine.openshift.io/v1beta1
    kind: MachineSet
    name: {MachineSet_Name}
Then create the resource from the YAML file.
oc create -f MachineAutoscaler.yaml
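If you have several MachineSets to cover, writing a file per MachineSet gets tedious. Below is a small sketch that renders the same MachineAutoscaler definition from three arguments. The function name render_machineautoscaler is hypothetical, and naming the MachineAutoscaler after its MachineSet is simply my own convention, not a requirement.

```shell
#!/bin/sh
# Sketch: print a MachineAutoscaler manifest for a given MachineSet.
# Assumption: the MachineAutoscaler is named after the MachineSet it targets.
render_machineautoscaler() {
  machineset="$1"  # MachineSet to scale
  min="$2"         # minReplicas
  max="$3"         # maxReplicas
  if [ "$min" -gt "$max" ]; then
    echo "minReplicas ($min) must not exceed maxReplicas ($max)" >&2
    return 1
  fi
  cat <<EOF
apiVersion: autoscaling.openshift.io/v1beta1
kind: MachineAutoscaler
metadata:
  name: ${machineset}
  namespace: openshift-machine-api
spec:
  minReplicas: ${min}
  maxReplicas: ${max}
  scaleTargetRef:
    apiVersion: machine.openshift.io/v1beta1
    kind: MachineSet
    name: ${machineset}
EOF
}

# Usage (needs a logged-in oc session):
#   render_machineautoscaler {MachineSet_Name} 1 12 | oc create -f -
```

Afterwards, oc get machineautoscaler -n openshift-machine-api should list the new resource alongside the MachineSet it targets.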
You can extend this even further in terms of control using pod priority to control scheduling decisions, which you can read up on here.
Regards