
vSphere with Tanzu – Can I disable DRS?

Can I disable DRS?

No.

Why can’t I disable DRS when Workload Management is enabled?

DRS is a mandatory feature for Workload Management; the WCP service relies on objects such as Resource Pools to operate.

  • Update – 29th October

The vSphere with Tanzu Documentation has now been updated with this statement.

Caution: Do not disable vSphere DRS after you configure the Supervisor Cluster. Having DRS enabled at all times is a mandatory prerequisite for running workloads on the Supervisor Cluster. Disabling DRS leads to breaking your Tanzu Kubernetes clusters.
What happens if I attempt to disable DRS?

If you disable DRS in a cluster where Workload Management is enabled, you will be presented with the following message.

The key part of the message below is “the cluster will enter an unrecoverable state.”

The system will let you proceed past this message and disable DRS. DON’T DO IT!

[Image: warning message displayed when disabling DRS on a Workload Management enabled cluster]

What if I need to stop VMs being vMotioned in my cluster?

Keep DRS enabled, and set the DRS automation level to Manual or Partially Automated.

[Image: cluster DRS automation level settings]
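If you prefer to make this change from the command line, here is a minimal sketch using the govc CLI; the cluster inventory path is just an example, and flag support can vary between govc versions, so check govc cluster.change -h first.

# keep DRS enabled, but drop the automation level to Manual
govc cluster.change -drs-mode manual /Datacenter/host/Cluster-01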

I really need to disable DRS, what do I do?

Ring VMware Support and discuss your requirements and the situation you find yourself in.

How do I stop my admins accidentally disabling DRS?

This KB article may help. It also helps to set appropriate RBAC permissions for anyone accessing your vCenter, rather than giving them full administrator rights that allow them to change settings they shouldn’t.

If you are unsure about any of this, contact VMware Support.

Do you have a fantastic meme to end this blog post with?

Yes.

[Meme: “Just because you can, doesn’t mean you should”]

Regards

Dean Lewis


Using vRealize Log Insight Cloud to archive on-premise Log Insight Data

vRealize Log Insight 8.6 brings the ability to build a hybrid log management platform, utilizing the functionality of an on-premises deployment of vRLI and vRLI Cloud.

In this blog post, we’ll be looking at how to configure the following capability from the release notes:

  • Simplify Log Archival with Non-Indexed Partitions: Use vRealize Log Insight Cloud to archive logs to meet your long-term retention requirements. vRealize Log Insight Cloud provides a no-limit logging solution at a low cost and eliminates any storage management overheads of the past. This enables easy accessibility to archived logs through on-demand queries.

For this, you will need access to a vRealize Log Insight Cloud Instance, with a cloud proxy deployed to your environment that can be accessed by the on-premises vRealize Log Insight platform.

The expectation is that you would forward your vRealize Log Insight on-premises logs to the vRealize Log Insight Cloud instance, storing them only in a Non-Indexed Partition (discussed below). Your on-premises deployment then acts as your easy-to-analyse, near-time (within 30 days) copy of your logs.

In this blog post I also explore the configuration and use of Indexed Partitions, which essentially offer that same near-time usability and analysis of logs.

The high-level steps for the configuration discussed in this blog post are:

  • Send infrastructure or application logs to your on-premises vRealize Log Insight deployment
  • Setup the cloud proxy (if not already done)
  • Setup log forwarding from the on-premises Log Insight instance
  • In vRealize Log Insight Cloud, configure a Non-Indexed Partition to receive the forwarded logs
What are Log Partitions?

Log Partitions are a feature that allows you to ingest logs based on user-defined filters. This feature is available as a paid subscription (or Trial).

There are two types of Log Partitions:

  • Indexed Partitions
    • Stores logs for up to 30 days
    • Billed only for volume of logs ingested into the partition
    • Search and analyse logs in this partition without additional costs
  • Non-Indexed Partitions
    • Stores logs for up to 7 years
    • Billed for the volume of logs ingested into the partition, and for searching the logs.
    • If you need to query logs frequently, you can move logs to a recall partition for 30 days.
      • No additional cost for searching and analysing logs in the recall partition

Logs that do not match the query criteria of any of the configured partitions will be stored in the Default Indexed Partition. This partition is read-only and stores logs for 30 days.

Note:  

- Alerts and dashboard widgets are not operational in non-indexed partitions.
- Log partitions store logs ingested in the last 24 hours only.
- You can create a maximum of 10 log partitions in an organization.
Video Walk-through

Example Logs

In my Log Insight environment, I have set up the Fluentd configuration to forward the Tanzu Kubernetes Grid logs from two clusters to vRealize Log Insight (on-premises deployment).

You can find the configuration settings for this within vRealize Log Insight, under the Sources Tab > Containers > Tanzu Kubernetes Grid.

[Image: vRealize Log Insight – Fluentd configuration guidance for Tanzu Kubernetes Grid]

[Image: Tanzu Kubernetes Grid logs arriving in vRealize Log Insight]

Setup the Cloud Proxy

Continue reading Using vRealize Log Insight Cloud to archive on-premise Log Insight Data


Kasten K10 – Air gap installation using Harbor Image Registry

In this blog post, I will cover the steps for an air-gapped installation of Kasten K10, for situations where your Kubernetes cluster doesn’t have internet access to pull the container images directly from their online locations.

Pre-requisites
  • Image Registry that is accessible by your Kubernetes cluster
  • A client that has access to download the container images and push them to the Image Registry
    • In this example, I am using my local machine which has docker installed.
  • Helm downloaded
    • Run the following to get the Helm chart locally for the install (this assumes the Kasten Helm repository, https://charts.kasten.io/, has already been added via helm repo add).
helm repo update && \
    helm fetch kasten/k10 --version=<k10-version>

Example for Kasten K10 4.5.0

helm repo update && \
    helm fetch kasten/k10 --version=4.5.0

This will download a file, for example "k10-4.5.0.tgz"
Log into your Image Registry

First, you need to ensure that your Docker client (or similar) has authenticated to the Image Registry which your air-gapped Kubernetes cluster can access.

When using Harbor and Docker, I typically use this method with a robot account for programmatic access.
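As a rough illustration, here is what authenticating with a robot account and manually mirroring a single image might look like; the registry hostname, project and robot account name below are made up, and the post itself goes on to use Kasten’s own tooling for the image copy step.

# log in to Harbor with a robot account (quote the username, it contains a $)
docker login harbor.example.com -u 'robot$kasten-push'

# generic pattern for mirroring one image into a Harbor project
docker pull <source-registry>/<image>:4.5.0
docker tag <source-registry>/<image>:4.5.0 harbor.example.com/kasten/<image>:4.5.0
docker push harbor.example.com/kasten/<image>:4.5.0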

However, when running the Kasten tooling, which we’ll discuss next, I kept hitting an error.

Continue reading Kasten K10 – Air gap installation using Harbor Image Registry


How to create vRO Dynamic Types for vRA Custom Resources

This follow-on blog post, diving into how we created the vRA integration with DMS, comes from Katherine Skilling, who kindly offered to guest spot and provide the additional content regarding the work we have done internally. You can find her details at the end of this blog post.

In an earlier blog post Dean covered the use of vRA (vRealize Automation) Custom Resources in the context of using vRA to create Databases in DMS (Data Management for VMware Tanzu) and how to create custom day 2 actions. In this post, we will look at how we created the Dynamic Types in vRO (vRealize Orchestrator) to facilitate the creation of the custom resources in vRA.

Introduction – What are Dynamic Types?

Dynamic Types are custom objects in vRO created to extend the schema so that you can create and manage third-party objects within vRO. Each type has a definition that contains the object’s properties, as well as its relationship within the overall namespace, which is the top level in the Dynamic Types hierarchy.

As we started working on our use case, we looked at a tool (published on VMware Code) that would generate Dynamic Types based on an API Swagger specification. The problem we encountered was the tool was quite complex and our API Swagger for Data Management for VMware Tanzu (DMS) didn’t seem to quite fit with the expected format.

This meant we ended up with lots of orphaned entries after running the tool and hoping it would do all the heavy lifting for us. After spending some time investigating and troubleshooting, it became clear we didn’t understand Dynamic Types, and how they are created, well enough to resolve all our issues. Instead, we decided to scale back our plans and focus on just the database object we really needed initially. We could use it as a learning exercise, and then revisit the generator tool later once we had a more solid foundation.

To get a better understanding of how Dynamic Types work, I recommend this blog from Mike Bombard. He walks through a theoretical example using a zoo and animals to show you how objects are related, as well as how to create the required workflows. I like this particular blog because you don’t need to consider how you are going to get values from a third-party system, so it’s easy to follow along and see the places where you would be making an external connection to retrieve data. It also helped me to understand the relationships between objects without getting mixed up in the properties provided by technical objects.

After reading Mike’s post, I realised that we only had a single object for our use case: a database within DMS. We didn’t have any other objects related to it; it didn’t have a parent object and it didn’t have any children. So, when we created a Dynamic Type, we would need to generate a placeholder object to act as the parent for the database. I chose to name this databaseFolder for simplicity, and because I’m a visual person and like to organise things inside folders. These databaseFolders do not exist in DMS; they are just objects I created within vRO, with no real purpose or properties other than that the DMS databases are their children in the Dynamic Types inventory.

Stub Workflows

When you define a new Dynamic Type, you must create or associate four workflows to it, which are known as stubs:

  1. Find By Id
  2. Find All
  3. Has Children in Relation
  4. Find Relation

These workflows tell vRO how it can find the Dynamic Type and what its position is in the hierarchy in relation to other types. You can create one set of workflows to share across all Types or you can create a set of workflows per Type. For our use case we only needed one set of the workflows, so we created our code such that the workflows would be dedicated to just the database and databaseFolder objects.

It’s important to know that vRO will run these workflows automatically when administrators browse the vRO inventory, or when Custom Resources are used within vRA. They are not started manually by administrators; if you do test them by running them manually, you may struggle to populate the input values correctly.

I’ll give you a bit of background to the different workflows next.

Find By Id Workflow

This workflow is automatically run whenever vRO needs to locate a particular instance of a Dynamic Type, such as when used with Custom Resources in vRA for self-service provisioning.  The workflow follows these high-level steps:

  1. Check if the object being processed is the parent object (databaseFolder) or the child object (database).
  2. If it is a databaseFolder, create a new Dynamic Type for a databaseFolder.
  3. If it is a database, perform the activity required to locate the object using its id value; in our case, this is a REST API call to DMS to retrieve a single database.
  4. Perform any activities required to create the object and set its properties; in our case, this is extracting the database details from the REST API call results, as DMS returns values such as the id and the name in a JSON object formatted as a string.

Find All Workflow

This workflow is automatically run whenever vRO needs to locate all instances of a Dynamic Type, such as when the Dynamic Types namespace is browsed in the vRO client, where it is called as a sub-workflow of the Find Relation workflow. The workflow follows these high-level steps:

  1. Check if the object being processed is the parent object (databaseFolder) or the child object (database).
  2. If it is a databaseFolder, create a new Dynamic Type for a databaseFolder.
  3. If it is a database, perform the activity required to locate all instances of the objects; in our case, this is a REST API call to DMS to retrieve all databases.
  4. Perform any activities required to loop through each of the instances found, creating an object and setting its properties for each one. In our case, this is extracting all of the database details from the REST API call results, looping through each one, and extracting values such as the id and the name.

Has Children in Relation Workflow

This workflow is used by vRO to determine whether it should expand the hierarchy when an object is selected in the Dynamic Types namespace within the vRO client. If an object has child objects, these are displayed underneath it in the namespace, in the same way as the databases are displayed under the databaseFolders. The workflow follows these high-level steps:

  1. Check if the object being processed is the parent object (databaseFolder) or the child object (database) by checking its parentType and relationName values which are provided as workflow inputs.
  2. If it is a databaseFolder, call the Find Relation workflow to retrieve all related objects.
  3. If it is any other object type, set the result to false to indicate that there are no child objects related to the selected object to display in the hierarchy.

Find Relation Workflow

This workflow is used by vRO when an object is selected in the Dynamic Types namespace within the vRO client. If an object has child objects, these are displayed underneath it in the namespace, in the same way as the databases are displayed under the databaseFolders. vRO automatically runs this workflow each time the Dynamic Types namespace is browsed by an administrator, to find any related objects it needs to display. The workflow follows these high-level steps:

  1. Check if the object being processed is the parent object (databaseFolder) or the child object (database) by checking its parentType and relationName values which are provided as workflow inputs.
  2. If it is a databaseFolder, and the relationName value is “namespace-children” (a special value assigned to the very top level in the selected namespace), then create a new Dynamic Type for a databaseFolder.
  3. If it is a database, set the type to DMS.database and then call the Find All workflow to retrieve the Dynamic Type objects for all database instances.

Creating a Dynamic Type

Defining a Namespace

The first stage in creating our Dynamic Type is to define a new Namespace.

Continue reading How to create vRO Dynamic Types for vRA Custom Resources


Kubernetes Troubleshooting – Kubelet Unable to attach or mount volumes – timed out waiting for the condition

The Issue

When I updated my Kasten application in my Kubernetes cluster, I found that one of the pods was stuck in “init” status.

dean@dean [ ~ ] (⎈ |tkg-wld-01-admin@tkg-wld-01:default) # k get pods -n kasten-io -w
NAME                                  READY   STATUS     RESTARTS   AGE
aggregatedapis-svc-78564d4697-wl9wg   1/1     Running    0          3m9s
auth-svc-7977b9684b-zph27             1/1     Running    0          3m11s
catalog-svc-7ff7779b75-kmvsr          0/2     Init:0/2   0          2m43s

[Image: kubectl get pods output showing catalog-svc stuck in Init]

Running a describe on that pod pointed to the fact the volume could not be attached.
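The describe command used was along these lines, with the pod name taken from the output above:

kubectl describe pod catalog-svc-7ff7779b75-kmvsr -n kasten-io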

Events:
  Type     Reason       Age    From               Message
  ----     ------       ----   ----               -------
  Normal   Scheduled    2m58s  default-scheduler  Successfully assigned kasten-io/catalog-svc-7ff7779b75-kmvsr to tkg-wld-01-md-0-54598b8d99-rpqjf
  Warning  FailedMount  55s    kubelet            Unable to attach or mount volumes: unmounted volumes=[catalog-persistent-storage], unattached volumes=[k10-k10-token-lbqpw catalog-persistent-storage]: timed out waiting for the condition
The Cause

Somewhere along the line, some stale VolumeAttachments had been left behind, linked to a Kubernetes node that no longer exists in my cluster. This looks to be causing confusion in the cluster about which node should be attaching the volume.

The image below shows:

  • Find the Persistent Volume name linked to the claim referenced in the pod events
  • Map this to the available VolumeAttachments
  • Cross-reference the node named in each VolumeAttachment against the nodes available in the cluster
    • I’ve highlighted the missing node in the red box

[Image: kubectl get pv, kubectl get volumeattachments and kubectl get nodes output]
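The commands behind that screenshot are roughly the following; resource names come from my environment, so adjust them to yours.

kubectl get pv                    # find the Persistent Volume bound to the failing claim
kubectl get volumeattachments     # each attachment lists the PV and the node it targets
kubectl get nodes                 # compare against the nodes that actually exist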

The Fix

The fix is to remove the stale VolumeAttachment.

kubectl delete volumeattachment [volumeattachment_name]

[Image: kubectl delete volumeattachment output]

After this, your pod should eventually pick up the change and retry, or you can remove the pod and let Kubernetes replace it for you (so long as it’s part of a Deployment or other configuration managing your application).
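For example, to remove the stuck pod from earlier and let its Deployment recreate it (pod name taken from the output above):

kubectl delete pod catalog-svc-7ff7779b75-kmvsr -n kasten-io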

Regards

Dean Lewis