Tag Archives: Error


Tanzu Kubernetes Grid 1.6 – Management Cluster deployment failure – unable to patch the cluster object

The Issue

When deploying a brand new Tanzu Kubernetes Grid Management Cluster to a vSphere environment, we kept hitting failures like the one below. The deployment was very vanilla, with the default settings and no extra metadata entered into the build.

!! [1223 15:26:17.84239]: init.go:732] Failure while deploying management cluster, Here are some steps to investigate the cause:
!! [1223 15:26:17.84256]: init.go:733] Debug:
!! [1223 15:26:17.84262]: init.go:734] kubectl get po,deploy,cluster,kubeadmcontrolplane,machine,machinedeployment -A --kubeconfig /home/michael/.kube-tkg/tmp/config_Qd01VhPd
!! [1223 15:26:17.84272]: init.go:735] kubectl logs deployment.apps/ -n  manager --kubeconfig /home/michael/.kube-tkg/tmp/config_Qd01VhPd
!! [1223 15:26:17.84278]: init.go:738] To clean up the resources created by the management cluster:
!! [1223 15:26:17.84283]: init.go:739] tanzu management-cluster delete
✘ [1223 15:26:17.84291]: init.go:91] unable to set up management cluster, : unable to patch cluster object: unable to patch optional metadata under labels: unable to patch the management cluster object with optional metadata: unable to patch the cluster object: error while applying patch for "&TypeMeta{Kind:,APIVersion:,}" tkg-system/tkg-mgmt-vsphere-20221223151757: Cluster.cluster.x-k8s.io "tkg-mgmt-vsphere-20221223151757" is invalid: [metadata.labels: Invalid value: "": name part must be non-empty, metadata.labels: Invalid value: "": name part must consist of alphanumeric characters, '-', '_' or '.', and must start and end with an alphanumeric character (e.g. 'MyName', or 'my.name', or '123-abc', regex used for validation is '([A-Za-z0-9][-A-Za-z0-9_.]*)?[A-Za-z0-9]')]

The Cause

The tooling creates an erroneous value in the cluster config file, which causes the build error.

The Fix

Search for the latest yaml file created in:

~/.config/tanzu/tkg/clusterconfigs/
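If you are not sure which file was created last, you can list the directory by modification time (a minimal sketch using standard shell tools):

# Newest config files appear at the top of the listing
ls -lt ~/.config/tanzu/tkg/clusterconfigs/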

and comment out the following line:

CLUSTER_LABELS: :,

# The line will now look like this:

#CLUSTER_LABELS: :,
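If you prefer to do this from the shell, a sed one-liner along these lines will comment the line out (a sketch for GNU sed on Linux; substitute your actual config file name):

# Prefix the faulty CLUSTER_LABELS line with a comment character
sed -i 's/^CLUSTER_LABELS: :,/#CLUSTER_LABELS: :,/' ~/.config/tanzu/tkg/clusterconfigs/{file_name.yaml}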

Now re-run the creation of your cluster using the CLI;

tanzu mc create --file {file_name.yaml}

Regards

Dean Lewis


OpenShift – Cluster-Monitoring-Operator Pod Error – cannot verify user is non-root

The Issue

After building a brand new OpenShift 4.6.9 cluster, I noticed one of the pods was not running correctly.

oc get pods -n openshift-monitoring
.....
NAME                                          READY   STATUS                       RESTARTS   AGE
cluster-monitoring-operator-f85f7bcb5-84jw5   1/2     CreateContainerConfigError   0          112m

Upon inspection of the pod;

oc describe pod cluster-monitoring-operator-XXX -n openshift-monitoring

I could see the following error message;

Error: container has runAsNonRoot and image has non-numeric user (nobody), cannot verify user is non-root

The Cause

There is a Red Hat article about this, but it is gated. The cause is that the cluster-monitoring-operator pod is wrongly assigned the non-root SCC.
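You can confirm which SCC was actually applied to the pod by checking its openshift.io/scc annotation (a quick check, reusing the pod name from above):

# The annotation records which SCC admitted the pod
oc get pod cluster-monitoring-operator-f85f7bcb5-84jw5 -n openshift-monitoring -o yaml | grep 'openshift.io/scc'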

The Fix

Currently there is no permanent fix provided by Red Hat, but you can track this bug.

However, the workaround is simply to delete the pod. It will be recreated and should load with the correct permissions.
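For example (the pod name suffix will differ in your cluster):

# Delete the stuck pod; the deployment controller recreates it automatically
oc delete pod cluster-monitoring-operator-f85f7bcb5-84jw5 -n openshift-monitoring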

Regards


vRealize LifeCycle Manager – New License – Exception while loading DLF

Adding a new license into the vRLCM locker fails with;

Exception while loading DLF. Check /var/log/vlcm for more detail

Sorry I didn’t take a screenshot of this in the UI.

In the log file, you will see the error code;

LCMLICENSINGCONFIG11005
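If you want to pull the related entries straight from the appliance, you can grep the log directory for that error code (a rough sketch, assuming SSH access to the vRLCM appliance as root):

# Search the vRLCM logs for the licensing error code
grep -R "LCMLICENSINGCONFIG11005" /var/log/vlcm/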

Overall, the logs are not very helpful;

INFO [pool-2-thread-12] c.v.v.l.p.a.s.Task - -- Injecting task failure event. Error Code : 'LCMLICENSINGCONFIG11005', Retry : 'true', Causing Properties : '{ CAUSE :: }' 
com.vmware.vrealize.lcm.plugin.core.licensing.common.exception.ValidateLicensingException: Exception while loading DLF. Check logs for more detail
at com.vmware.vrealize.lcm.plugin.core.licensing.task.ValidateLicenseTask.execute(ValidateLicenseTask.java:137) [vmlcm-licensingplugin-core-2.1.0-SNAPSHOT.jar!/:?]
at com.vmware.vrealize.lcm.automata.core.TaskThread.run(TaskThread.java:45) [vmlcm-engineservice-core-2.1.0-SNAPSHOT.jar!/:?]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_221]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_221]
at java.lang.Thread.run(Thread.java:748) [?:1.8.0_221]

The Fix

Reboot the vRLCM appliance.
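If you have SSH access to the appliance, something like the below will do it; alternatively, restart the VM from vCenter (the hostname is a placeholder for your own appliance):

# Restart the vRLCM appliance, then retry adding the license
ssh root@<vrlcm-fqdn> 'reboot'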

Regards

Dean


Nested VCF Lab – Error while creating NFS datastore

Whilst deploying my nested VCF environment for my home lab, I kept hitting the same issue over and over again, even when I rolled the environment back and redeployed it.

VCF bring up error while creating NFS datastore

Error while creating NFS Datastore for host XXX.XXX.XXX.XXX

Looking into the debug log file on the Cloud Builder appliance, found in the below location;

/var/log/vmware/vcf/bringup/vcf-bringup-debug.log
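To follow the bring-up attempt live, you can tail the file over SSH (a simple sketch, assuming root access to the Cloud Builder appliance):

# Follow the bring-up debug log while the NFS datastore task runs
tail -f /var/log/vmware/vcf/bringup/vcf-bringup-debug.log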

You can see basically the same error message, which is not much help.

ERROR [c.v.e.s.o.model.error.ErrorFactory,pool-3-thread-7] [TP9EK1] VCF_HOST_CREATE_NFS_DATASTORE_FAILED

And the log ends with the below entries. I’ve left my Task ID numbers in, but obviously these are unique to my bring-up;

DEBUG [c.v.e.s.o.c.ProcessingTaskSubscriber,pool-3-thread-7] Collected the following errors for task with name CreateNFSDatastoreOnHostsAction and ID 7f000001-6ed0-12cd-816e-d1f7a33f006f: [ExecutionError [errorCode=null, errorResponse=LocalizableErrorResponse(messageBundle=com.vmware.vcf.common.fsm.plugins.action.hostmessages)]]

DEBUG [c.v.e.s.o.c.ProcessingTaskSubscriber,pool-3-thread-19] Invoking task CreateNFSDatastoreOnHostsAction.UNDO Description: Mount Repository NFS Datastore on ESXi Hosts, Plugin: HostPlugin, ParamBuilder null, Input map: {hosts=SDDCManagerConfiguration____13__hosts, nasDatastoreName=SDDCManagerConfiguration____13__nasDatastoreName, nfsRepoDirPath=SDDCManagerConfiguration____13__nfsRepoDirPath, repoVMIp=SDDCManagerConfiguration____13__repoVMIp}, Id: 7f000001-6ed0-12cd-816e-d1f7a33f006e ...

DEBUG [c.v.e.s.o.c.c.ContractParamBuilder,pool-3-thread-19] Contract task Mount Repository NFS Datastore on ESXi Hosts input: {"hosts":[{"address":"172.18.30.10","username":"root","password":"*****"},{"address":"172.18.30.11","username":"root","password":"*****"},{"address":"172.18.30.12","username":"root","password":"*****"},{"address":"172.18.30.13","username":"root","password":"*****"}],"nasDatastoreName":"lcm-bundle-repo","nfsRepoDirPath":"/nfs/vmware/vcf/nfs-mount","repoVMIp":"172.18.30.50"}

DEBUG [c.v.e.s.o.c.ProcessingTaskSubscriber,pool-3-thread-19] Collected the following errors for task with name CreateNFSDatastoreOnHostsAction and ID 7f000001-6ed0-12cd-816e-d1f7a33f006f: [ExecutionError [errorCode=null, errorResponse=LocalizableErrorResponse(messageBundle=com.vmware.vcf.common.fsm.plugins.action.hostmessages)]]

WARN  [c.v.e.s.o.c.ProcessingOrchestratorImpl,pool-3-thread-10] Processing State completed with failure

INFO  [c.v.e.s.o.core.OrchestratorImpl,pool-3-thread-15] End of Orchestration with FAILURE for Execution ID 8c9c5ab1-e48a-414e-9c4d-8936e6f12c91
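Since the failing task is just an NFS mount, you can try the same mount manually on one of the ESXi hosts to surface the underlying error (a sketch using the values from the log above; adjust for your own environment):

# Attempt the same NFS mount that the bring-up task performs
esxcli storage nfs add -H 172.18.30.50 -s /nfs/vmware/vcf/nfs-mount -v lcm-bundle-repo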

The Fix

I struggled with this one for a while. At first I suspected an IP address conflict with the SDDC Manager appliance, but it wasn’t that; I hit the same issue after trying again with a different IP address.

I discussed this with our internal support, and I was pointed in the direction of KB 1005948.
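To see which VMkernel interface an ESXi host uses to reach the SDDC Manager subnet, you can check the host routing table and test connectivity per interface (a rough sketch; the IP and interface names are examples from my lab):

# List the host routing table, including which vmk owns each route
esxcfg-route -l
# Test reachability to the SDDC Manager from a specific VMkernel interface
vmkping -I vmk0 172.18.30.50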

When I followed the article, I noticed that the default VMkernel adapter used to access my subnet and the subnet of my SDDC Manager was vmk2, which is assigned for vSAN traffic.

Continue reading Nested VCF Lab – Error while creating NFS datastore


VMware LifeCycle Manager – Migration error “SSH is not enabled or invalid” – LCMMIGRATION15102

During my migration from vRSLCM 2.1 patch 2 to the latest version 8 release, I encountered the following error;

Error Code: LCMMIGRATION15102

vRSLCM Migration Failed with SSH is not enabled or Root credential invalid. Please make sure SSH is enabled or porvide the correct root credential by adding the credential to the home page locker app

A pretty obvious error; however, the provided root credentials were correct, and I could use PuTTY to connect to my existing LCM instance.
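If you want to rule SSH out, you can confirm the service is running on the source appliance and that a root login succeeds (a quick sketch, assuming the appliance uses systemd; the hostname is a placeholder):

# On the existing vRSLCM appliance, check that sshd is up
systemctl status sshd
# From another machine, confirm a root SSH login works
ssh root@<old-vrslcm-fqdn>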


The Fix

Continue reading VMware LifeCycle Manager – Migration error “SSH is not enabled or invalid” – LCMMIGRATION15102