Tag Archives: Driver

Kubernetes

vSphere CSI Driver Images unavailable from gcr.io – quick fix

The Issue

Someone has deleted the cloud-provider-vsphere project in the gcr.io container image registry. The default image pull policy for the vSphere CSI driver in VMware’s manifests is Always, so the kubelet goes back to the registry every time a pod starts. If you reboot your cluster, the CSI pods can no longer pull their images and the driver will not come back online.
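To make the failure mode concrete, here is a minimal sketch of how the three Kubernetes image pull policies interact with an image that is already cached on a node. The function name `contacts_registry` is hypothetical and the model is deliberately simplified: it only answers whether the registry has to be reachable before the container can start.

```python
def contacts_registry(pull_policy: str, image_cached: bool) -> bool:
    """Return True if the kubelet must reach the registry before starting the container.

    Simplified model of the documented imagePullPolicy behaviour.
    """
    if pull_policy == "Always":
        # The image is resolved against the registry on every start,
        # even when a copy is already cached on the node.
        return True
    if pull_policy == "IfNotPresent":
        # Only pulled when there is no local copy.
        return not image_cached
    if pull_policy == "Never":
        # Never pulled; the container fails if the image is missing locally.
        return False
    raise ValueError(f"unknown imagePullPolicy: {pull_policy!r}")


# Cached image with policy Always: the registry is still needed.
print(contacts_registry("Always", image_cached=True))
# Cached image with policy IfNotPresent: no registry access required.
print(contacts_registry("IfNotPresent", image_cached=True))
```

With Always, a cached copy of the image does not help once the registry project is gone, which is why a reboot is enough to take the pods down.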

vSphere CSI driver image unavailable - project deleted

This is what my cluster looked like when I booted it up today:

❯ kubectl get pods -n vmware-system-csi
NAME                                      READY   STATUS             RESTARTS        AGE
vsphere-csi-controller-776fb75cd8-ptw4s   5/7     ErrImagePull       0               84m
vsphere-csi-controller-776fb75cd8-qt7kv   5/7     ImagePullBackOff   0               84m
vsphere-csi-controller-776fb75cd8-s7btf   5/7     ImagePullBackOff   0               84m
vsphere-csi-node-5qjjw                    1/3     CrashLoopBackOff   80 (111s ago)   142d
vsphere-csi-node-fmdkz                    2/3     ImagePullBackOff   84 (3m5s ago)   143d
vsphere-csi-node-gbt9w                    1/3     CrashLoopBackOff   6 (26s ago)     5m56s
vsphere-csi-node-jkj98                    1/3     CrashLoopBackOff   86 (24s ago)    143d
vsphere-csi-node-r69bl                    1/3     CrashLoopBackOff   85 (102s ago)   143d
vsphere-csi-node-ww2zx                    2/3     ImagePullBackOff   89 (3m5s ago)   143d

And when describing the pod:

❯ kubectl describe pod -n vmware-system-csi vsphere-csi-controller-776fb75cd8-ptw4s
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Warning FailedScheduling 85m default-scheduler 0/6 nodes are available: 2 node(s) didn't match Pod's node affinity/selector, 4 node(s) were unschedulable. preemption: 0/6 nodes are available: 6 Preemption is not helpful for scheduling.
Warning FailedScheduling 84m default-scheduler 0/6 nodes are available: 6 node(s) were unschedulable. preemption: 0/6 nodes are available: 6 Preemption is not helpful for scheduling.
Warning FailedScheduling 6m54s default-scheduler 0/6 nodes are available: 6 node(s) had untolerated taint {node.kubernetes.io/unschedulable: }. preemption: 0/6 nodes are available: 6 Preemption is not helpful for scheduling.
Normal Scheduled 6m27s default-scheduler Successfully assigned vmware-system-csi/vsphere-csi-controller-776fb75cd8-ptw4s to talos-2tp-6ld
Normal Created 6m26s kubelet Created container liveness-probe
Warning Failed 6m26s kubelet Failed to pull image "gcr.io/cloud-provider-vsphere/csi/release/syncer:v3.0.0": failed to pull and unpack image "gcr.io/cloud-provider-vsphere/csi/release/syncer:v3.0.0": failed to resolve reference "gcr.io/cloud-provider-vsphere/csi/release/syncer:v3.0.0": failed to authorize: failed to fetch anonymous token: unexpected status from GET request to https://gcr.io/v2/token?scope=repository%3Acloud-provider-vsphere%2Fcsi%2Frelease%2Fsyncer%3Apull&service=gcr.io: 401 Unauthorized
Normal Started 6m26s kubelet Started container csi-attacher
Normal Pulled 6m26s kubelet Container image "k8s.gcr.io/sig-storage/csi-resizer:v1.7.0" already present on machine
Normal Created 6m26s kubelet Created container csi-resizer
Normal Started 6m26s kubelet Started container csi-resizer
Normal Pulling 6m26s kubelet Pulling image "gcr.io/cloud-provider-vsphere/csi/release/driver:v3.0.0"
Warning Failed 6m26s kubelet Failed to pull image "gcr.io/cloud-provider-vsphere/csi/release/driver:v3.0.0": failed to pull and unpack image "gcr.io/cloud-provider-vsphere/csi/release/driver:v3.0.0": failed to resolve reference "gcr.io/cloud-provider-vsphere/csi/release/driver:v3.0.0": failed to authorize: failed to fetch anonymous token: unexpected status from GET request to https://gcr.io/v2/token?scope=repository%3Acloud-provider-vsphere%2Fcsi%2Frelease%2Fdriver%3Apull&service=gcr.io: 401 Unauthorized
Warning Failed 6m26s kubelet Error: ErrImagePull
Normal Pulled 6m26s kubelet Container image "k8s.gcr.io/sig-storage/livenessprobe:v2.9.0" already present on machine
Normal Pulled 6m26s kubelet Container image "k8s.gcr.io/sig-storage/csi-attacher:v4.2.0" already present on machine
Normal Started 6m26s kubelet Started container liveness-probe
Normal Pulling 6m26s kubelet Pulling image "gcr.io/cloud-provider-vsphere/csi/release/syncer:v3.0.0"
Normal Created 6m26s kubelet Created container csi-attacher
Warning Failed 6m26s kubelet Error: ErrImagePull
Normal Pulled 6m26s kubelet Container image "k8s.gcr.io/sig-storage/csi-provisioner:v3.4.0" already present on machine
Normal Created 6m26s kubelet Created container csi-provisioner
Normal Started 6m25s kubelet Started container csi-provisioner
Normal Pulled 6m25s kubelet Container image "k8s.gcr.io/sig-storage/csi-snapshotter:v6.2.1" already present on machine
Normal Created 6m25s kubelet Created container csi-snapshotter
Normal Started 6m25s kubelet Started container csi-snapshotter
Warning Failed 6m24s kubelet Error: ImagePullBackOff
Normal BackOff 6m24s kubelet Back-off pulling image "gcr.io/cloud-provider-vsphere/csi/release/syncer:v3.0.0"
Warning Failed 6m24s kubelet Error: ImagePullBackOff
Normal BackOff 83s (x21 over 6m24s) kubelet Back-off pulling image "gcr.io/cloud-provider-vsphere/csi/release/driver:v3.0.0"
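
The event list is noisy, so it can help to reduce it to just the image references that failed. The following is a small sketch, assuming the `kubectl describe pod` output has been saved as text; the helper name `failed_images` is hypothetical, and the regular expression only targets the `Failed to pull image "..."` wording seen in the events above.

```python
import re

# Matches the kubelet event text: Failed to pull image "<reference>"
FAILED_PULL = re.compile(r'Failed to pull image "([^"]+)"')


def failed_images(events_text: str) -> list[str]:
    """Return the unique image references that failed to pull, sorted."""
    return sorted(set(FAILED_PULL.findall(events_text)))


# Abbreviated sample built from the events shown above.
sample = '''
Warning Failed 6m26s kubelet Failed to pull image "gcr.io/cloud-provider-vsphere/csi/release/syncer:v3.0.0": failed to pull and unpack image
Normal Pulled 6m26s kubelet Container image "k8s.gcr.io/sig-storage/csi-resizer:v1.7.0" already present on machine
Warning Failed 6m26s kubelet Failed to pull image "gcr.io/cloud-provider-vsphere/csi/release/driver:v3.0.0": failed to pull and unpack image
'''

for image in failed_images(sample):
    print(image)
```

For this pod, only the two images under gcr.io/cloud-provider-vsphere are reported; the sidecar images are not, because they were already present on the machine.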

The Cause

Who knows? Maybe it cost Broadcom too much to host the images in Google Cloud. Or maybe they are moving to a model where you can only access the files when you pay for VCF.

The Workaround

Luckily, the images are mirrored by Rancher, so I took the vSphere CSI manifest from:

– https://raw.githubusercontent.com/kubernetes-sigs/vsphere-csi-driver/v3.3.1/manifests/vanilla/vsphere-csi-driver.yaml

and updated the image locations. You can get the updated file from my GitHub Gist below.
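
If you would rather patch the manifest yourself, the image swap can be sketched roughly as follows. This is an illustration only: the `mirror.example.com/...` locations in `IMAGE_MAP` are placeholders, not the real Rancher mirror names, and `rewrite_images` is a hypothetical helper. It also assumes unquoted `image:` values and registries without a port number.

```python
import re

# ASSUMPTION: placeholder mirror locations, not confirmed by the post.
# Replace the right-hand sides with repositories you have verified yourself.
IMAGE_MAP = {
    "gcr.io/cloud-provider-vsphere/csi/release/driver": "mirror.example.com/vsphere-csi/driver",
    "gcr.io/cloud-provider-vsphere/csi/release/syncer": "mirror.example.com/vsphere-csi/syncer",
}

# Groups: (indentation + "image:"), (repository up to the tag or digest), (rest of the line)
IMAGE_LINE = re.compile(r"^([ \t]*(?:-[ \t]*)?image:[ \t]*)([^\s:@]+)(.*)$", re.MULTILINE)


def rewrite_images(manifest: str, image_map: dict[str, str]) -> str:
    """Swap image repositories on `image:` lines, keeping tags and indentation."""
    def swap(match: re.Match) -> str:
        prefix, repo, rest = match.groups()
        # Repositories that are not in the map are left untouched.
        return f"{prefix}{image_map.get(repo, repo)}{rest}"

    return IMAGE_LINE.sub(swap, manifest)


# Abbreviated sample using the image names from the events above.
manifest = """\
      containers:
        - name: vsphere-csi-controller
          image: gcr.io/cloud-provider-vsphere/csi/release/driver:v3.0.0
          imagePullPolicy: "Always"
        - name: vsphere-syncer
          image: gcr.io/cloud-provider-vsphere/csi/release/syncer:v3.0.0
        - name: csi-attacher
          image: k8s.gcr.io/sig-storage/csi-attacher:v4.2.0
"""

print(rewrite_images(manifest, IMAGE_MAP))
```

Only the repository part is replaced, so the tags stay as they were; check that the mirror actually carries the same tags before applying the result.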

Using the new vSphere Kubernetes Driver Operator with Red Hat OpenShift via Operator Hub

What is the vSphere Kubernetes Driver Operator (VDO)?

This Kubernetes Operator has been designed and created as part of the VMware and IBM Joint Innovation Labs program, with the aim of simplifying the deployment and lifecycle of VMware storage and networking Kubernetes driver plugins on any Kubernetes platform, including Red Hat OpenShift. We also talked about this at VMworld 2021 in a joint session with IBM and Red Hat.

The vSphere Kubernetes Driver Operator (VDO) exposes custom resources to configure the CSI and CNS drivers and, through a Go-based CLI tool, adds validation and error checking as well, making the drivers simple for whoever operates the cluster to deploy and configure.

The Kubernetes Operator currently covers the existing CPI and CSI drivers, which are separately maintained projects found on GitHub.

This operator will remain CNI agnostic, so CNI management will not be included; Antrea, for example, already has its own operator.

Below is the high-level architecture; you can read a more detailed deep dive here.

vSphere Kubernetes Drivers Operator - Architecture Topology

Installation Methods

You have two main installation methods, which will also affect the pre-requisites below.

If you are using Red Hat OpenShift, you can install the Operator via Operator Hub, as this is a certified Red Hat Operator. You can also configure the CPI and CSI driver installations via the UI.

  • Supported for OpenShift 4.9 currently.

Alternatively, you can install manually and use the vdoctl CLI tool; this is also the route to take for a vanilla Kubernetes installation.

This blog post will cover the UI method using Operator Hub.

Pre-requisites
