This article builds upon an old post I wrote many years ago: Kubernetes PVC stuck in Terminating state. That post covered the symptoms and quick fixes. This one is for platform engineers and Kubernetes operators who want to understand why resources like PVCs get stuck in Terminating, how Kubernetes handles deletion internally, and what it really means when a finalizer hangs around.
What Are Finalizers and Why Do They Matter?
In Kubernetes, deleting a resource is a two-phase operation. When a user runs kubectl delete, the object is not immediately removed from etcd. Instead, Kubernetes sets a deletionTimestamp and, if finalizers are present, waits for them to be cleared before actually removing the resource from the API server.
Finalizers are strings listed in the metadata.finalizers array. Each one signals that a controller must perform cleanup logic before the object can be deleted. This ensures consistency and is critical when external resources (cloud volumes, DNS records, firewall rules) are involved.
metadata:
  finalizers:
    - example.com/cleanup-hook
Until this list is empty, Kubernetes will not fully delete the object. This behavior is central to the garbage collection process and the reliability of resource teardown.
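A quick way to see whether an object is in this half-deleted state is to read both fields directly. This is just a sketch using the same placeholder claim name used later in this post:

# Print the deletionTimestamp and finalizers of a (hypothetical) PVC named my-claim;
# an empty deletionTimestamp means no deletion has been requested yet
kubectl get pvc my-claim -o jsonpath='{.metadata.deletionTimestamp}{"  "}{.metadata.finalizers}{"\n"}'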
Deletion Flow Internals
Here’s what actually happens under the hood:
- The user requests deletion (e.g. kubectl delete pvc my-claim)
- Kubernetes sets metadata.deletionTimestamp but leaves the object in etcd
- If metadata.finalizers is non-empty, deletion is paused
- Each controller responsible for a finalizer must reconcile the object, complete its cleanup, then remove its string from the list
- Once the list is empty, the object is garbage collected
Visual Flow
[kubectl delete] → [deletionTimestamp set]
↓
[finalizers exist?] — No → resource deleted
↓
Yes
↓
[Controller reconciles → does cleanup → removes finalizer]
↓
[All finalizers removed?] — No → Wait
↓
Yes
↓
[Object deleted from etcd]
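You can watch this flow end to end on a throwaway object. The sketch below uses the hypothetical example.com/cleanup-hook finalizer from earlier on a disposable ConfigMap (finalizer-demo is a made-up name), with you playing the role of the controller that removes the finalizer:

# Create a throwaway ConfigMap and give it a hypothetical finalizer
kubectl create configmap finalizer-demo
kubectl patch configmap finalizer-demo -p '{"metadata":{"finalizers":["example.com/cleanup-hook"]}}'

# Request deletion: the object gets a deletionTimestamp but is not removed
kubectl delete configmap finalizer-demo --wait=false
kubectl get configmap finalizer-demo -o jsonpath='{.metadata.deletionTimestamp}{"\n"}'

# Acting as the "controller": clear the finalizer, and garbage collection finishes
kubectl patch configmap finalizer-demo -p '{"metadata":{"finalizers":null}}'
kubectl get configmap finalizer-demo   # now returns NotFound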
PVCs and the kubernetes.io/pvc-protection Finalizer
This finalizer is added by the PVC Protection Controller, a core Kubernetes controller
responsible for ensuring that a PVC isn’t deleted while it’s still in use by a Pod. It’s a guardrail that prevents accidental data loss.
To view it on a PVC:
kubectl get pvc my-claim -o yaml
You’ll see:
metadata:
  finalizers:
    - kubernetes.io/pvc-protection
As long as any Pod references that PVC, even if the Pod is Terminating, Kubernetes won’t remove the finalizer. This also applies if the Pod’s deletion is delayed due to a finalizer or node unavailability.
Why Finalizers Hang Around (and PVCs Get Stuck)
If the controller responsible for a finalizer crashes or is unavailable, it can’t remove its entry.
As a result, the resource stays in Terminating indefinitely. For PVCs, common culprits include:
- Pods still referencing the PVC
- Nodes being unresponsive (Pod can’t be torn down)
- CSI driver failing to detach/unmount volumes
- Stale VolumeAttachment objects lingering
To debug:
# Find referencing Pods
kubectl get pods --all-namespaces -o json | jq -r '
.items[] |
select(.spec.volumes[]?.persistentVolumeClaim.claimName=="my-claim") |
"\(.metadata.namespace)/\(.metadata.name)"'
# Check VolumeAttachments
kubectl get volumeattachments
# Describe PVC for recent events
kubectl describe pvc my-claim
vSphere CSI: Finalizers and Cleanup Flow
I’m going to use the vSphere CSI driver as my real-world example for looking at a CSI driver in Kubernetes, as it’s the one I’ve spent the most time troubleshooting.
The vSphere CSI driver uses the external-attacher/csi-vsphere-vmware-com finalizer on PersistentVolume (PV) objects. This finalizer ensures that the CSI external-attacher completes the necessary cleanup before the PV is deleted.
If this finalizer remains on a PV, it can prevent the PV from being fully deleted, especially if the corresponding VolumeAttachment object still exists. In such cases, manual intervention may be required to remove the finalizer and delete the PV.
For example, in Issue #266, a user encountered a situation where a PV couldn’t be deleted due to the lingering finalizer. The recommended workaround involved manually detaching the disk, removing the finalizer from the VolumeAttachment and PV, and then deleting the PV.
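As a rough sketch, that workaround boils down to the commands below. The object names are placeholders, and you should only do this after confirming (for example in vCenter) that the backing disk is genuinely no longer attached to any VM; otherwise you risk exactly the data-loss scenario the finalizer exists to prevent.

# Placeholders: csi-0123abcd = stale VolumeAttachment, pvc-1234-abcd = the PV name
# 1. Clear the finalizer on the stale VolumeAttachment, then delete it
kubectl patch volumeattachment csi-0123abcd --type=merge -p '{"metadata":{"finalizers":null}}'
kubectl delete volumeattachment csi-0123abcd

# 2. Clear the finalizer on the PV, then delete the PV
kubectl patch pv pvc-1234-abcd --type=merge -p '{"metadata":{"finalizers":null}}'
kubectl delete pv pvc-1234-abcd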
Example: vSphere CSI Log on Failed Volume Unmap
E0324 04:21:58.987894 nestedpendingoperations.go:301]
Operation for "{volumeName:kubernetes.io/csi/csi.vsphere.vmware.com^pvc-1234...}" failed.
Error: "UnmapVolume.UnmapBlockVolume failed: blkUtil.DetachFileDevice failed."
This log line shows a failed volume unmap operation, one reason PVC deletion might hang. These issues are common with block-mode volumes and can often be resolved by forcing detach or recycling the node.
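If you suspect a stale attachment, you can narrow the VolumeAttachment list down to the affected PV before deciding whether to force a detach. The PV name below is a placeholder:

# List any VolumeAttachment still pointing at a given PV, with its node and attach status
kubectl get volumeattachments -o json | jq -r '
  .items[] |
  select(.spec.source.persistentVolumeName=="pvc-1234-abcd") |
  "\(.metadata.name)  node=\(.spec.nodeName)  attached=\(.status.attached)"'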
Some Tips and Ideas
- Never remove finalizers blindly; they exist for a reason. Manual removal is only valid after confirming that the cleanup they guard has already happened or is no longer needed.
- Make sure Pods terminate cleanly (sensible termination grace periods, preStop hooks, and working health probes) so volumes are unmounted and PVCs can detach properly.
- Monitor VolumeAttachments with alerts if they remain after PVCs are deleted.
- Build automation to identify stuck resources using kubectl get all -o json piped into custom jq scripts, as in the sketch below.
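A minimal sketch of that kind of check, flagging anything that has a deletionTimestamp but is still hanging around (note that kubectl get all only covers a subset of resource kinds, so extend the list to whatever you care about):

# Flag namespaced objects that are stuck in deletion, with their remaining finalizers
kubectl get all,pvc --all-namespaces -o json | jq -r '
  .items[] |
  select(.metadata.deletionTimestamp != null) |
  "\(.metadata.namespace)/\(.kind)/\(.metadata.name) deleting since \(.metadata.deletionTimestamp), finalizers: \(.metadata.finalizers // [])"'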
Conclusion
Finalizers play a critical role in Kubernetes’ safety and consistency guarantees. They ensure cleanup happens before resource deletion, but when mismanaged, or if a controller crashes, they can leave resources like PVCs hanging indefinitely.
By understanding the internals of how finalizers interact with deletion, controllers, and etcd, you gain the power to confidently debug and resolve these issues in complex environments. And with CSI drivers like vSphere, knowing the exact role and behavior of both PVC finalizers and custom CRD finalizers is key to long-term platform resilience.
Regards