As a vSphere administrator, you’ve built your career on understanding infrastructure at a granular level: datastores, DRS clusters, vSwitches, and HA configurations. You’re used to managing VMs at scale. Now you’re hearing about KubeVirt, and while it promises Kubernetes-native VM orchestration, it comes with a caveat: Kubernetes fluency is required. This post is designed to bridge that gap, not only explaining what KubeVirt is, but mapping its architecture, operations, and concepts directly to vSphere terminology and experience. By the end, you’ll have a mental model of KubeVirt that relates to your existing knowledge.
What is KubeVirt?
KubeVirt is a Kubernetes extension that allows you to run traditional virtual machines inside a Kubernetes cluster using the same orchestration primitives you use for containers. Under the hood, it leverages KVM (Kernel-based Virtual Machine) and QEMU to run the VMs (more on that further down).
Kubernetes doesn’t replace the hypervisor; it orchestrates it. Think of Kubernetes as the vCenter equivalent here: managing the control plane, networking, scheduling, and storage interfaces for the VMs, with KubeVirt as a plugin that adds VM resource types to this environment.
Tip: KubeVirt is under active development; always check latest docs for feature support.
Core Building Blocks of KubeVirt, Mapped to vSphere
| KubeVirt Concept | vSphere Equivalent | Description |
|---|---|---|
| VirtualMachine (CRD) | VM Object in vCenter | The declarative spec for a VM in YAML. It defines the template, lifecycle behaviour, and metadata. |
| VirtualMachineInstance (VMI) | Running VM Instance | The live instance of a VM, created and managed by Kubernetes. Comparable to a powered-on VM object. |
| virt-launcher | ESXi Host Process | A pod wrapper for the VM process. Runs QEMU in a container on the node. |
| PersistentVolumeClaim (PVC) | VMFS Datastore + VMDK | Used to back VM disks. For live migration, either ReadWriteMany PVCs or raw block-mode volumes are required, depending on the storage backend. |
| Multus + CNI | vSwitch, Port Groups, NSX | Provides networking to VMs. Multus enables multiple network interfaces. CNIs map to port groups. |
| Kubernetes Scheduler | DRS | Schedules pods (including VMIs) across nodes. Lacks fine-tuned VM-aware resource balancing unless extended. |
| Live Migration API | vMotion | Live migration of VMIs between nodes with zero downtime. Requires shared storage and certain flags. |
| Namespaces | vApp / Folder + Permissions | Isolation boundaries for VMs, including RBAC policies. |
KVM + QEMU: The Hypervisor Stack
Unlike vSphere, which uses the ESXi hypervisor with its tightly integrated kernel modules and management daemons, KubeVirt relies on a Linux-native hypervisor stack consisting of KVM (Kernel-based Virtual Machine), QEMU (Quick Emulator), and libvirt.
Here’s how each component fits into the stack:
- KVM: A kernel module that exposes virtualization extensions (Intel VT-x or AMD-V) to user space, allowing hardware-accelerated VM execution.
- QEMU: The userspace emulator and virtual machine monitor that interfaces with KVM to run guest operating systems.
- libvirt: A daemon and API abstraction layer that manages QEMU and KVM. It handles tasks such as starting, stopping, migrating, and configuring VMs.
libvirt acts as the orchestration layer on each node, similar in concept to how `hostd` or `vpxa` functions on ESXi hosts. In the KubeVirt model, libvirt is not exposed directly to the user; instead, it’s controlled by the KubeVirt components via the `virt-handler` daemon.
```
Kubernetes Node (Worker Node)
+---------------------------------------------------------------+
|                      KubeVirt Components                      |
|                                                               |
|  +------------------+        +------------------------------+ |
|  |   virt-handler   | <----> |       libvirtd daemon        | |
|  +------------------+        +------------------------------+ |
|                                          |                    |
|                                          v                    |
|                              +------------------+             |
|                              |   QEMU Process   |             |
|                              | (inside pod via  |             |
|                              |  virt-launcher)  |             |
|                              +------------------+             |
|                                          |                    |
|                                          v                    |
|                          Hardware Virtualization              |
|                                (via KVM)                      |
+---------------------------------------------------------------+
```
Each VM is run inside a special pod called `virt-launcher`, which spins up a QEMU process. This QEMU process communicates with libvirt locally on the node. The `virt-handler` is the component that talks to Kubernetes and orchestrates the lifecycle of VMIs by issuing commands to libvirt.
Here’s what happens when you launch a VM in KubeVirt:
- You create a `VirtualMachine` resource in Kubernetes.
- The `virt-controller` detects this and schedules a `virt-launcher` pod to a node.
- Once on the node, `virt-handler` takes over and communicates with `libvirt` to start a QEMU process inside the `virt-launcher` pod.
- KVM provides the necessary hardware acceleration to the QEMU process.
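If you want to watch this flow on a live cluster, the commands below give a rough view of the moving parts. This is a sketch with assumptions: it assumes KubeVirt is installed in the conventional `kubevirt` namespace, and the `kubevirt.io=virt-launcher` label is the one KubeVirt typically applies to launcher pods (adjust for your version).

```bash
# KubeVirt control-plane and per-node components (virt-api, virt-controller, virt-handler)
kubectl get pods -n kubevirt

# Running VM instances and the virt-launcher pods that wrap them
kubectl get vmi
kubectl get pods -l kubevirt.io=virt-launcher

# On a worker node: confirm the KVM device and kernel modules are present
ls -l /dev/kvm
lsmod | grep kvm
```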
libvirt is also responsible for features such as live migration, snapshotting, and CPU pinning—though these features must be surfaced via KubeVirt’s API to be usable in a Kubernetes-native way. It acts as the “hypervisor operations backend” for KubeVirt, much like how ESXi performs operations at the host level on behalf of vCenter.
As a vSphere admin, you can think of libvirt as the underlying hypervisor interface that KubeVirt wraps and controls, abstracting away direct hypervisor interaction in favour of Kubernetes-native APIs and workflows.
Storage: PVCs, Block Mode, and Live Migration
In a VMware vSphere environment, storage is managed through abstractions like datastores (VMFS, NFS, vSAN), and virtual disks are provisioned as `.vmdk` files. You assign these to VMs and manage availability, performance, and policies via Storage Policy-Based Management (SPBM). You also expect storage to be available to all hosts in a cluster to support features like vMotion and HA.
KubeVirt’s approach to VM storage is fundamentally different. Rather than mounting datastores at the hypervisor level, KubeVirt leverages Kubernetes’ Persistent Volume (PV) and Persistent Volume Claim (PVC) model. These are abstractions that describe how storage is provisioned and consumed by workloads, including VMs. The underlying volume can be block or file-based, provisioned statically or dynamically, and backed by any Kubernetes-compatible CSI (Container Storage Interface) driver.
A Kubernetes CSI (Container Storage Interface) driver acts as the translation layer between Kubernetes and the underlying storage system. It provides the logic for Kubernetes to dynamically provision volumes, attach or detach storage to pods (including VMs via KubeVirt), and handle volume lifecycle operations such as resizing and snapshotting.
In vSphere terms, a CSI driver is similar to how VMware’s Storage APIs (VAAI, VASA) interact with datastores, storage arrays, and vSphere’s Storage Policy Based Management (SPBM). Just as vSphere relies on storage providers (e.g., VASA providers for vSAN, Pure, or NetApp) to orchestrate volume operations behind the scenes, Kubernetes relies on CSI drivers to standardize storage operations across different vendors. Each CSI driver exposes features like dynamic provisioning, volume expansion, and snapshotting based on what the underlying storage platform supports. When deploying KubeVirt, the choice and configuration of the CSI driver is critical to VM performance, live migration capability, and data protection strategies.
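To make the CSI relationship concrete, here is a minimal StorageClass sketch. The provisioner name is illustrative (the Ceph RBD CSI driver in this case); a real deployment would also need driver-specific parameters such as pool and cluster identifiers, which vary per vendor.

```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: fast-vm-storage              # Referenced by PVCs that back VM disks
provisioner: rbd.csi.ceph.com        # Illustrative CSI driver; yours will differ
parameters:
  # Driver-specific settings (pool, clusterID, thin provisioning, etc.) go here
  csi.storage.k8s.io/fstype: ext4
allowVolumeExpansion: true            # Needed to grow a VM disk later, like extending a VMDK
reclaimPolicy: Delete
volumeBindingMode: WaitForFirstConsumer
```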
Mapping Concepts
| KubeVirt / Kubernetes | vSphere | Description |
|---|---|---|
| PersistentVolume (PV) | Datastore (VMFS/NFS) | The underlying physical or network-attached volume available to the cluster. |
| PersistentVolumeClaim (PVC) | VMDK File | The storage unit “claimed” by a VM. Acts like a virtual disk or VMDK mapped to the VM. |
| StorageClass | Storage Policy | Defines how volumes are provisioned (e.g., thin/thick, replicated), like SPBM in vSphere. Also defines the CSI driver to use, which is configured for the underlying physical storage. |
| volumeMode: Filesystem | VMFS, NFS | Mounts a formatted filesystem into the VM pod. Only usable if ReadWriteOnce (RWO) or ReadWriteMany (RWX) access is possible. |
| volumeMode: Block | Raw Device Mapping (RDM) | Presents a raw block device to the VM, similar to RDMs or raw disks in VMware. |
Note: In Kubernetes storage terminology, ReadWriteOnce (RWO) means the volume can be mounted as read-write by only a single node at a time. This is typical for most block storage backends. ReadWriteMany (RWX) means the volume can be mounted as read-write by multiple nodes simultaneously, which is necessary for scenarios like live migration in KubeVirt where both the source and destination nodes need concurrent access to the VM’s disk.
How VMs Access Storage in KubeVirt
When you define a `VirtualMachine` in KubeVirt, you attach one or more `volumes` backed by PVCs. These PVCs are bound to PVs that represent actual disks in your storage backend: this could be iSCSI, Ceph RBD, NFS, Fibre Channel, Portworx, or any other CSI-compatible storage system.
Volumes are attached inside the `virt-launcher` pod, and then passed through to the QEMU process as virtual disks (virtio or SATA, depending on config). These behave similarly to adding a VMDK to a VM’s SCSI or SATA controller in vSphere.
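As a minimal illustration of that disk/volume pairing (names here are placeholders; the fully annotated VM example later in this post shows the complete object):

```yaml
# Fragment of a VirtualMachine spec: a virtio disk backed by an existing PVC
spec:
  template:
    spec:
      domain:
        devices:
          disks:
            - name: rootdisk
              disk:
                bus: virtio          # Paravirtual disk bus, akin to PVSCSI in vSphere
      volumes:
        - name: rootdisk             # Must match the disk name above
          persistentVolumeClaim:
            claimName: my-vm-disk    # Placeholder PVC name
```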
Live Migration: What Changes
In vSphere, vMotion (the ability to migrate a running VM between hosts in a cluster) is possible because ESXi hosts in a cluster have access to the same shared storage and common networking. The VMDK does not need to be moved; only the memory and device state of the VM is transferred across hosts. (There are some finer nuances to this statement, such as in the case of shared-nothing vMotion.) The same concept applies in KubeVirt, but with more explicit configuration.
Live migration in KubeVirt requires:
- The VM’s disks must be on storage accessible to both source and destination nodes.
- The PVCs must support either:
  - ReadWriteMany (RWX) access mode – typical for CephFS, NFS, or Portworx shared volumes.
  - Block-mode volumes (`volumeMode: Block`) – typically backed by Ceph RBD or raw iSCSI volumes.

If your PVC is using the default filesystem mode and only supports ReadWriteOnce (RWO), live migration will fail, because the destination node cannot mount the volume while it is in use.
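For example, a PVC intended for a live-migratable VM might request RWX access. This is a sketch with illustrative names; it assumes your StorageClass and backend actually support RWX.

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: migratable-vm-disk
spec:
  storageClassName: shared-rwx       # Illustrative class backed by NFS/CephFS/etc.
  accessModes:
    - ReadWriteMany                   # Source and destination nodes can mount concurrently
  volumeMode: Filesystem              # Or Block, for distributed block backends like Ceph RBD
  resources:
    requests:
      storage: 40Gi
```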
Common Storage Topologies for KubeVirt
Just as vSphere environments are typically designed around shared VMFS datastores, NFS mounts, or vSAN clusters, KubeVirt requires careful planning around how storage is presented to the Kubernetes cluster and its nodes. Below are several common storage topologies used with KubeVirt, with pros, cons, and practical considerations.
1. Shared Filesystem (RWX + Filesystem) – e.g. NFS, CephFS

This topology is equivalent to mounting an NFS datastore on all ESXi hosts. A shared POSIX-compliant filesystem (a filesystem that follows standard UNIX file operation rules for permissions, locking, and file handling, ensuring compatibility across systems) is mounted and accessed concurrently by multiple nodes.

The volume is presented to Kubernetes with `ReadWriteMany` (RWX) mode, allowing VM live migration and multiple pod access.
- Use cases: Simplified backup (file-based), live migration support, multi-node access.
- Pros: Easy to configure, supports snapshots, compatible with most storage backends.
- Cons: Latency-sensitive applications may suffer. Requires tuning to avoid locking issues.
2. Distributed Block Storage (RWX / Block) – e.g. Ceph, Portworx

Here, the storage backend provides distributed block storage accessible from multiple nodes. The underlying block device may be replicated across the cluster, and the CSI driver handles the synchronization. This supports both RWX and Block volume modes depending on the CSI features.
- Use cases: Stateful VMs with HA, live migration with block devices, fault tolerance.
- Pros: High performance, supports filesystem or raw block, survives node failure.
- Cons: Requires CSI that supports RWX or replication, more complex configuration.
3. Block Storage (RWO + Block Mode) – e.g. Ceph RBD, iSCSI
This topology is akin to using raw device mappings (RDM) in vSphere. The volume is presented to a single node at a time, as a raw block device, using `volumeMode: Block`. Live migration is still possible because the volume is not mounted with a filesystem and can be passed between nodes if the storage supports multi-attachment.
- Use cases: High-throughput workloads, database VMs, VMs requiring live migration without RWX.
- Pros: Low overhead, no filesystem management, good for IOPS-intensive workloads.
- Cons: Only one node can access at a time. Migration depends on backend compatibility.
4. Local Storage (RWO + Filesystem) – e.g. HostPath, LVM, local disks
Each node uses its own local disk or partition, and volumes are tied to the node where they are created. This is like placing a VM on a local-disk-only ESXi host: fast, but not portable.
- Use cases: Dev/test clusters, workloads that don’t require migration or HA.
- Pros: Simple to deploy, good performance, no external dependencies.
- Cons: No live migration, no HA, data loss if node fails.
Manual Migration of a VM Using Local Storage
If a VM in KubeVirt is using local storage (RWO + Filesystem), live migration is not possible because the data is tied to the specific node. To migrate the VM to another node, you must perform a manual process with controlled downtime:
- Shutdown the VM: Gracefully power off the VirtualMachineInstance (VMI) to ensure disk consistency.
  `virtctl stop vm-name`
- Backup or copy the VM disk: Manually copy the PVC data from the source node to a destination that is accessible by the target node. Options include:
  - Use `kubectl cp` if HostPath is accessible.
  - Use `rsync` or `scp` directly between nodes for raw disk files (e.g., qcow2 or raw images).
- Recreate the PVC on the target node: Create a new PVC pointing to the new local path on the destination node. Make sure the `nodeAffinity` setting binds it to the correct node.
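A rough sketch of that last step, a local PersistentVolume pinned to the destination node via `nodeAffinity`. Node name, path, and StorageClass are placeholders for your environment.

```yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: webserver-disk-node02
spec:
  capacity:
    storage: 40Gi
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  storageClassName: local-storage          # Placeholder local StorageClass
  local:
    path: /var/lib/vm-disks/webserver      # Where the copied disk image now lives
  nodeAffinity:                            # Pins the volume (and therefore the VM) to this node
    required:
      nodeSelectorTerms:
        - matchExpressions:
            - key: kubernetes.io/hostname
              operator: In
              values:
                - worker-02
```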
5. Object Storage for Templates/Import – e.g. S3, MinIO
While not used for VM root disks directly, object storage is often used in conjunction with CDI (Containerized Data Importer) to import VM images, snapshots, or ISOs. Think of it as your vSphere Content Library or image datastore.
- Use cases: Template management, ISO boot media, VM cloning.
- Pros: Decouples storage from compute, easily scalable, supports CI/CD workflows.
- Cons: Requires CDI integration, slower for hot data access.
Storage Design Tip
In vSphere, storage planning focuses on performance tiers and redundancy. In KubeVirt, the same is true, but you also need to consider how storage interacts with Kubernetes primitives like volume claims, access modes, and CSI driver capabilities. Not all backends support `ReadWriteMany`, and not all support `volumeMode: Block`. Match your topology to your operational goals, especially if you plan to support live migration or automatic rescheduling.
Other Considerations and Notes
- StorageClass defines provisioning behaviour. Thin provisioning, volume expansion, and IOPS caps are controlled here, similar to SPBM policies in vSphere.
- CDI (Containerized Data Importer) is often used to import VM disk images (e.g., QCOW2, ISO) into PVCs. This is your equivalent of cloning a template or deploying from an OVA (see the sketch after this list).
- Snapshots are available if your CSI driver supports them, enabling backups or templated deployments.
- Hot-add and hot-remove of disks is supported for VMIs through Kubernetes patch operations, though not all combinations of backends support it cleanly.
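As a hedged example of the CDI workflow referenced above, a DataVolume can pull a cloud image over HTTP into a PVC that a VM then boots from. The image URL and sizes are placeholders.

```yaml
apiVersion: cdi.kubevirt.io/v1beta1
kind: DataVolume
metadata:
  name: ubuntu-template-disk
spec:
  source:
    http:
      url: https://example.com/images/ubuntu-22.04.qcow2   # Placeholder image location
  pvc:
    accessModes:
      - ReadWriteOnce
    resources:
      requests:
        storage: 20Gi          # Target PVC size; must be at least the virtual size of the image
```

Conceptually, this is the closest thing to deploying from an OVA or cloning a content-library template: CDI downloads and converts the image, and the resulting PVC becomes the VM's root disk.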
Storage Summary
In vSphere, storage is abstracted and well-integrated into the virtual infrastructure stack, with a rich UI and robust automation. In KubeVirt, storage management is explicit and decoupled via Kubernetes. You gain flexibility, portability, and integration into GitOps workflows, but must also take care in selecting the right `accessModes` and `volumeMode` for your use case.
If live migration, high availability, or dynamic scaling are requirements, choosing the correct backend and configuring it properly in your StorageClasses and PVCs is essential. Once understood, KubeVirt’s storage model offers similar parallels to vSphere, but exposes more of the underlying plumbing, something platform engineers will appreciate.
Networking: Multus, Bridge Mode, and Pod Network
In vSphere, networking revolves around virtual switches (vSwitches), port groups, VLAN-backed segments, and distributed switches managed via vCenter and optionally NSX. Each VM is assigned one or more virtual NICs (vNICs), which are then connected to port groups that map to physical or overlay networks. KubeVirt has a similar model, but it builds on top of Kubernetes networking primitives, with extra flexibility (and complexity) through the use of plugins like Multus.
VMware vs KubeVirt Networking: Feature Comparison
| Networking Feature | VMware vSphere | KubeVirt | Notes |
|---|---|---|---|
| Basic virtual switch | vSS (Standard Switch) | Pod Network via default CNI (e.g., OVN, Calico) | Default pod network connects VMs similarly to a basic vSwitch |
| Distributed virtual switch | vDS (Distributed Switch) | Multus + bridge/macvlan CNI on each node | Multus mimics vDS by attaching VMs to consistently named networks across nodes |
| Port group (access VLAN) | Named port group with VLAN ID | NetworkAttachmentDefinition + CNI config (e.g., macvlan with VLAN) | VLAN ID and interface behaviour defined in YAML as part of CNI config |
| Port group (trunked) | Port group with VLAN trunk range | SR-IOV or DPDK-enabled interfaces with VF VLAN trunking | Depends on SR-IOV NIC driver and host config (e.g., VF passthrough) |
| vNIC assignment | GUI or automation (e.g., add vNIC in VM settings) | YAML definition using interfaces + networks in VM spec | Multiple interfaces defined declaratively; no GUI unless using a dashboard |
| MAC address allocation | Auto-assigned or static via vNIC settings | Static or generated in YAML via macAddress field | Admin must ensure uniqueness if using static assignment |
| VLAN tagging | Configured at port group level or within guest | Defined per network interface in CNI plugin config | macvlan/bridge plugins support VLAN via JSON config fields |
| NIC passthrough | SR-IOV or DirectPath I/O | SR-IOV with VF assignment to VMs via Multus | Requires SR-IOV CNI plugin and pre-configured VFs on hosts |
| Network segmentation (L2/L3) | VLANs, VXLANs (via NSX) | Multiple CNIs, flat or overlay networks, kube-router, or OVN | Depends on CNI plugin capabilities; Cilium supports L3 through L7 network policies |
| Security policies | NSX Distributed Firewall | Kubernetes Network Policies | Enforced by CNI plugin; not all CNIs support ingress/egress policies. Cilium offers the closest NSX-like capabilities for KubeVirt |
| Load balancer access | NSX LB, HAProxy, F5 integration | Kubernetes Services of type LoadBalancer or Ingress controller | External LB integration depends on cloud provider or MetalLB. Cilium offers its own ingress controller and LB functions |
| DNS and DHCP | Provided via guest tools or PXE boot infra | KubeVirt provides basic DHCP; external DHCP required for L2 bridge/SR-IOV | Static IPs can also be configured via cloud-init or the guest agent. DNS is typically handled by the Kubernetes cluster's in-cluster provider before forwarding upstream to your infrastructure/DC DNS servers |
Default Pod Network (eth0)
By default, every pod in Kubernetes, including KubeVirt VMs, receives a network interface connected to the cluster’s default CNI (Container Network Interface). This is typically Calico, Cilium, Flannel, or OVN-Kubernetes. This interface, usually named `eth0`, connects the VM to the same network namespace as Kubernetes pods.
- Use case: VMs that need to communicate with services inside the Kubernetes cluster (e.g., microservices, DNS, ingress). By connecting to the Kubernetes CNI, you can typically consume all the features of the CNI for both containers and VMs.
- vSphere comparison: Like connecting a VM to a standard vSwitch with no VLAN tagging; it’s purely for internal traffic unless explicitly routed out.
Multus: Secondary Networks
Multus is a Kubernetes CNI meta-plugin that enables a pod (and by extension a VM) to attach to multiple networks. This is how you attach a VM to external or isolated networks beyond the default pod network. Each additional network is defined via a `NetworkAttachmentDefinition`, referencing a specific CNI plugin (e.g., macvlan, SR-IOV, bridge, or host-device).
- Use case: Connecting VMs to external L2/L3 networks, VLANs, or legacy infrastructure.
- vSphere comparison: Equivalent to assigning a VM multiple vNICs on different port groups across vSwitches or NSX segments.
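A hedged example of a NetworkAttachmentDefinition using the macvlan CNI plugin. The master interface (here a VLAN 50 sub-interface on the node), the NAD name, and the choice of DHCP IPAM (which requires the CNI DHCP daemon on the nodes) are all environment-specific assumptions.

```yaml
apiVersion: k8s.cni.cncf.io/v1
kind: NetworkAttachmentDefinition
metadata:
  name: vlan50-net                 # Referenced by name from the VM's networks section
spec:
  config: |
    {
      "cniVersion": "0.3.1",
      "type": "macvlan",
      "master": "eth1.50",
      "mode": "bridge",
      "ipam": { "type": "dhcp" }
    }
```

Think of the NetworkAttachmentDefinition as the YAML equivalent of creating a new port group: once it exists, any VM in that namespace can reference it by name.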
Bridge Binding
Bridge mode allows VMs to be connected directly to the host’s network interface using Linux bridge devices. This is often used in bare-metal clusters where the physical NIC is bridged into the VM, offering L2 adjacency to physical networks.
- Pros: Simple, no overlay. VMs appear as native devices on the physical network.
- Cons: No namespace isolation. Requires careful IPAM configuration.
- vSphere comparison: Like assigning a VM to a standard port group on a vSwitch backed by a physical NIC.
SR-IOV: High Performance Networking
For low-latency or high-throughput workloads, KubeVirt supports SR-IOV (Single Root I/O Virtualization), where physical NICs are partitioned into Virtual Functions (VFs) and assigned directly to VMs. This bypasses the kernel and virtual switch entirely, achieving near-native performance.
- Use case: NFV, telco workloads, latency-sensitive apps.
- Requirements: NICs with SR-IOV support, kernel drivers, and appropriate configuration on the host and Kubernetes.
- vSphere comparison: Similar to using DirectPath I/O passthrough or VMXNET3 with SR-IOV enabled in ESXi.
Masquerade, Bridge, and Slirp Modes (Interface Bindings)
When defining interfaces for a VM, you choose a `binding` method that controls how the VM interface connects to the underlying pod network.
| Binding Method | Description | vSphere Analogy |
|---|---|---|
| Masquerade | VM gets a private IP, NAT’ed behind the pod’s IP. Useful for typical pod-like behaviour. | NAT-backed network via edge firewall or NSX-T Tier-1 gateway |
| Bridge | VM gets an IP on the pod’s network. Requires L2 access; works well in bare-metal. | Standard vSwitch access port group |
| Slirp (User-mode) | QEMU’s user-mode networking. Limited, useful only in developer scenarios. | Like a VM in Workstation with NAT’d local-only networking |
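In the VM spec, the binding sits on each interface and is paired by name with a network source. A minimal sketch (network names are placeholders; `vlan50-net` refers to the NetworkAttachmentDefinition example earlier):

```yaml
# Fragment of a VM spec showing interface bindings
spec:
  template:
    spec:
      domain:
        devices:
          interfaces:
            - name: default
              masquerade: {}            # NAT behind the pod IP, the common default
            - name: datacenter-vlan
              bridge: {}                # L2 adjacency via a Multus-defined secondary network
      networks:
        - name: default
          pod: {}
        - name: datacenter-vlan
          multus:
            networkName: vlan50-net     # NetworkAttachmentDefinition created separately
```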
Advanced Networking Considerations
- IP Address Management (IPAM): VMs connected to the default pod network typically receive an IP automatically via Kubernetes’ internal networking model, with NAT (masquerade) applied by default. Direct static IP assignment on the default network is not straightforward without CNI customization. For secondary networks defined via Multus, IP allocation is handled by the specific CNI plugin (e.g., DHCP, static IPAM, or manual addressing), providing greater control for static or routed network designs.
- Firewalling: VMs are subject to Kubernetes Network Policies, if enforced by the CNI plugin. This maps loosely to NSX security groups or DFW rules.
- Load Balancing: VMs can be exposed via Kubernetes Services (ClusterIP, NodePort, LoadBalancer) or external ingress controllers.
Firewalling and Microsegmentation
In a vSphere environment, network security is often enforced using NSX Distributed Firewall (DFW) rules or security groups. These allow administrators to define microsegmentation policies between workloads, often based on VM tags, port groups, or other metadata.
In KubeVirt, VMs are subject to Kubernetes Network Policies. These policies control ingress and egress traffic to and from pods—including VMs—based on IP blocks, namespaces, and labels. However, not all CNI plugins enforce these policies at the dataplane level. Some CNIs treat policies as advisory unless explicitly configured with enforcement engines.
Cilium, the only CNCF-graduated CNI with eBPF-based dataplane enforcement, provides a much richer model. It supports L3-L7 network policy enforcement, including fully stateful firewalling, DNS-aware rules, and visibility tooling. For NSX users accustomed to detailed DFW rule sets and East-West traffic control, Cilium comes closest to replicating those capabilities in a Kubernetes-native way—with the added benefit of being programmable via GitOps and integrated with observability pipelines.
Key considerations:
- If using a basic CNI (e.g., Flannel), network policies may not be enforced at all.
- OVN-Kubernetes supports basic policy enforcement, closer to traditional iptables-based firewalls.
- Cilium provides the most NSX-like experience: distributed, stateful, programmable security with full auditability.
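To illustrate what a basic policy looks like, here is a sketch of a standard Kubernetes NetworkPolicy scoped to VMs. It assumes the VM template labels (such as the `kubevirt.io/domain` label used in the example later in this post) are propagated to the virt-launcher pods, which is how KubeVirt exposes VMs to label selectors; the names and port are placeholders.

```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-web-to-db-vm
  namespace: vms                          # Placeholder namespace holding the VMs
spec:
  podSelector:
    matchLabels:
      kubevirt.io/domain: database-vm     # Matches the database VM's virt-launcher pod
  policyTypes:
    - Ingress
  ingress:
    - from:
        - podSelector:
            matchLabels:
              kubevirt.io/domain: webserver-vm
      ports:
        - protocol: TCP
          port: 5432                      # Only the web VM may reach the database port
```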
Networking Summary
In vSphere, virtual networking is mostly abstracted behind GUI-defined port groups and uplinks. In KubeVirt, you must build those abstractions explicitly using Kubernetes CRDs and CNI plugins. While the learning curve is steeper, the flexibility is significant—you can build complex, programmable networking topologies using just YAML and open-source plugins. Once mastered, this model gives you infrastructure-as-code control over VM connectivity and security in ways traditional hypervisor stacks don’t expose natively.
VM Lifecycle: Creating, Starting, and Operating
In vSphere environments, VM lifecycle management revolves around intuitive GUIs and workflows through vCenter Server. Creating VMs, modifying hardware, and managing templates are standardized and straightforward.
KubeVirt, operating within Kubernetes, provides similar capabilities but exposes them through YAML manifests, Kubernetes APIs, and command-line tools like `kubectl` and `virtctl`. While the fundamental tasks remain the same, the operational experience changes significantly, demanding a more declarative, automation-driven mindset.
Common VM Lifecycle Operations: vSphere vs KubeVirt
| Task | VMware vSphere | KubeVirt | Notes |
|---|---|---|---|
| Create a new VM | Use vSphere Web Client: New VM Wizard (select compute, storage, network, hardware) | Write a YAML manifest for a VirtualMachine resource and apply via `kubectl` | VM spec defines CPU, memory, disks, NICs, cloud-init configuration |
| Clone a VM from template | Clone from template wizard | Use CDI (Containerized Data Importer) to clone PVCs; create VM referencing cloned volume | VM templates can be GitOps-managed (YAML stored in Git repo) |
| Power on/off a VM | Right-click VM > Power operations | Use `virtctl start vm-name` or `virtctl stop vm-name` | Power state is controlled via Kubernetes custom resources and CLI |
| Resize (extend) a VM disk | Edit VM settings > Increase VMDK size > Rescan inside OS | Expand the backing PVC if supported by StorageClass; patch VM spec if needed | StorageClass must support volume expansion; filesystem resize is manual inside guest |
| Create a new network | Create port group on vSwitch or vDS; assign VLAN or VXLAN IDs | Define a NetworkAttachmentDefinition (Multus) referencing CNI plugin and configuration | Network definitions are YAML objects stored in Kubernetes. If you are using a CNI that supports multiple IPAM pools or networks, there may be other specific steps needed |
| Add a NIC to a VM | Edit VM settings > Add Network Adapter > Connect to port group | Patch the VirtualMachine object to add a new interface and network | NICs are described declaratively; hot-plug requires virt-launcher pod support or a restart |
| Deploy VMs on new shared storage | Add new datastore > Present to hosts > Storage vMotion VMs if needed | Create new StorageClass in Kubernetes > Create PVCs from the StorageClass | VMs reference PVCs dynamically provisioned from new StorageClasses |
| Snapshot a VM | Take VM snapshot (memory + disk or disk only) | Create Kubernetes VolumeSnapshot for PVC (if CSI snapshotting is supported) | Snapshots are storage-native; full VM snapshots (memory + CPU state) are more complex |
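For the snapshot row above, a hedged sketch of a CSI VolumeSnapshot is shown below; it assumes your CSI driver provides a VolumeSnapshotClass (the class and PVC names are placeholders).

```yaml
apiVersion: snapshot.storage.k8s.io/v1
kind: VolumeSnapshot
metadata:
  name: webserver-pvc-snap-01
spec:
  volumeSnapshotClassName: csi-snapclass     # Provided by your CSI driver
  source:
    persistentVolumeClaimName: webserver-pvc # The PVC backing the VM's disk
```

Note this captures the disk only, comparable to a disk-only snapshot in vSphere; memory and CPU state are not included.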
In this video, Michael Leavan shows how you would upload a Guest OS ISO into the Containerized Data Importer, so that you can build a virtual machine from it.
I’ve included this because it’s one of the more “complex” tasks to get your head around in terms of how it works; the steps themselves aren’t very complex, but it’s a bit different to vSphere (or maybe you’ll find it similar to using `ovftool` to upload into a datastore?).
How VM Definitions Work in KubeVirt
In KubeVirt, a VM is not a “binary” file like a `.vmx` config in VMware. Instead, it is a declarative YAML object stored in the Kubernetes etcd database, describing the desired state of the VM: CPU, memory, disks, networking, cloud-init setup, and more. These definitions can be version-controlled, deployed via GitOps, and updated dynamically as needed.
Below is an annotated YAML example to help vSphere administrators understand the equivalent concepts:
```yaml
apiVersion: kubevirt.io/v1
kind: VirtualMachine
metadata:
  name: webserver-vm
spec:
  running: true                     # (Equivalent to setting VM to "powered on" after creation)
  template:
    metadata:
      labels:
        kubevirt.io/domain: webserver-vm
    spec:
      domain:
        cpu:
          cores: 2                  # (VM will have 2 vCPUs, just like setting in vSphere)
        devices:
          disks:
            - name: rootdisk
              disk:
                bus: virtio         # (Virtio disk controller, similar to choosing SCSI/Paravirtual in VMware)
          interfaces:
            - name: default
              bridge: {}            # (Attach to default pod network, similar to basic vSwitch access)
            - name: secondary-net
              bridge: {}            # (Attach to an external network via Multus, like adding a second vNIC to another VLAN/portgroup)
        resources:
          requests:
            memory: 4Gi             # (4GB RAM assigned to the VM, similar to vSphere VM memory configuration)
      networks:
        - name: default
          pod: {}                   # (Connects to the cluster’s default CNI network, auto-assigned IPs)
        - name: secondary-net
          multus:
            networkName: multus-external-network   # (Secondary network from Multus, like another portgroup)
      volumes:
        - name: rootdisk
          persistentVolumeClaim:
            claimName: webserver-pvc               # (Root disk attached via PVC, similar to adding a VMDK from a datastore)
```
Key points for vSphere admins:
- CPU, memory, disk, and NICs are all explicitly defined in YAML, similar to VM hardware settings in the vSphere Web Client.
- Networking is declarative: interfaces and network sources must both be defined and matched by name.
- Each network interface maps to a different network type (default pod network vs Multus-defined networks).
- Storage is attached by referencing a PersistentVolumeClaim (PVC), which must be created beforehand or dynamically provisioned via StorageClass.
- Power state is controlled via the `running: true` or `running: false` setting.
Example: Adding a New Network Interface to a Running VM
Adding a new NIC in vSphere is a few clicks in the UI. In KubeVirt, you patch the VM definition. Depending on configuration, this may require restarting the VM, unless hotplug support is fully enabled.
```bash
kubectl patch vm webserver-vm --type='json' -p='[
  {
    "op": "add",
    "path": "/spec/template/spec/networks/-",
    "value": { "name": "secondary-net", "multus": { "networkName": "multus-secondary-net" } }
  },
  {
    "op": "add",
    "path": "/spec/template/spec/domain/devices/interfaces/-",
    "value": { "name": "secondary-net", "bridge": {} }
  }
]'
```
This patch:
- Attaches the VM to a secondary Multus-defined network
- Creates a new `bridge` device inside the VM
- Is equivalent to adding a vNIC connected to a different port group
Example: Expanding a VM Disk (PVC)
Expanding a VM disk in KubeVirt requires two steps:
- Expand the PVC size if the underlying StorageClass allows volume expansion.
- Trigger a filesystem resize inside the guest OS manually.
```bash
kubectl patch pvc webserver-pvc --patch '
spec:
  resources:
    requests:
      storage: 40Gi
'
```
After resizing the PVC, you need to rescan and expand the partition/filesystem inside the guest OS, exactly as you would inside a Linux or Windows VM after expanding a VMDK in vSphere.
Operational Mindset Shift
For vSphere admins, KubeVirt introduces a shift away from GUI-driven operations toward API-driven, infrastructure-as-code management. While tasks like deploying a VM or expanding storage remain conceptually similar, the tooling and workflows change significantly. Embracing GitOps, YAML-based configuration, and automation pipelines becomes crucial for efficient day-2 operations at scale.
Tooling and Interacting with KubeVirt
In vSphere, day-to-day operations are performed through the vCenter UI, vSphere Client, or CLI tools like PowerCLI. In KubeVirt, your primary tools are:
- kubectl: The standard Kubernetes CLI tool used to interact with cluster resources, including VirtualMachines (VM), VirtualMachineInstances (VMI), PersistentVolumeClaims (PVCs), and other CRDs.
- virtctl: A KubeVirt-specific CLI tool that extends kubectl functionality. It provides simplified commands for VM operations like starting, stopping, connecting to the console, and initiating live migrations.
Example operations:
```bash
# List all VirtualMachine (VM) objects (whether running or not)
kubectl get vm

# List all active VirtualMachineInstance (VMI) objects (only running VMs)
kubectl get vmi

# Start a VM (transition it from defined to running state)
virtctl start vm-name

# Connect to the console of a running VM (serial console access)
virtctl console vm-name

# Live migrate a running VM to another node in the cluster
virtctl migrate vm-name
```
The below video from the KubeVirt project shows a practical demo of the command line interactions for running and managing VMs with KubeVirt.
The following operations are shown in the video:
- Creating a Virtual Machine from a YAML and a cloud image
- Starting the Virtual Machine
- Connecting through the console and SSH to the Virtual Machine
- Connecting to the Virtual Machine through VNC
- Stopping and removing the Virtual Machine
Because KubeVirt objects are just Kubernetes resources, they can be managed using any Kubernetes-native tooling. This naturally leads to GitOps workflows:
- Define VMs as YAML: Create VirtualMachine manifests declaratively, specifying CPU, memory, storage, and networking.
- Store in Git repository: Version control your VM definitions alongside other cluster resources.
- Deploy via Pipelines: Use a GitOps controller (e.g., ArgoCD, Flux) to automatically apply changes to the Kubernetes cluster when a new commit is made.
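As a sketch of the “deploy via pipelines” step, an Argo CD Application pointing at a Git path full of VirtualMachine manifests might look like the following. The repository URL, path, and namespaces are placeholders, and this assumes Argo CD is installed in its usual `argocd` namespace.

```yaml
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: vm-fleet
  namespace: argocd
spec:
  project: default
  source:
    repoURL: https://git.example.com/infra/vm-definitions.git   # Placeholder repo
    targetRevision: main
    path: production/vms                # Directory of VirtualMachine YAML manifests
  destination:
    server: https://kubernetes.default.svc
    namespace: vms
  syncPolicy:
    automated:
      prune: true                       # Deleting a VM manifest in Git decommissions the VM
      selfHeal: true                    # Drift from Git is automatically corrected
```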
GitOps makes managing VMs similar to managing Kubernetes deployments, configmaps, or services — giving you full audit trails, code reviews, and rollback capability for your virtual infrastructure.
For vSphere admins, think of this as replacing manual VM deployments or vSphere templates with a fully automated, source-controlled VM lifecycle management system.
Tip: If you're using GitOps, you can automate not just VM creation, but also updates, migrations, and decommissioning through pull requests. In many ways, GitOps workflows with KubeVirt mirror what vRealize Automation (Aria Automation) provides in vSphere environments — templated VM definitions, automated deployments, version-controlled change management, and full auditability. However, instead of a graphical blueprint designer, you manage VMs declaratively via YAML stored in Git repositories, and delivery is triggered through Git workflows and CI/CD pipelines. In a GitOps model, infrastructure becomes self-healing and self-documenting by design.
kubectl vs virtctl: When to Use Each Tool
| Tool | Primary Use | Example Actions |
|---|---|---|
| `kubectl` | Manage Kubernetes resources directly, including VirtualMachines (VM) and VirtualMachineInstances (VMI). | List, create, delete, patch, and inspect VMs or VMIs. |
| `virtctl` | Perform VM-specific operational actions that are not native Kubernetes verbs. | Start, stop, console access, live migrate VMs. |
Day 2 Operations: What Changes?
Once a KubeVirt environment is up and running, Day 2 operations (the management, maintenance, monitoring, troubleshooting, and scaling of VMs) introduce significant differences compared to a traditional vSphere-based platform. While many of the fundamental concerns remain (availability, performance, backup, compliance), the tools, patterns, and operational expectations are very different.
Comparison: Day 2 Operations in vSphere vs KubeVirt
| Operational Task | VMware vSphere | KubeVirt | Notes |
|---|---|---|---|
| VM Monitoring | vCenter + vRealize Operations Manager | Prometheus metrics, Grafana dashboards, Kubernetes events | Requires Prometheus Operator or custom tooling for dashboards and alerting |
| Log Management | VM console logs, syslog aggregation (e.g., vRealize Log Insight) | Pod logs via `kubectl logs`; centralized logging with Elasticsearch, Fluentd, and Kibana (EFK) / Prometheus, Loki, and Grafana (PLG) stacks | Each VM’s virt-launcher pod generates logs; guest logs are still collected traditionally |
| Host Maintenance | Enter ESXi into Maintenance Mode; vMotion VMs off | Cordon Kubernetes nodes; live migrate VMIs off the node | Requires draining pods or explicitly migrating VMs before node shutdown |
| Backup and Restore | VM snapshots, VADP-based backups (Veeam, Commvault) | A plugin/software offering such as Velero for cluster resource backup, VolumeSnapshots for storage backup. Kasten is another fantastic offering | Separate backup of VM resource definitions and underlying PVCs recommended |
| Patching & Upgrades | vSphere Update Manager (VUM) for ESXi and vCenter upgrades | Kubernetes rolling node upgrades; KubeVirt Operator manages KubeVirt upgrades | Zero-downtime upgrades require careful workload disruption management |
| Security Management | vSphere Permissions, NSX DFW, VM Encryption | RBAC policies, Kubernetes Network Policies (or CNI-based ones such as Cilium Network Policies), PVC encryption at the storage layer | Security shifts to Kubernetes-native constructs and storage backend features |
| Capacity Planning | vROps, vCenter Resource Pools, DRS Clusters | Cluster resource requests/limits, vertical scaling with VM templates | Capacity is managed through scheduling hints and resource quotas |
Note: Yes, I used the old VMware names for the cloud management suite rather than the newer Aria Operations, Automation and Logs name.
Monitoring VMs and Infrastructure
In vSphere, you rely on vCenter and vRealize (Aria) Operations Manager for a unified monitoring and alerting view. In KubeVirt, observability requires integrating multiple systems:
- Metrics: Exposed via Prometheus from KubeVirt components (virt-controller, virt-handler, virt-launcher)
- Logs: Collected from pod logs (`virt-launcher` pods) and the systemd journal on nodes
  - Logs from inside the VM instance (Guest OS) itself also need to be considered.
- Events: Kubernetes event stream is critical for detecting VM lifecycle anomalies (e.g., migration failures)
Grafana dashboards and custom alert rules are commonly built for VMIs, tracking metrics like:
- CPU and memory usage inside VMs (guest metrics if QEMU guest agent is installed)
- Migration success/failure rates
- Node availability and resource pressure (e.g., disk space, RAM, CPU throttling)
Troubleshooting VMs
Troubleshooting in vSphere typically involves reviewing vCenter alarms, ESXi host logs, and VM event logs. In KubeVirt, troubleshooting spans multiple layers:
- Pod-Level Issues: Use `kubectl describe pod virt-launcher-xxx` and `kubectl logs` to inspect container startup failures.
- VM-Level Issues: Access VM consoles via `virtctl console vm-name` or use cloud-init logs if configured.
- Node-Level Problems: Check `virt-handler` logs, KVM device status, and node resource availability.
Typical troubleshooting flow:
```bash
# List all VirtualMachines, VirtualMachineInstances, and Pods in the namespace
kubectl get vm,vmi,pod -n {namespace}

# Describe the VirtualMachineInstance (VMI) to view detailed status, events, and conditions
kubectl describe vmi my-vm

# Retrieve logs from the virt-launcher pod, focusing on the 'compute' container where the QEMU process runs
kubectl logs virt-launcher-xxx -c compute

# Connect directly to the VM’s serial console for interactive troubleshooting inside the guest OS
virtctl console my-vm
```
Patch Management and Upgrades
While vSphere provides a GUI-driven Update Manager for patches and upgrades, in KubeVirt:
- Kubernetes nodes are patched and upgraded via automated pipelines; this is highly dependent on the Kubernetes platform and/or the method by which the Kubernetes cluster was deployed (e.g., Kured, Cluster API, RKE2, or manual drain + upgrade).
- KubeVirt itself is upgraded via the virt-operator and a rolling restart of components across the cluster.
- Live-migrating VMs off a node during upgrades requires explicitly triggering live migrations or orchestrating node drains with eviction policies tuned for VMI objects.
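A hedged sketch of that node-maintenance flow (the closest thing to entering Maintenance Mode in vSphere); the node and VM names are placeholders, and the exact eviction behaviour depends on your VMs' evictionStrategy and any PodDisruptionBudgets in place.

```bash
# Stop new workloads landing on the node
kubectl cordon worker-02

# Optionally, live migrate a specific VM off the node ahead of the drain
virtctl migrate webserver-vm

# Drain the remaining pods; virt-launcher pods are evicted according to the
# VM's evictionStrategy (LiveMigrate triggers migrations automatically)
kubectl drain worker-02 --ignore-daemonsets --delete-emptydir-data

# After patching and rebooting, return the node to service
kubectl uncordon worker-02
```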
Security and Compliance
Security responsibility shifts significantly:
- RBAC: Kubernetes RBAC controls who can create, modify, delete VMs and associated resources.
- Network Security: Enforced through Kubernetes Network Policies (enhanced by CNIs like Cilium or OVN-Kubernetes).
- Storage Encryption: At-rest encryption handled by the underlying CSI storage backend, not by KubeVirt itself.
- VM Hardening: Done via guest OS security baselines, cloud-init scripts, or immutable VM images.
Of course this is not an exhaustive list of all the security considerations at hand.
Key Operational Changes for vSphere Admins
- Declarative over Imperative: Instead of clicking through wizards, you write or patch YAML manifests, which means you interface with a CLI, an API, or a GitOps platform.
- Automation First: Infrastructure-as-code and GitOps pipelines are standard for managing VMs and infrastructure changes.
- Cluster-Aware Thinking: No central “vCenter” view; resources are API-driven and node state matters for VM scheduling and uptime.
- Toolchain Diversity: No single pane of glass by default; observability, backup, and security are composed from multiple open-source tools, or their Enterprise equivalents.
In short, Day 2 operations in KubeVirt demand a shift toward cloud-native operational models: automated, declarative, and resilient by design. But many concepts will feel familiar once mapped correctly to Kubernetes primitives, which is what I’ve aimed for (and hopefully achieved) in this blog post.
Open Source vs Enterprise Support: Real-World Deployment Considerations
KubeVirt is a fully open-source project, licensed under the Apache 2.0 license. Unlike VMware vSphere, there is no built-in commercial support or SLA; community-driven forums, mailing lists, GitHub issues, and public Slack channels are the primary support mechanisms.
Engineers adopting KubeVirt directly from the upstream project are expected to participate in community discussions, troubleshoot independently, and contribute improvements when possible. This open, collaborative model provides flexibility and transparency but demands a different operational mindset and skillset compared to traditional vendor-backed virtualization platforms.
For enterprises seeking commercial support, Red Hat offers OpenShift Virtualization, a productized, supported distribution of KubeVirt tightly integrated into OpenShift. Red Hat is a major contributor to the upstream KubeVirt project and extends it with tested lifecycle management, advanced monitoring, hardened security defaults, and SLA-backed support.
Red Hat’s close involvement ensures upstream innovations land quickly in their enterprise offering, keeping customers aligned with the community roadmap. Other vendors also pair Kubernetes distributions with KubeVirt for VM workloads, but Red Hat is currently considered the market leader in this space. Choosing an enterprise offering like OpenShift Virtualization provides a path to Kubernetes-native VM management without sacrificing the operational assurances organizations expect from traditional platforms like vSphere.
Tip: Red Hat OpenShift Virtualization abstracts much of the underlying Kubernetes and KubeVirt complexity with a polished GUI, powerful automation, and certified integrations; this is ideal for teams migrating from VMware who are looking for operational continuity. However, in my experience, most OpenShift consumers don't interact with the GUI.
Conclusion: Understanding the Trade-offs
KubeVirt introduces a fundamentally different operational model for virtualization, one that is deeply integrated into Kubernetes’ cloud-native approach to infrastructure management. For vSphere administrators, this represents both an opportunity and a challenge: familiar concepts like compute, storage, and networking persist, but the workflows, tooling, and abstractions shift towards declarative, API-driven models. Understanding these differences is critical to making an informed decision about adopting KubeVirt alongside, or instead of, traditional vSphere environments.
VMware vSphere: Pros and Cons
- Pros:
- Mature, enterprise-grade platform with extensive ecosystem integration (vRealize (Aria), NSX, Tanzu, VCF)
- Highly polished GUI with simple and intuitive management workflows
- Advanced feature set: HA, DRS, vMotion, VM encryption, snapshots, resource scheduling
- Large operational knowledge base and strong vendor support ecosystem
- Robust backup, monitoring, and lifecycle management tooling natively built-in
- Cons:
- Significant licensing and operational costs, especially post-Broadcom acquisition shifts
- Platform is less aligned with modern containerized and GitOps-centric workflows
- Scaling operational models (e.g., clusters, upgrades) are not as “elastic” as Kubernetes-based systems
KubeVirt: Pros and Cons
- Pros:
- Fully integrated into Kubernetes, enabling unified VM and container management under a single platform
- Open-source, extensible, and cost-effective; no proprietary hypervisor licensing
- Supports infrastructure-as-code, GitOps, and automation-first operational models
- Can leverage Kubernetes-native observability, security policies, and dynamic scaling
- Enables cloud-native modernization without immediately abandoning VM-based workloads
- Cons:
- Still maturing compared to vSphere’s decades of enterprise features and deep integrations
- Heavier operational complexity; requires strong Kubernetes competency
- Backup, live migration, HA features are available but depend heavily on proper storage and network configurations
- Limited GUI management tools; primarily YAML and API-driven workflows
Key Considerations Before Moving Platforms
- Assess your organization’s Kubernetes maturity. KubeVirt assumes familiarity with Kubernetes operators, CRDs, YAML manifests, and API-driven operations.
- Evaluate application modernization goals. KubeVirt shines when VMs coexist with or gradually transition into containerized microservices.
- Plan for new operational tooling. Monitoring, backup, security, and lifecycle management must be re-architected for a Kubernetes-native environment.
- Understand the shift in cost model. Open-source software like KubeVirt reduces licensing costs but increases the need for engineering skills and operational discipline.
- Start with hybrid environments. KubeVirt can run alongside vSphere workloads, enabling phased migrations instead of a disruptive “big bang” cutover.
Ultimately, your vSphere expertise provides an excellent foundation for learning and succeeding with KubeVirt. While the management tools, workflows, and mental models differ, the core principles (workload scheduling, resource allocation, networking, storage, and high availability) remain familiar. With the right training and adaptation, KubeVirt can extend your virtualization practice into the Kubernetes-native future, and it can be a home to replace your VMware platform, if that aligns with your employer's goals.
If you have questions about KubeVirt adoption, feel free to leave a comment or reach out — happy to share lessons learned.
Regards