As a vSphere administrator, you’ve built your career on understanding infrastructure at a granular level: datastores, DRS clusters, vSwitches, and HA configurations. You’re used to managing VMs at scale. Now you’re hearing about KubeVirt, and while it promises Kubernetes-native VM orchestration, it comes with a caveat: Kubernetes fluency is required. This post is designed to bridge that gap, not only explaining what KubeVirt is, but mapping its architecture, operations, and concepts directly to vSphere terminology and experience. By the end, you’ll have a mental model of KubeVirt that relates to your existing knowledge.
What is KubeVirt?
KubeVirt is a Kubernetes extension that allows you to run traditional virtual machines inside a Kubernetes cluster using the same orchestration primitives you use for containers. Under the hood, it leverages KVM (Kernel-based Virtual Machine) and QEMU to run the VMs (more on that further down).
Kubernetes doesn’t replace the hypervisor; it orchestrates it. Think of Kubernetes as the vCenter equivalent here: managing the control plane, networking, scheduling, and storage interfaces for the VMs, with KubeVirt as a plugin that adds VM resource types to this environment.
Tip: KubeVirt is under active development; always check latest docs for feature support.
Core Building Blocks of KubeVirt, Mapped to vSphere
| KubeVirt Concept | vSphere Equivalent | Description |
|---|---|---|
| VirtualMachine (CRD) | VM Object in vCenter | The declarative spec for a VM in YAML. It defines the template, lifecycle behaviour, and metadata. |
| VirtualMachineInstance (VMI) | Running VM Instance | The live instance of a VM, created and managed by Kubernetes. Comparable to a powered-on VM object. |
| virt-launcher | ESXi Host Process | A pod wrapper for the VM process. Runs QEMU in a container on the node. |
| PersistentVolumeClaim (PVC) | VMFS Datastore + VMDK | Used to back VM disks. For live migration, either ReadWriteMany PVCs or raw block-mode volumes are required, depending on the storage backend. |
| Multus + CNI | vSwitch, Port Groups, NSX | Provides networking to VMs. Multus enables multiple network interfaces. CNIs map to port groups. |
| Kubernetes Scheduler | DRS | Schedules pods (including VMIs) across nodes. Lacks fine-tuned VM-aware resource balancing unless extended. |
| Live Migration API | vMotion | Live migration of VMIs between nodes with zero downtime. Requires shared storage and certain flags. |
| Namespaces | vApp / Folder + Permissions | Isolation boundaries for VMs, including RBAC policies. |
KVM + QEMU: The Hypervisor Stack
Unlike vSphere, which uses the ESXi hypervisor with its tightly integrated kernel modules and management daemons, KubeVirt relies on a Linux-native hypervisor stack consisting of KVM (Kernel-based Virtual Machine), QEMU (Quick Emulator), and libvirt.
Here’s how each component fits into the stack:
- KVM: A kernel module that exposes virtualization extensions (Intel VT-x or AMD-V) to user space, allowing hardware-accelerated VM execution.
- QEMU: The userspace emulator and virtual machine monitor that interfaces with KVM to run guest operating systems.
- libvirt: A daemon and API abstraction layer that manages QEMU and KVM. It handles tasks such as starting, stopping, migrating, and configuring VMs.
libvirt acts as the orchestration layer on each node, similar in concept to how `hostd` or `vpxa` functions on ESXi hosts. In the KubeVirt model, libvirt is not exposed directly to the user; instead, it’s controlled by the KubeVirt components via the `virt-handler` daemon.
```
Kubernetes Node (Worker Node)
+---------------------------------------------------------------+
|                      KubeVirt Components                      |
|                                                               |
|  +------------------+        +------------------------------+ |
|  |   virt-handler   | <----> |       libvirtd daemon        | |
|  +------------------+        +------------------------------+ |
|                                          |                    |
|                                          v                    |
|                              +------------------+             |
|                              |   QEMU Process   |             |
|                              | (inside pod via  |             |
|                              |  virt-launcher)  |             |
|                              +------------------+             |
|                                          |                    |
|                                          v                    |
|                          Hardware Virtualization              |
|                                (via KVM)                      |
+---------------------------------------------------------------+
```
Each VM is run inside a special pod called `virt-launcher`, which spins up a QEMU process. This QEMU process communicates with libvirt locally on the node. The `virt-handler` is the component that talks to Kubernetes and orchestrates the lifecycle of VMIs by issuing commands to libvirt.
Here’s what happens when you launch a VM in KubeVirt:
- You create a `VirtualMachine` resource in Kubernetes.
- The `virt-controller` detects this and schedules a `virt-launcher` pod to a node.
- Once on the node, `virt-handler` takes over and communicates with `libvirt` to start a QEMU process inside the `virt-launcher` pod.
- KVM provides the necessary hardware acceleration to the QEMU process.
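If you want to watch this flow on a live cluster, the commands below give a rough view of the moving parts. This is a sketch with assumptions: it assumes KubeVirt is installed in the conventional `kubevirt` namespace, and the `kubevirt.io=virt-launcher` label is the one KubeVirt typically applies to launcher pods (adjust for your version).

```bash
# KubeVirt control-plane and per-node components (virt-api, virt-controller, virt-handler)
kubectl get pods -n kubevirt

# Running VM instances and the virt-launcher pods that wrap them
kubectl get vmi
kubectl get pods -l kubevirt.io=virt-launcher

# On a worker node: confirm the KVM device and kernel modules are present
ls -l /dev/kvm
lsmod | grep kvm
```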
libvirt is also responsible for features such as live migration, snapshotting, and CPU pinning—though these features must be surfaced via KubeVirt’s API to be usable in a Kubernetes-native way. It acts as the “hypervisor operations backend” for KubeVirt, much like how ESXi performs operations at the host level on behalf of vCenter.
As a vSphere admin, you can think of libvirt as the underlying hypervisor interface that KubeVirt wraps and controls, abstracting away direct hypervisor interaction in favour of Kubernetes-native APIs and workflows.
Storage: PVCs, Block Mode, and Live Migration
In a VMware vSphere environment, storage is managed through abstractions like datastores (VMFS, NFS, vSAN), and virtual disks are provisioned as `.vmdk` files. You assign these to VMs and manage availability, performance, and policies via Storage Policy-Based Management (SPBM). You also expect storage to be available to all hosts in a cluster to support features like vMotion and HA.
KubeVirt’s approach to VM storage is fundamentally different. Rather than mounting datastores at the hypervisor level, KubeVirt leverages Kubernetes’ Persistent Volume (PV) and Persistent Volume Claim (PVC) model. These are abstractions that describe how storage is provisioned and consumed by workloads, including VMs. The underlying volume can be block or file-based, provisioned statically or dynamically, and backed by any Kubernetes-compatible CSI (Container Storage Interface) driver.
A Kubernetes CSI (Container Storage Interface) driver acts as the translation layer between Kubernetes and the underlying storage system. It provides the logic for Kubernetes to dynamically provision volumes, attach or detach storage to pods (including VMs via KubeVirt), and handle volume lifecycle operations such as resizing and snapshotting.
In vSphere terms, a CSI driver is similar to how VMware’s Storage APIs (VAAI, VASA) interact with datastores, storage arrays, and vSphere’s Storage Policy Based Management (SPBM). Just as vSphere relies on storage providers (e.g., VASA providers for vSAN, Pure, or NetApp) to orchestrate volume operations behind the scenes, Kubernetes relies on CSI drivers to standardize storage operations across different vendors. Each CSI driver exposes features like dynamic provisioning, volume expansion, and snapshotting based on what the underlying storage platform supports. When deploying KubeVirt, the choice and configuration of the CSI driver is critical to VM performance, live migration capability, and data protection strategies.
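To make the CSI relationship concrete, here is a minimal StorageClass sketch. The provisioner name is illustrative (the Ceph RBD CSI driver in this case); a real deployment would also need driver-specific parameters such as pool and cluster identifiers, which vary per vendor.

```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: fast-vm-storage              # Referenced by PVCs that back VM disks
provisioner: rbd.csi.ceph.com        # Illustrative CSI driver; yours will differ
parameters:
  # Driver-specific settings (pool, clusterID, thin provisioning, etc.) go here
  csi.storage.k8s.io/fstype: ext4
allowVolumeExpansion: true            # Needed to grow a VM disk later, like extending a VMDK
reclaimPolicy: Delete
volumeBindingMode: WaitForFirstConsumer
```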
Mapping Concepts
| KubeVirt / Kubernetes | vSphere | Description |
|---|---|---|
| PersistentVolume (PV) | Datastore (VMFS/NFS) | The underlying physical or network-attached volume available to the cluster. |
| PersistentVolumeClaim (PVC) | VMDK File | The storage unit “claimed” by a VM. Acts like a virtual disk or VMDK mapped to the VM. |
| StorageClass | Storage Policy | Defines how volumes are provisioned (e.g., thin/thick, replicated), like SPBM in vSphere. Also defines the CSI driver to use, which is configured for the underlying physical storage. |
| volumeMode: Filesystem | VMFS, NFS | Mounts a formatted filesystem into the VM pod. Only usable if ReadWriteOnce (RWO) or ReadWriteMany (RWX) access is possible. |
| volumeMode: Block | Raw Device Mapping (RDM) | Presents a raw block device to the VM, similar to RDMs or raw disks in VMware. |
Note: In Kubernetes storage terminology, ReadWriteOnce (RWO) means the volume can be mounted as read-write by only a single node at a time. This is typical for most block storage backends. ReadWriteMany (RWX) means the volume can be mounted as read-write by multiple nodes simultaneously, which is necessary for scenarios like live migration in KubeVirt where both the source and destination nodes need concurrent access to the VM’s disk.
How VMs Access Storage in KubeVirt
When you define a `VirtualMachine` in KubeVirt, you attach one or more `volumes` backed by PVCs. These PVCs are bound to PVs that represent actual disks in your storage backend: this could be iSCSI, Ceph RBD, NFS, Fibre Channel, Portworx, or any other CSI-compatible storage system.
Volumes are attached inside the `virt-launcher` pod, and then passed through to the QEMU process as virtual disks (virtio or SATA, depending on config). These behave similarly to adding a VMDK to a VM’s SCSI or SATA controller in vSphere.
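As a minimal illustration of that disk/volume pairing (names here are placeholders; the fully annotated VM example later in this post shows the complete object):

```yaml
# Fragment of a VirtualMachine spec: a virtio disk backed by an existing PVC
spec:
  template:
    spec:
      domain:
        devices:
          disks:
            - name: rootdisk
              disk:
                bus: virtio          # Paravirtual disk bus, akin to PVSCSI in vSphere
      volumes:
        - name: rootdisk             # Must match the disk name above
          persistentVolumeClaim:
            claimName: my-vm-disk    # Placeholder PVC name
```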
Live Migration: What Changes
In vSphere, vMotion (the ability to migrate a running VM between hosts in a cluster) is possible because ESXi hosts in a cluster have access to the same shared storage and common networking. The VMDK does not need to be moved; only the memory and device state of the VM is transferred across hosts. (There are some finer nuances to this statement, such as in the case of shared-nothing vMotion.) The same concept applies in KubeVirt, but with more explicit configuration.
Live migration in KubeVirt requires:
- The VM’s disks must be on storage accessible to both source and destination nodes.
- The PVCs must support either:
  - ReadWriteMany (RWX) access mode – typical for CephFS, NFS, or Portworx shared volumes.
  - Block-mode volumes (`volumeMode: Block`) – typically backed by Ceph RBD or raw iSCSI volumes.

If your PVC is using the default filesystem mode and only supports ReadWriteOnce (RWO), live migration will fail, because the destination node cannot mount the volume while it is in use.
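For example, a PVC intended for a live-migratable VM might request RWX access. This is a sketch with illustrative names; it assumes your StorageClass and backend actually support RWX.

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: migratable-vm-disk
spec:
  storageClassName: shared-rwx       # Illustrative class backed by NFS/CephFS/etc.
  accessModes:
    - ReadWriteMany                   # Source and destination nodes can mount concurrently
  volumeMode: Filesystem              # Or Block, for distributed block backends like Ceph RBD
  resources:
    requests:
      storage: 40Gi
```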
Common Storage Topologies for KubeVirt
Just as vSphere environments are typically designed around shared VMFS datastores, NFS mounts, or vSAN clusters, KubeVirt requires careful planning around how storage is presented to the Kubernetes cluster and its nodes. Below are several common storage topologies used with KubeVirt, with pros, cons, and practical considerations.
1. Shared Filesystem (RWX + Filesystem) – e.g. NFS, CephFS

This topology is equivalent to mounting an NFS datastore on all ESXi hosts. A shared POSIX-compliant filesystem (a filesystem that follows standard UNIX file operation rules for permissions, locking, and file handling, ensuring compatibility across systems) is mounted and accessed concurrently by multiple nodes.

The volume is presented to Kubernetes with `ReadWriteMany` (RWX) mode, allowing VM live migration and multiple pod access.
- Use cases: Simplified backup (file-based), live migration support, multi-node access.
- Pros: Easy to configure, supports snapshots, compatible with most storage backends.
- Cons: Latency-sensitive applications may suffer. Requires tuning to avoid locking issues.
2. Distributed Block Storage (RWX / Block) – e.g. Ceph, Portworx

Here, the storage backend provides distributed block storage accessible from multiple nodes. The underlying block device may be replicated across the cluster, and the CSI driver handles the synchronization. This supports both RWX and Block volume modes depending on the CSI features.
- Use cases: Stateful VMs with HA, live migration with block devices, fault tolerance.
- Pros: High performance, supports filesystem or raw block, survives node failure.
- Cons: Requires CSI that supports RWX or replication, more complex configuration.
3. Block Storage (RWO + Block Mode) – e.g. Ceph RBD, iSCSI
This topology is akin to using raw device mappings (RDM) in vSphere. The volume is presented to a single node at a time, as a raw block device, using `volumeMode: Block`. Live migration is still possible because the volume is not mounted with a filesystem and can be passed between nodes if the storage supports multi-attachment.
- Use cases: High-throughput workloads, database VMs, VMs requiring live migration without RWX.
- Pros: Low overhead, no filesystem management, good for IOPS-intensive workloads.
- Cons: Only one node can access at a time. Migration depends on backend compatibility.
4. Local Storage (RWO + Filesystem) – e.g. HostPath, LVM, local disks
Each node uses its own local disk or partition, and volumes are tied to the node where they are created. This is like placing a VM on a local-disk-only ESXi host: fast, but not portable.
- Use cases: Dev/test clusters, workloads that don’t require migration or HA.
- Pros: Simple to deploy, good performance, no external dependencies.
- Cons: No live migration, no HA, data loss if node fails.
Manual Migration of a VM Using Local Storage
If a VM in KubeVirt is using local storage (RWO + Filesystem), live migration is not possible because the data is tied to the specific node. To migrate the VM to another node, you must perform a manual process with controlled downtime:
- Shutdown the VM: Gracefully power off the VirtualMachineInstance (VMI) to ensure disk consistency.
  `virtctl stop vm-name`
- Backup or copy the VM disk: Manually copy the PVC data from the source node to a destination that is accessible by the target node. Options include:
  - Use `kubectl cp` if HostPath is accessible.
  - Use `rsync` or `scp` directly between nodes for raw disk files (e.g., qcow2 or raw images).
- Recreate the PVC on the target node: Create a new PVC pointing to the new local path on the destination node. Make sure the `nodeAffinity` setting binds it to the correct node.
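A rough sketch of that last step, a local PersistentVolume pinned to the destination node via `nodeAffinity`. Node name, path, and StorageClass are placeholders for your environment.

```yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: webserver-disk-node02
spec:
  capacity:
    storage: 40Gi
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  storageClassName: local-storage          # Placeholder local StorageClass
  local:
    path: /var/lib/vm-disks/webserver      # Where the copied disk image now lives
  nodeAffinity:                            # Pins the volume (and therefore the VM) to this node
    required:
      nodeSelectorTerms:
        - matchExpressions:
            - key: kubernetes.io/hostname
              operator: In
              values:
                - worker-02
```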
5. Object Storage for Templates/Import – e.g. S3, MinIO
While not used for VM root disks directly, object storage is often used in conjunction with CDI (Containerized Data Importer) to import VM images, snapshots, or ISOs. Think of it as your vSphere Content Library or image datastore.
- Use cases: Template management, ISO boot media, VM cloning.
- Pros: Decouples storage from compute, easily scalable, supports CI/CD workflows.
- Cons: Requires CDI integration, slower for hot data access.
Storage Design Tip
In vSphere, storage planning focuses on performance tiers and redundancy. In KubeVirt, the same is true, but you also need to consider how storage interacts with Kubernetes primitives like volume claims, access modes, and CSI driver capabilities. Not all backends support `ReadWriteMany`, and not all support `volumeMode: Block`. Match your topology to your operational goals, especially if you plan to support live migration or automatic rescheduling.
Other Considerations and Notes
- StorageClass defines provisioning behaviour. Thin provisioning, volume expansion, and IOPS caps are controlled here, similar to SPBM policies in vSphere.
- CDI (Containerized Data Importer) is often used to import VM disk images (e.g., QCOW2, ISO) into PVCs. This is your equivalent of cloning a template or deploying from an OVA (see the sketch after this list).
- Snapshots are available if your CSI driver supports them, enabling backups or templated deployments.
- Hot-add and hot-remove of disks is supported for VMIs through Kubernetes patch operations, though not all combinations of backends support it cleanly.
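As a hedged example of the CDI workflow referenced above, a DataVolume can pull a cloud image over HTTP into a PVC that a VM then boots from. The image URL and sizes are placeholders.

```yaml
apiVersion: cdi.kubevirt.io/v1beta1
kind: DataVolume
metadata:
  name: ubuntu-template-disk
spec:
  source:
    http:
      url: https://example.com/images/ubuntu-22.04.qcow2   # Placeholder image location
  pvc:
    accessModes:
      - ReadWriteOnce
    resources:
      requests:
        storage: 20Gi          # Target PVC size; must be at least the virtual size of the image
```

Conceptually, this is the closest thing to deploying from an OVA or cloning a content-library template: CDI downloads and converts the image, and the resulting PVC becomes the VM's root disk.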
Storage Summary
In vSphere, storage is abstracted and well-integrated into the virtual infrastructure stack, with a rich UI and robust automation. In KubeVirt, storage management is explicit and decoupled via Kubernetes. You gain flexibility, portability, and integration into GitOps workflows, but must also take care in selecting the right `accessModes` and `volumeMode` for your use case.
If live migration, high availability, or dynamic scaling are requirements, choosing the correct backend and configuring it properly in your StorageClasses and PVCs is essential. Once understood, KubeVirt’s storage model offers similar parallels to vSphere, but exposes more of the underlying plumbing, something platform engineers will appreciate.
Networking: Multus, Bridge Mode, and Pod Network
In vSphere, networking revolves around virtual switches (vSwitches), port groups, VLAN-backed segments, and distributed switches managed via vCenter and optionally NSX. Each VM is assigned one or more virtual NICs (vNICs), which are then connected to port groups that map to physical or overlay networks. KubeVirt has a similar model, but it builds on top of Kubernetes networking primitives, with extra flexibility (and complexity) through the use of plugins like Multus.
VMware vs KubeVirt Networking: Feature Comparison
| Networking Feature | VMware vSphere | KubeVirt | Notes |
|---|---|---|---|
| Basic virtual switch | vSS (Standard Switch) | Pod Network via default CNI (e.g., OVN, Calico) | Default pod network connects VMs similarly to a basic vSwitch |
| Distributed virtual switch | vDS (Distributed Switch) | Multus + bridge/macvlan CNI on each node | Multus mimics vDS by attaching VMs to consistently named networks across nodes |
| Port group (access VLAN) | Named port group with VLAN ID | NetworkAttachmentDefinition + CNI config (e.g., macvlan with VLAN) | VLAN ID and interface behaviour defined in YAML as part of CNI config |
| Port group (trunked) | Port group with VLAN trunk range | SR-IOV or DPDK-enabled interfaces with VF VLAN trunking | Depends on SR-IOV NIC driver and host config (e.g., VF passthrough) |
| vNIC assignment | GUI or automation (e.g., add vNIC in VM settings) | YAML definition using interfaces + networks in VM spec | Multiple interfaces defined declaratively; no GUI unless using a dashboard |
| MAC address allocation | Auto-assigned or static via vNIC settings | Static or generated in YAML via macAddress field | Admin must ensure uniqueness if using static assignment |
| VLAN tagging | Configured at port group level or within guest | Defined per network interface in CNI plugin config | macvlan/bridge plugins support VLAN via JSON config fields |
| NIC passthrough | SR-IOV or DirectPath I/O | SR-IOV with VF assignment to VMs via Multus | Requires SR-IOV CNI plugin and pre-configured VFs on hosts |
| Network segmentation (L2/L3) | VLANs, VXLANs (via NSX) | Multiple CNIs, flat or overlay networks, kube-router, or OVN | Depends on CNI plugin capabilities; Cilium supports L3 through L7 network policies |
| Security policies | NSX Distributed Firewall | Kubernetes Network Policies | Enforced by CNI plugin; not all CNIs support ingress/egress policies. Cilium offers the closest NSX-like capabilities for KubeVirt |
| Load balancer access | NSX LB, HAProxy, F5 integration | Kubernetes Services of type LoadBalancer or Ingress controller | External LB integration depends on cloud provider or MetalLB. Cilium offers its own ingress controller and LB functions |
| DNS and DHCP | Provided via guest tools or PXE boot infra | KubeVirt provides basic DHCP; external DHCP required for L2 bridge/SR-IOV | Static IPs can also be configured via cloud-init or the guest agent. DNS is typically handled by the Kubernetes cluster's in-cluster provider before forwarding upstream to your infrastructure/DC DNS servers |
Default Pod Network (eth0)
By default, every pod in Kubernetes, including KubeVirt VMs, receives a network interface connected to the cluster’s default CNI (Container Network Interface). This is typically Calico, Cilium, Flannel, or OVN-Kubernetes. This interface, usually named `eth0`, connects the VM to the same network namespace as Kubernetes pods.
- Use case: VMs that need to communicate with services inside the Kubernetes cluster (e.g., microservices, DNS, ingress). By connecting to the Kubernetes CNI, you can typically consume all the features of the CNI for both containers and VMs.
- vSphere comparison: Like connecting a VM to a standard vSwitch with no VLAN tagging; it’s purely for internal traffic unless explicitly routed out.
Multus: Secondary Networks
Multus is a Kubernetes CNI meta-plugin that enables a pod (and by extension a VM) to attach to multiple networks. This is how you attach a VM to external or isolated networks beyond the default pod network. Each additional network is defined via a `NetworkAttachmentDefinition`, referencing a specific CNI plugin (e.g., macvlan, SR-IOV, bridge, or host-device).
- Use case: Connecting VMs to external L2/L3 networks, VLANs, or legacy infrastructure.
- vSphere comparison: Equivalent to assigning a VM multiple vNICs on different port groups across vSwitches or NSX segments.
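A hedged example of a NetworkAttachmentDefinition using the macvlan CNI plugin. The master interface (here a VLAN 50 sub-interface on the node), the NAD name, and the choice of DHCP IPAM (which requires the CNI DHCP daemon on the nodes) are all environment-specific assumptions.

```yaml
apiVersion: k8s.cni.cncf.io/v1
kind: NetworkAttachmentDefinition
metadata:
  name: vlan50-net                 # Referenced by name from the VM's networks section
spec:
  config: |
    {
      "cniVersion": "0.3.1",
      "type": "macvlan",
      "master": "eth1.50",
      "mode": "bridge",
      "ipam": { "type": "dhcp" }
    }
```

Think of the NetworkAttachmentDefinition as the YAML equivalent of creating a new port group: once it exists, any VM in that namespace can reference it by name.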
Bridge Binding
Bridge mode allows VMs to be connected directly to the host’s network interface using Linux bridge devices. This is often used in bare-metal clusters where the physical NIC is bridged into the VM, offering L2 adjacency to physical networks.
- Pros: Simple, no overlay. VMs appear as native devices on the physical network.
- Cons: No namespace isolation. Requires careful IPAM configuration.
- vSphere comparison: Like assigning a VM to a standard port group on a vSwitch backed by a physical NIC.
SR-IOV: High Performance Networking
For low-latency or high-throughput workloads, KubeVirt supports SR-IOV (Single Root I/O Virtualization), where physical NICs are partitioned into Virtual Functions (VFs) and assigned directly to VMs. This bypasses the kernel and virtual switch entirely, achieving near-native performance.
- Use case: NFV, telco workloads, latency-sensitive apps.
- Requirements: NICs with SR-IOV support, kernel drivers, and appropriate configuration on the host and Kubernetes.
- vSphere comparison: Similar to using DirectPath I/O passthrough or VMXNET3 with SR-IOV enabled in ESXi.
Masquerade, Bridge, and Slirp Modes (Interface Bindings)
When defining interfaces for a VM, you choose a `binding` method that controls how the VM interface connects to the underlying pod network.
| Binding Method | Description | vSphere Analogy |
|---|---|---|
| Masquerade | VM gets a private IP, NAT’ed behind the pod’s IP. Useful for typical pod-like behaviour. | NAT-backed network via edge firewall or NSX-T Tier-1 gateway |
| Bridge | VM gets an IP on the pod’s network. Requires L2 access; works well in bare-metal. | Standard vSwitch access port group |
| Slirp (User-mode) | QEMU’s user-mode networking. Limited, useful only in developer scenarios. | Like a VM in Workstation with NAT’d local-only networking |
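In the VM spec, the binding sits on each interface and is paired by name with a network source. A minimal sketch (network names are placeholders; `vlan50-net` refers to the NetworkAttachmentDefinition example earlier):

```yaml
# Fragment of a VM spec showing interface bindings
spec:
  template:
    spec:
      domain:
        devices:
          interfaces:
            - name: default
              masquerade: {}            # NAT behind the pod IP, the common default
            - name: datacenter-vlan
              bridge: {}                # L2 adjacency via a Multus-defined secondary network
      networks:
        - name: default
          pod: {}
        - name: datacenter-vlan
          multus:
            networkName: vlan50-net     # NetworkAttachmentDefinition created separately
```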
Advanced Networking Considerations
- IP Address Management (IPAM): VMs connected to the default pod network typically receive an IP automatically via Kubernetes’ internal networking model, with NAT (masquerade) applied by default. Direct static IP assignment on the default network is not straightforward without CNI customization. For secondary networks defined via Multus, IP allocation is handled by the specific CNI plugin (e.g., DHCP, static IPAM, or manual addressing), providing greater control for static or routed network designs.
- Firewalling: VMs are subject to Kubernetes Network Policies, if enforced by the CNI plugin. This maps loosely to NSX security groups or DFW rules.
- Load Balancing: VMs can be exposed via Kubernetes Services (ClusterIP, NodePort, LoadBalancer) or external ingress controllers.
Firewalling and Microsegmentation
In a vSphere environment, network security is often enforced using NSX Distributed Firewall (DFW) rules or security groups. These allow administrators to define microsegmentation policies between workloads, often based on VM tags, port groups, or other metadata.
In KubeVirt, VMs are subject to Kubernetes Network Policies. These policies control ingress and egress traffic to and from pods—including VMs—based on IP blocks, namespaces, and labels. However, not all CNI plugins enforce these policies at the dataplane level. Some CNIs treat policies as advisory unless explicitly configured with enforcement engines.
Cilium, the only CNCF-graduated CNI with eBPF-based dataplane enforcement, provides a much richer model. It supports L3-L7 network policy enforcement, including fully stateful firewalling, DNS-aware rules, and visibility tooling. For NSX users accustomed to detailed DFW rule sets and East-West traffic control, Cilium comes closest to replicating those capabilities in a Kubernetes-native way—with the added benefit of being programmable via GitOps and integrated with observability pipelines.
Key considerations:
- If using a basic CNI (e.g., Flannel), network policies may not be enforced at all.
- OVN-Kubernetes supports basic policy enforcement, closer to traditional iptables-based firewalls.
- Cilium provides the most NSX-like experience: distributed, stateful, programmable security with full auditability.
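To illustrate what a basic policy looks like, here is a sketch of a standard Kubernetes NetworkPolicy scoped to VMs. It assumes the VM template labels (such as the `kubevirt.io/domain` label used in the example later in this post) are propagated to the virt-launcher pods, which is how KubeVirt exposes VMs to label selectors; the names and port are placeholders.

```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-web-to-db-vm
  namespace: vms                          # Placeholder namespace holding the VMs
spec:
  podSelector:
    matchLabels:
      kubevirt.io/domain: database-vm     # Matches the database VM's virt-launcher pod
  policyTypes:
    - Ingress
  ingress:
    - from:
        - podSelector:
            matchLabels:
              kubevirt.io/domain: webserver-vm
      ports:
        - protocol: TCP
          port: 5432                      # Only the web VM may reach the database port
```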
Networking Summary
In vSphere, virtual networking is mostly abstracted behind GUI-defined port groups and uplinks. In KubeVirt, you must build those abstractions explicitly using Kubernetes CRDs and CNI plugins. While the learning curve is steeper, the flexibility is significant—you can build complex, programmable networking topologies using just YAML and open-source plugins. Once mastered, this model gives you infrastructure-as-code control over VM connectivity and security in ways traditional hypervisor stacks don’t expose natively.
VM Lifecycle: Creating, Starting, and Operating
In vSphere environments, VM lifecycle management revolves around intuitive GUIs and workflows through vCenter Server. Creating VMs, modifying hardware, and managing templates are standardized and straightforward.
KubeVirt, operating within Kubernetes, provides similar capabilities but exposes them through YAML manifests, Kubernetes APIs, and command-line tools like `kubectl` and `virtctl`. While the fundamental tasks remain the same, the operational experience changes significantly, demanding a more declarative, automation-driven mindset.
Common VM Lifecycle Operations: vSphere vs KubeVirt
| Task | VMware vSphere | KubeVirt | Notes |
|---|---|---|---|
| Create a new VM | Use vSphere Web Client: New VM Wizard (select compute, storage, network, hardware) | Write a YAML manifest for a VirtualMachine resource and apply via `kubectl` | VM spec defines CPU, memory, disks, NICs, cloud-init configuration |
| Clone a VM from template | Clone from template wizard | Use CDI (Containerized Data Importer) to clone PVCs; create VM referencing cloned volume | VM templates can be GitOps-managed (YAML stored in Git repo) |
| Power on/off a VM | Right-click VM > Power operations | Use `virtctl start vm-name` or `virtctl stop vm-name` | Power state is controlled via Kubernetes custom resources and CLI |
| Resize (extend) a VM disk | Edit VM settings > Increase VMDK size > Rescan inside OS | Expand the backing PVC if supported by StorageClass; patch VM spec if needed | StorageClass must support volume expansion; filesystem resize is manual inside guest |
| Create a new network | Create port group on vSwitch or vDS; assign VLAN or VXLAN IDs | Define a NetworkAttachmentDefinition (Multus) referencing CNI plugin and configuration | Network definitions are YAML objects stored in Kubernetes. If you are using a CNI that supports multiple IPAM pools or networks, there may be other specific steps needed |
| Add a NIC to a VM | Edit VM settings > Add Network Adapter > Connect to port group | Patch the VirtualMachine object to add a new interface and network | NICs are described declaratively; hot-plug requires virt-launcher pod support or a restart |
| Deploy VMs on new shared storage | Add new datastore > Present to hosts > Storage vMotion VMs if needed | Create new StorageClass in Kubernetes > Create PVCs from the StorageClass | VMs reference PVCs dynamically provisioned from new StorageClasses |
| Snapshot a VM | Take VM snapshot (memory + disk or disk only) | Create Kubernetes VolumeSnapshot for PVC (if CSI snapshotting is supported) | Snapshots are storage-native; full VM snapshots (memory + CPU state) are more complex |
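For the snapshot row above, a hedged sketch of a CSI VolumeSnapshot is shown below; it assumes your CSI driver provides a VolumeSnapshotClass (the class and PVC names are placeholders).

```yaml
apiVersion: snapshot.storage.k8s.io/v1
kind: VolumeSnapshot
metadata:
  name: webserver-pvc-snap-01
spec:
  volumeSnapshotClassName: csi-snapclass     # Provided by your CSI driver
  source:
    persistentVolumeClaimName: webserver-pvc # The PVC backing the VM's disk
```

Note this captures the disk only, comparable to a disk-only snapshot in vSphere; memory and CPU state are not included.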
In this video, Michael Leavan shows how you would upload a Guest OS ISO into the Containerized Data Importer, so that you can build a virtual machine from it.
I’ve included this because it’s one of the more “complex” tasks to get your head around in terms of how it works; the steps themselves aren’t very complex, but it’s a bit different to vSphere (or maybe you’ll find it similar to using `ovftool` to upload into a datastore?).
How VM Definitions Work in KubeVirt
In KubeVirt, a VM is not a “binary” file like a `.vmx` config in VMware. Instead, it is a declarative YAML object stored in the Kubernetes etcd database, describing the desired state of the VM: CPU, memory, disks, networking, cloud-init setup, and more. These definitions can be version-controlled, deployed via GitOps, and updated dynamically as needed.
Below is an annotated YAML example to help vSphere administrators understand the equivalent concepts:
```yaml
apiVersion: kubevirt.io/v1
kind: VirtualMachine
metadata:
  name: webserver-vm
spec:
  running: true                     # (Equivalent to setting VM to "powered on" after creation)
  template:
    metadata:
      labels:
        kubevirt.io/domain: webserver-vm
    spec:
      domain:
        cpu:
          cores: 2                  # (VM will have 2 vCPUs, just like setting in vSphere)
        devices:
          disks:
            - name: rootdisk
              disk:
                bus: virtio         # (Virtio disk controller, similar to choosing SCSI/Paravirtual in VMware)
          interfaces:
            - name: default
              bridge: {}            # (Attach to default pod network, similar to basic vSwitch access)
            - name: secondary-net
              bridge: {}            # (Attach to an external network via Multus, like adding a second vNIC to another VLAN/portgroup)
        resources:
          requests:
            memory: 4Gi             # (4GB RAM assigned to the VM, similar to vSphere VM memory configuration)
      networks:
        - name: default
          pod: {}                   # (Connects to the cluster’s default CNI network, auto-assigned IPs)
        - name: secondary-net
          multus:
            networkName: multus-external-network   # (Secondary network from Multus, like another portgroup)
      volumes:
        - name: rootdisk
          persistentVolumeClaim:
            claimName: webserver-pvc               # (Root disk attached via PVC, similar to adding a VMDK from a datastore)
```
Key points for vSphere admins:
- CPU, memory, disk, and NICs are all explicitly defined in YAML, similar to VM hardware settings in the vSphere Web Client.
- Networking is declarative: interfaces and network sources must both be defined and matched by name.
- Each network interface maps to a different network type (default pod network vs Multus-defined networks).
- Storage is attached by referencing a PersistentVolumeClaim (PVC), which must be created beforehand or dynamically provisioned via StorageClass.
- Power state is controlled via the `running: true` or `running: false` setting.
Example: Adding a New Network Interface to a Running VM
Adding a new NIC in vSphere is a few clicks in the UI. In KubeVirt, you patch the VM definition. Depending on configuration, this may require restarting the VM, unless hotplug support is fully enabled.
```bash
kubectl patch vm webserver-vm --type='json' -p='[
  {
    "op": "add",
    "path": "/spec/template/spec/networks/-",
    "value": { "name": "secondary-net", "multus": { "networkName": "multus-secondary-net" } }
  },
  {
    "op": "add",
    "path": "/spec/template/spec/domain/devices/interfaces/-",
    "value": { "name": "secondary-net", "bridge": {} }
  }
]'
```
This patch:
- Attaches the VM to a secondary Multus-defined network
- Creates a new `bridge` device inside the VM
- Is equivalent to adding a vNIC connected to a different port group
Example: Expanding a VM Disk (PVC)
Expanding a VM disk in KubeVirt requires two steps:
- Expand the PVC size if the underlying StorageClass allows volume expansion.
- Trigger a filesystem resize inside the guest OS manually.
```bash
kubectl patch pvc webserver-pvc --patch '
spec:
  resources:
    requests:
      storage: 40Gi
'
```
After resizing the PVC, you need to rescan and expand the partition/filesystem inside the guest OS, exactly as you would inside a Linux or Windows VM after expanding a VMDK in vSphere.
Operational Mindset Shift
For vSphere admins, KubeVirt introduces a shift away from GUI-driven operations toward API-driven, infrastructure-as-code management. While tasks like deploying a VM or expanding storage remain conceptually similar, the tooling and workflows change significantly. Embracing GitOps, YAML-based configuration, and automation pipelines becomes crucial for efficient day-2 operations at scale.
Tooling and Interacting with KubeVirt
In vSphere, day-to-day operations are performed through the vCenter UI, vSphere Client, or CLI tools like PowerCLI. In KubeVirt, your primary tools are:
- kubectl: The standard Kubernetes CLI tool used to interact with cluster resources, including VirtualMachines (VM), VirtualMachineInstances (VMI), PersistentVolumeClaims (PVCs), and other CRDs.
- virtctl: A KubeVirt-specific CLI tool that extends kubectl functionality. It provides simplified commands for VM operations like starting, stopping, connecting to the console, and initiating live migrations.
Example operations:
```bash
# List all VirtualMachine (VM) objects (whether running or not)
kubectl get vm

# List all active VirtualMachineInstance (VMI) objects (only running VMs)
kubectl get vmi

# Start a VM (transition it from defined to running state)
virtctl start vm-name

# Connect to the console of a running VM (serial console access)
virtctl console vm-name

# Live migrate a running VM to another node in the cluster
virtctl migrate vm-name
```
The below video from the KubeVirt project shows a practical demo of the command line interactions for running and managing VMs with KubeVirt.
The following operations are shown in the video:
- Creating a Virtual Machine from a YAML and a cloud image
- Starting the Virtual Machine
- Connecting through the console and SSH to the Virtual Machine
- Connecting to the Virtual Machine through VNC
- Stopping and removing the Virtual Machine
Because KubeVirt objects are just Kubernetes resources, they can be managed using any Kubernetes-native tooling. This naturally leads to GitOps workflows:
- Define VMs as YAML: Create VirtualMachine manifests declaratively, specifying CPU, memory, storage, and networking.
- Store in Git repository: Version control your VM definitions alongside other cluster resources.
- Deploy via Pipelines: Use a GitOps controller (e.g., ArgoCD, Flux) to automatically apply changes to the Kubernetes cluster when a new commit is made.
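As a sketch of the “deploy via pipelines” step, an Argo CD Application pointing at a Git path full of VirtualMachine manifests might look like the following. The repository URL, path, and namespaces are placeholders, and this assumes Argo CD is installed in its usual `argocd` namespace.

```yaml
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: vm-fleet
  namespace: argocd
spec:
  project: default
  source:
    repoURL: https://git.example.com/infra/vm-definitions.git   # Placeholder repo
    targetRevision: main
    path: production/vms                # Directory of VirtualMachine YAML manifests
  destination:
    server: https://kubernetes.default.svc
    namespace: vms
  syncPolicy:
    automated:
      prune: true                       # Deleting a VM manifest in Git decommissions the VM
      selfHeal: true                    # Drift from Git is automatically corrected
```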
GitOps makes managing VMs similar to managing Kubernetes deployments, configmaps, or services — giving you full audit trails, code reviews, and rollback capability for your virtual infrastructure.
For vSphere admins, think of this as replacing manual VM deployments or vSphere templates with a fully automated, source-controlled VM lifecycle management system.
Tip: If you're using GitOps, you can automate not just VM creation, but also updates, migrations, and decommissioning through pull requests. In many ways, GitOps workflows with KubeVirt mirror what vRealize Automation (Aria Automation) provides in vSphere environments — templated VM definitions, automated deployments, version-controlled change management, and full auditability. However, instead of a graphical blueprint designer, you manage VMs declaratively via YAML stored in Git repositories, and delivery is triggered through Git workflows and CI/CD pipelines. In a GitOps model, infrastructure becomes self-healing and self-documenting by design.
kubectl vs virtctl: When to Use Each Tool
| Tool | Primary Use | Example Actions |
|---|---|---|
| `kubectl` | Manage Kubernetes resources directly, including VirtualMachines (VM) and VirtualMachineInstances (VMI). | List, create, delete, patch, and inspect VMs or VMIs. |
| `virtctl` | Perform VM-specific operational actions that are not native Kubernetes verbs. | Start, stop, console access, live migrate VMs. |
Day 2 Operations: What Changes?
Once a KubeVirt environment is up and running, Day 2 operations (the management, maintenance, monitoring, troubleshooting, and scaling of VMs) introduce significant differences compared to a traditional vSphere-based platform. While many of the fundamental concerns remain (availability, performance, backup, compliance), the tools, patterns, and operational expectations are very different.
Comparison: Day 2 Operations in vSphere vs KubeVirt
| Operational Task | VMware vSphere | KubeVirt | Notes |
|---|---|---|---|
| VM Monitoring | vCenter + vRealize Operations Manager | Prometheus metrics, Grafana dashboards, Kubernetes events | Requires Prometheus Operator or custom tooling for dashboards and alerting |
| Log Management | VM console logs, syslog aggregation (e.g., vRealize Log Insight) | Pod logs via `kubectl logs`; centralized logging with Elasticsearch, Fluentd, and Kibana (EFK) / Prometheus, Loki, and Grafana (PLG) stacks | Each VM’s virt-launcher pod generates logs; guest logs are still collected traditionally |
| Host Maintenance | Enter ESXi into Maintenance Mode; vMotion VMs off | Cordon Kubernetes nodes; live migrate VMIs off the node | Requires draining pods or explicitly migrating VMs before node shutdown |
| Backup and Restore | VM snapshots, VADP-based backups (Veeam, Commvault) | A plugin/software offering such as Velero for cluster resource backup, VolumeSnapshots for storage backup. Kasten is another fantastic offering | Separate backup of VM resource definitions and underlying PVCs recommended |
| Patching & Upgrades | vSphere Update Manager (VUM) for ESXi and vCenter upgrades | Kubernetes rolling node upgrades; KubeVirt Operator manages KubeVirt upgrades | Zero-downtime upgrades require careful workload disruption management |
| Security Management | vSphere Permissions, NSX DFW, VM Encryption | RBAC policies, Kubernetes Network Policies (or CNI-based ones such as Cilium Network Policies), PVC encryption at the storage layer | Security shifts to Kubernetes-native constructs and storage backend features |
| Capacity Planning | vROps, vCenter Resource Pools, DRS Clusters | Cluster resource requests/limits, vertical scaling with VM templates | Capacity is managed through scheduling hints and resource quotas |
Note: Yes, I used the old VMware names for the cloud management suite rather than the newer Aria Operations, Automation and Logs name.
Monitoring VMs and Infrastructure
In vSphere, you rely on vCenter and vRealize (Aria) Operations Manager for a unified monitoring and alerting view. In KubeVirt, observability requires integrating multiple systems:
- Metrics: Exposed via Prometheus from KubeVirt components (virt-controller, virt-handler, virt-launcher)
- Logs: Collected from pod logs (`virt-launcher` pods) and the systemd journal on nodes
  - Logs from inside the VM instance (Guest OS) itself also need to be considered.
- Events: Kubernetes event stream is critical for detecting VM lifecycle anomalies (e.g., migration failures)
Grafana dashboards and custom alert rules are commonly built for VMIs, tracking metrics like:
- CPU and memory usage inside VMs (guest metrics if QEMU guest agent is installed)
- Migration success/failure rates
- Node availability and resource pressure (e.g., disk space, RAM, CPU throttling)
Troubleshooting VMs
Troubleshooting in vSphere typically involves reviewing vCenter alarms, ESXi host logs, and VM event logs. In KubeVirt, troubleshooting spans multiple layers:
- Pod-Level Issues: Use `kubectl describe pod virt-launcher-xxx` and `kubectl logs` to inspect container startup failures.
- VM-Level Issues: Access VM consoles via `virtctl console vm-name` or use cloud-init logs if configured.
- Node-Level Problems: Check `virt-handler` logs, KVM device status, and node resource availability.
Typical troubleshooting flow:
```bash
# List all VirtualMachines, VirtualMachineInstances, and Pods in the namespace
kubectl get vm,vmi,pod -n {namespace}

# Describe the VirtualMachineInstance (VMI) to view detailed status, events, and conditions
kubectl describe vmi my-vm

# Retrieve logs from the virt-launcher pod, focusing on the 'compute' container where the QEMU process runs
kubectl logs virt-launcher-xxx -c compute

# Connect directly to the VM’s serial console for interactive troubleshooting inside the guest OS
virtctl console my-vm
```
Patch Management and Upgrades
While vSphere provides a GUI-driven Update Manager for patches and upgrades, in KubeVirt:
- Kubernetes nodes are patched and upgraded via automated pipelines; this is highly dependent on the Kubernetes platform and/or the method by which the Kubernetes cluster was deployed (e.g., Kured, Cluster API, RKE2, or manual drain + upgrade).
- KubeVirt itself is upgraded via the virt-operator and a rolling restart of components across the cluster.
- Live-migrating VMs off a node during upgrades requires explicitly triggering live migrations or orchestrating node drains with eviction policies tuned for VMI objects.
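A hedged sketch of that node-maintenance flow (the closest thing to entering Maintenance Mode in vSphere); the node and VM names are placeholders, and the exact eviction behaviour depends on your VMs' evictionStrategy and any PodDisruptionBudgets in place.

```bash
# Stop new workloads landing on the node
kubectl cordon worker-02

# Optionally, live migrate a specific VM off the node ahead of the drain
virtctl migrate webserver-vm

# Drain the remaining pods; virt-launcher pods are evicted according to the
# VM's evictionStrategy (LiveMigrate triggers migrations automatically)
kubectl drain worker-02 --ignore-daemonsets --delete-emptydir-data

# After patching and rebooting, return the node to service
kubectl uncordon worker-02
```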
Security and Compliance
Security responsibility shifts significantly:
- RBAC: Kubernetes RBAC controls who can create, modify, delete VMs and associated resources.
- Network Security: Enforced through Kubernetes Network Policies (enhanced by CNIs like Cilium or OVN-Kubernetes).
- Storage Encryption: At-rest encryption handled by the underlying CSI storage backend, not by KubeVirt itself.
- VM Hardening: Done via guest OS security baselines, cloud-init scripts, or immutable VM images.
Of course this is not an exhaustive list of all the security considerations at hand.
Key Operational Changes for vSphere Admins
- Declarative over Imperative: Instead of clicking through wizards, you write or patch YAML manifests, which means you interface with a CLI, an API, or a GitOps platform.
- Automation First: Infrastructure-as-code and GitOps pipelines are standard for managing VMs and infrastructure changes.
- Cluster-Aware Thinking: No central “vCenter” view; resources are API-driven and node state matters for VM scheduling and uptime.
- Toolchain Diversity: No single pane of glass by default; observability, backup, and security are composed from multiple open-source tools, or their Enterprise equivalents.
In short, Day 2 operations in KubeVirt demand a shift toward cloud-native operational models: automated, declarative, and resilient by design. But many concepts will feel familiar once mapped correctly to Kubernetes primitives, which is what I’ve aimed for (and hopefully achieved) in this blog post.
Open Source vs Enterprise Support: Real-World Deployment Considerations
KubeVirt is a fully open-source project, licensed under the Apache 2.0 license. Unlike VMware vSphere, there is no built-in commercial support or SLA; community-driven forums, mailing lists, GitHub issues, and public Slack channels are the primary support mechanisms.
Engineers adopting KubeVirt directly from the upstream project are expected to participate in community discussions, troubleshoot independently, and contribute improvements when possible. This open, collaborative model provides flexibility and transparency but demands a different operational mindset and skillset compared to traditional vendor-backed virtualization platforms.
For enterprises seeking commercial support, Red Hat offers OpenShift Virtualization, a productized, supported distribution of KubeVirt tightly integrated into OpenShift. Red Hat is a major contributor to the upstream KubeVirt project and extends it with tested lifecycle management, advanced monitoring, hardened security defaults, and SLA-backed support.
Red Hat’s close involvement ensures upstream innovations land quickly in their enterprise offering, keeping customers aligned with the community roadmap. Other vendors also pair Kubernetes distributions with KubeVirt for VM workloads, but Red Hat is currently considered the market leader in this space. Choosing an enterprise offering like OpenShift Virtualization provides a path to Kubernetes-native VM management without sacrificing the operational assurances organizations expect from traditional platforms like vSphere.
Tip: Red Hat OpenShift Virtualization abstracts much of the underlying Kubernetes and KubeVirt complexity with a polished GUI, powerful automation, and certified integrations; this is ideal for teams migrating from VMware who are looking for operational continuity. However, in my experience, most OpenShift consumers don't interact with the GUI.
Conclusion: Understanding the Trade-offs
KubeVirt introduces a fundamentally different operational model for virtualization, one that is deeply integrated into Kubernetes’ cloud-native approach to infrastructure management. For vSphere administrators, this represents both an opportunity and a challenge: familiar concepts like compute, storage, and networking persist, but the workflows, tooling, and abstractions shift towards declarative, API-driven models. Understanding these differences is critical to making an informed decision about adopting KubeVirt alongside, or instead of, traditional vSphere environments.
VMware vSphere: Pros and Cons
- Pros:
- Mature, enterprise-grade platform with extensive ecosystem integration (vRealize (Aria), NSX, Tanzu, VCF)
- Highly polished GUI with simple and intuitive management workflows
- Advanced feature set: HA, DRS, vMotion, VM encryption, snapshots, resource scheduling
- Large operational knowledge base and strong vendor support ecosystem
- Robust backup, monitoring, and lifecycle management tooling natively built-in
- Cons:
- Significant licensing and operational costs, especially post-Broadcom acquisition shifts
- Platform is less aligned with modern containerized and GitOps-centric workflows
- Scaling operational models (e.g., clusters, upgrades) are not as “elastic” as Kubernetes-based systems
KubeVirt: Pros and Cons
- Pros:
- Fully integrated into Kubernetes, enabling unified VM and container management under a single platform
- Open-source, extensible, and cost-effective; no proprietary hypervisor licensing
- Supports infrastructure-as-code, GitOps, and automation-first operational models
- Can leverage Kubernetes-native observability, security policies, and dynamic scaling
- Enables cloud-native modernization without immediately abandoning VM-based workloads
- Cons:
- Still maturing compared to vSphere’s decades of enterprise features and deep integrations
- Heavier operational complexity; requires strong Kubernetes competency
- Backup, live migration, HA features are available but depend heavily on proper storage and network configurations
- Limited GUI management tools; primarily YAML and API-driven workflows
Key Considerations Before Moving Platforms
- Assess your organization’s Kubernetes maturity. KubeVirt assumes familiarity with Kubernetes operators, CRDs, YAML manifests, and API-driven operations.
- Evaluate application modernization goals. KubeVirt shines when VMs coexist with or gradually transition into containerized microservices.
- Plan for new operational tooling. Monitoring, backup, security, and lifecycle management must be re-architected for a Kubernetes-native environment.
- Understand the shift in cost model. Open-source software like KubeVirt reduces licensing costs but increases the need for engineering skills and operational discipline.
- Start with hybrid environments. KubeVirt can run alongside vSphere workloads, enabling phased migrations instead of a disruptive “big bang” cutover.
Ultimately, your vSphere expertise provides an excellent foundation for learning and succeeding with KubeVirt. While the management tools, workflows, and mental models differ, the core principles (workload scheduling, resource allocation, networking, storage, and high availability) remain familiar. With the right training and adaptation, KubeVirt can extend your virtualization practice into the Kubernetes-native future, and it can be a home to replace your VMware platform, if that aligns with your employer's goals.
If you have questions about KubeVirt adoption, feel free to leave a comment or reach out — happy to share lessons learned.
Regards