Helm upgrade --reuse-values Fails with Nil Pointer Error After a Chart Version Bump

If you have been running a Helm chart for a while and using --reuse-values to carry your previous configuration forward on upgrades, you may have hit an error like the one below after bumping to a new chart version:

Error: UPGRADE FAILED: template: acme-web-proxy/templates/deployment.yaml:22:15: executing "acme-web-proxy/templates/deployment.yaml" at <include "acme-web-proxy.podAnnotations" .>: error calling include: template: acme-web-proxy/templates/_helpers.tpl:41:71: executing "acme-web-proxy.podAnnotations" at <include (print $.Template.BasePath "/configmap.yaml") .>: error calling include: template: acme-web-proxy/templates/configmap.yaml:18:6: executing "acme-web-proxy/templates/configmap.yaml" at <include "acme-web-proxy.metrics.config" .>: error calling include: template: acme-web-proxy/templates/configmap.yaml:34:25: executing "acme-web-proxy/templates/configmap.yaml" at <.Values.server.metrics.enabled>: nil pointer evaluating interface {}.enabled

Running the same upgrade with an explicit values file works without issue:

helm upgrade -n my-namespace acme-web-proxy acme/web-proxy \
  --version 1.5.0 \
  -f helm/acme-web-proxy-values.yaml
Release "acme-web-proxy" has been upgraded. Happy Helming!

Read on to understand why these two commands behave differently and what you can do about it.

The Issue

When upgrading a Helm chart using --reuse-values, the upgrade fails with a nil pointer error. The error traces back to a template trying to access a values key that does not exist in the stored release values, in this case .Values.server.metrics.enabled. The same upgrade succeeds when you pass a values file explicitly using -f.

The Cause

The difference comes down to how Helm builds the set of values that is rendered into your chart templates.

With helm upgrade --reuse-values, Helm carries forward the values from the previous release and applies any --set or -f overrides on top. It does not start from the new chart version’s values.yaml defaults, so any key introduced in the new chart version is simply missing.

With helm upgrade -f values.yaml, Helm starts from the new chart’s values.yaml defaults and merges your file on top. Keys added in the new chart version are populated with their default values before your overrides are applied.

In the example above, chart version 1.5.0 added a new server.metrics.enabled key. The chart template accesses it directly without a nil guard:

{{- if .Values.server.metrics.enabled }}
  # metrics configuration block
{{- end }}

When you upgrade with --reuse-values, the server.metrics map does not exist in the reused values at all. Go’s template engine cannot evaluate .enabled on a nil pointer and the render fails immediately.

This is expected behaviour. The Helm documentation states that --reuse-values reuses the last release’s values and merges in any overrides from the command line. Merging in new chart defaults is not part of what it does.
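One way to see the gap is to check whether a key exists in the previous release's computed values and in the new chart's defaults. The sketch below assumes the release, namespace, chart, and version from the example above. key_in_file and report_key are hypothetical helper names, and the key check is a crude text match rather than a YAML parser.

```shell
# key_in_file FILE KEY
# Succeeds if FILE has a line that defines KEY at any indentation.
# Crude text match for illustration only; it is not a YAML parser.
key_in_file() {
  grep -Eq "^[[:space:]]*$2:" "$1"
}

# report_key RELEASE NAMESPACE CHART VERSION KEY
# Reports whether KEY appears in the deployed release's computed values
# and in the default values of the chart version you want to upgrade to.
report_key() {
  previous=$(mktemp)
  defaults=$(mktemp)
  # Computed values of the deployed release (--all includes chart defaults).
  helm get values "$1" -n "$2" --all -o yaml > "$previous"
  # values.yaml shipped with the target chart version.
  helm show values "$3" --version "$4" > "$defaults"
  if key_in_file "$previous" "$5"; then
    echo "previous release: $5 present"
  else
    echo "previous release: $5 missing"
  fi
  if key_in_file "$defaults" "$5"; then
    echo "new chart defaults: $5 present"
  else
    echo "new chart defaults: $5 missing"
  fi
  rm -f "$previous" "$defaults"
}
```

For the example in this post you would call report_key acme-web-proxy my-namespace acme/web-proxy 1.5.0 metrics. A key that is missing from the previous release but present in the new defaults is a candidate for this kind of failure.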

The Fix

There are three approaches depending on your workflow.

Option 1: Always upgrade with an explicit values file

In my opinion, this is the most reliable approach. Keep a values file that captures every override you need and pass it on every upgrade:

helm upgrade -n my-namespace acme-web-proxy acme/web-proxy \
  --version 1.5.0 \
  -f helm/acme-web-proxy-values.yaml

Helm loads the new chart’s values.yaml defaults first and then applies your file on top. New keys get their defaults and your existing overrides stay intact.
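If you have relied on --reuse-values until now, you may not have an up-to-date values file to hand. One possible way to bootstrap it is to export the overrides stored with the deployed release. This is a sketch: export_release_values is a hypothetical helper name, and the release, namespace, and file path are the ones used in this post.

```shell
# export_release_values RELEASE NAMESPACE FILE
# Writes the user-supplied values stored with the deployed release to FILE.
# Without --all, helm get values returns only your overrides, not chart defaults.
export_release_values() {
  mkdir -p "$(dirname "$3")"
  helm get values "$1" -n "$2" -o yaml > "$3"
}
```

For example: export_release_values acme-web-proxy my-namespace helm/acme-web-proxy-values.yaml. Review the resulting file before committing it, since stored overrides can include sensitive values.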

Option 2: Supply the missing key with --set

If you want to keep using --reuse-values, you can backfill the missing key on the command line. Check the new chart’s values.yaml for the expected default and pass it in:

helm upgrade -n my-namespace acme-web-proxy acme/web-proxy \
  --version 1.5.0 \
  --reuse-values \
  --set server.metrics.enabled=false

This resolves the immediate error, but it is a fragile approach for ongoing upgrades. Each time a new chart version introduces a key that a template does not nil-guard, you will hit the same problem again.

Option 3: Use --reset-then-reuse-values (Helm 3.14+)

Helm 3.14 added the --reset-then-reuse-values flag. It resets to the new chart’s defaults first and then re-applies your previously stored overrides on top:

helm upgrade -n my-namespace acme-web-proxy acme/web-proxy \
  --version 1.5.0 \
  --reset-then-reuse-values

If you are on Helm 3.14 or later, this flag handles the new defaults problem without requiring you to maintain a full values file. You can check your Helm version with helm version.
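If the upgrade runs from a script or pipeline, the version check can be automated. A minimal sketch, assuming the version string has the form printed by helm version --short (for example v3.14.0+g3fc9f4b); version_at_least and supports_reset_then_reuse are hypothetical helper names.

```shell
# version_at_least VERSION MAJOR MINOR
# Succeeds if VERSION (for example "v3.14.0+g3fc9f4b") is >= MAJOR.MINOR.
version_at_least() {
  v=${1#v}            # drop the leading "v"
  major=${v%%.*}      # text before the first dot
  rest=${v#*.}        # text after the first dot
  minor=${rest%%.*}   # text before the next dot
  [ "$major" -gt "$2" ] || { [ "$major" -eq "$2" ] && [ "$minor" -ge "$3" ]; }
}

# supports_reset_then_reuse
# Checks the locally installed Helm client against 3.14.
supports_reset_then_reuse() {
  version_at_least "$(helm version --short)" 3 14
}
```

A pipeline could then choose the flag conditionally, for example: if supports_reset_then_reuse; then use --reset-then-reuse-values, otherwise fall back to passing -f with a values file.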

Why --reuse-values is risky across chart version bumps

--reuse-values was designed for cases where you want to re-apply the same set of overrides without listing them again. It works well when upgrading within the same chart version or when a new version does not introduce any new required template values.

Once a chart adds a new key and the template author accesses it without a nil guard such as {{- if .Values.server.metrics }}, any upgrade using --reuse-values will break for anyone who does not have that key in their stored values. It is partly a chart authoring problem, but you will encounter it regardless and need to know how to unblock yourself.

The most consistent approach is to treat your values file as the source of truth and always pass -f on every upgrade. Your intent is explicit, the file is reviewable in source control, and you will not get caught out when a chart adds new keys.

Regards

Follow me on Bluesky

Dean Lewis

Bridging Old Hypervisors to Cloud Native Platforms: What You’ll Learn at Cisco Live Amsterdam

Are you planning or already facing a migration from legacy hypervisors to modern, cloud native platforms? The toughest part usually isn’t compute or storage, it’s networking. If static IPs, subnet constraints, and complex topologies are slowing you down, Cisco Live Amsterdam is the place to fix that.

This year, I’m delivering two brand-new sessions focused on making VM migrations faster, safer, and more predictable using Cisco, Isovalent, and cloud native technologies like Kubernetes and KubeVirt.


1. Breakout Session: Taking Away the Network Pain from Cross-Hypervisor Migrations with Isovalent Network Bridge [BRKCLD-1713]

When: Thursday, Feb 12, 1:00 PM – 2:30 PM CET
Session Type: Breakout
Technical Level: Introductory
Technology: Observability, Cloud Native, Data Center
Content: 100% new
Session Link: View BRKCLD-1713 in the Cisco Live catalog

Migrating VMs between hypervisors and across datacenter boundaries is rarely “lift-and-shift.” Most tools help you move configurations and storage, but leave you with the hardest challenge: network accessibility, routing, and security.

This session focuses on removing that network friction using Isovalent Network Bridge, an eBPF-powered solution that enables seamless connectivity between VM and Kubernetes workloads, regardless of where they are placed or migrated to.

In this session you will learn how to:

  • Run virtual machines and containers together using cloud native hypervisors built on Kubernetes and KubeVirt.
  • Marry cloud native networking with your existing datacenter connectivity.
  • Assess migration tooling, risks, and considerations for cross-hypervisor moves.
  • Apply best practices when migrating VMs from legacy hypervisor platforms.
  • Leverage Cisco’s latest networking products to simplify VM network migration.

If you are an infrastructure, networking, or platform engineer and you want to de-risk VM moves without endless re-IP and downtime, this breakout session is for you.

→ Reserve your spot: BRKCLD-1713 – Taking Away the Network Pain from Cross-Hypervisor Migrations


2. Technical Seminar: Being Successful Moving from Legacy Hypervisors to Cloud Native Platforms with Cisco and Isovalent [TECCLD-1773]

When: Monday, Feb 9, 2:15 PM – 6:45 PM CET
Session Type: Technical Seminar
Technical Level: Introductory
Technology: Observability, Cloud Native, Data Center
Content: 100% new
Session Link: View TECCLD-1773 in the Cisco Live catalog

Note: Technical Seminars are priced in addition to your Full Conference or IT Leadership pass and can be added via the Cisco Live registration portal.

This in-depth Technical Seminar is designed for teams who want to build a clear, actionable roadmap from legacy hypervisors to cloud native platforms.

We will compare traditional hypervisors with emerging cloud native hypervisors, walk through Kubernetes networking fundamentals, and show how network and security design decisions directly impact the success of your migration strategy.

We then go deeper into Isovalent Network Bridge and how it preserves network identity to keep workloads reachable and secure, even as they move across VMware, OpenShift Virtualization, and other platforms.

By the end of this seminar, you will:

  • Understand the current state of VM migration between hypervisors and where networking complexity arises.
  • Gain working knowledge of Kubernetes networking and how it applies to virtual machines.
  • Explore strategies to minimize downtime and disruption during migrations.
  • Learn how Isovalent Network Bridge simplifies workload mobility across platforms.
  • See live technical demos of bridging datacenter networks with cloud native networking and security.

This seminar is ideal if you are responsible for VM operations, datacenter networking, or platform engineering and want a practical, end-to-end view of modernizing your stack.

→ Add this seminar to your registration: TECCLD-1773 – Being Successful Moving from Legacy Hypervisors to Cloud Native Platforms


Why You Should Attend Both Sessions

Together, these two sessions provide a complete journey:

  • TECCLD-1773 gives you the big-picture strategy, foundational knowledge, and detailed demos for moving from legacy hypervisors to cloud native platforms.
  • BRKCLD-1713 dives deep into network migration pain points and shows how to solve them with Isovalent Network Bridge and Cisco networking.

If your organization is planning a migration, modernizing your datacenter, or exploring cloud native architectures, these sessions will help you:

  • Reduce risk and downtime during migrations.
  • Avoid costly re-IP and redesign work.
  • Align network, platform, and application teams around a unified approach.

Join Me at Cisco Live Amsterdam

Cisco Live is the perfect place to learn, ask questions, and benchmark your strategy against what others in the industry are doing.

If you want to turn VM migration from a risky project into a repeatable, well-understood process, make sure you add these sessions to your schedule:

  • BRKCLD-1713: Taking Away the Network Pain from Cross-Hypervisor Migrations with Isovalent Network Bridge – View & enroll
  • TECCLD-1773: Being Successful Moving from Legacy Hypervisors to Cloud Native Platforms with Cisco and Isovalent – View & add to your registration

I’m looking forward to meeting you in Amsterdam and diving into how we can make your next migration smoother, safer, and truly cloud native.

If you’d like to stay in touch or discuss your migration plans, feel free to connect with me on LinkedIn.

How to Increase CPU & Memory Limits and Set Node Selector for Splunk Operator on Kubernetes

The Issue

When deploying a Splunk instance using the Splunk Operator on Kubernetes, the default resource limits are set to 4 CPUs and 8GB of RAM. Users often want to increase these limits to better utilize available hardware resources. Additionally, users may want to schedule the Splunk pods on a specific Kubernetes node by using a nodeSelector.

However, attempts to set nodeSelector directly in the Splunk Operator’s Custom Resource (CR) manifest result in errors, and the operator does not apply the node selection as expected. This leads to deployment failures or pods not being scheduled on the desired node.

The Cause

The root cause is that the Splunk Operator’s Custom Resource Definition (CRD) for Standalone does not support the nodeSelector field inside the spec section of the CR manifest. When you try to add nodeSelector there, Kubernetes rejects the manifest with errors like:

The request is invalid: patch: Invalid value: ...: strict decoding error: unknown field "spec.nodeSelector"

This happens because nodeSelector is not defined in the CRD schema for Standalone, so the API server rejects it as an unknown field. The Splunk Operator does not currently expose nodeSelector as a configurable field in the CR.

The Fix

To increase CPU and memory limits for your Splunk instance, update the resources section under spec in your Splunk Standalone manifest like this:

spec:
  resources:
    limits:
      cpu: "6"              # Max 6 CPUs allowed
      memory: "12Gi"        # Max 12 GiB memory allowed

This change is supported by the operator and will apply the resource limits correctly.

For node selection, since the operator does not support setting nodeSelector in the CR, you need to manually patch the StatefulSet that the operator creates. Use the following kubectl patch command to restrict the pods to run only on a specific node (replace the hostname with your target node):

kubectl patch statefulset splunk-splunk-01-standalone -n splunk --type='merge' -p='{
  "spec": {
    "template": {
      "spec": {
        "nodeSelector": {
          "kubernetes.io/hostname": "ip-10-1-1-15.us-east-2.compute.internal"
        }
      }
    }
  }
}'

This patch adds the nodeSelector to the pod template spec of the StatefulSet, ensuring pods are scheduled only on the specified node.
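To confirm the patch took effect, you can read the nodeSelector back from the StatefulSet and check where the pods landed. A small sketch, assuming the StatefulSet name and namespace from the patch command above; show_node_selector and show_pod_nodes are hypothetical helper names.

```shell
# Names taken from the patch command above; adjust for your deployment.
statefulset="splunk-splunk-01-standalone"
namespace="splunk"

# Print the nodeSelector currently set on the StatefulSet pod template.
show_node_selector() {
  kubectl get statefulset "$statefulset" -n "$namespace" \
    -o jsonpath='{.spec.template.spec.nodeSelector}'
}

# List the pods in the namespace along with the node each one runs on.
show_pod_nodes() {
  kubectl get pods -n "$namespace" -o wide
}
```

Run show_node_selector right after patching, then show_pod_nodes once the pods have restarted. Because the StatefulSet is managed by the operator, it may be worth re-running the check after any later change to the CR.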

 

Cilium Network Policies, from first principles to production

This post teaches the Cilium policy model with clear scenarios and annotated YAML. It matches the style of practical technical blogs, explanation first and code second, with links to the official docs where you will want deeper detail.

Why Cilium policy

Kubernetes’ built-in NetworkPolicy objects define which pods can communicate using label-based rules at the IP and port level. This provides basic isolation, but it stops short of deeper visibility or intent-based control.

Cilium builds on this foundation by introducing security identities derived from labels. These identities represent workloads consistently across nodes and are enforced directly in the kernel using eBPF. Because enforcement happens in the datapath, policies remain accurate and efficient even as workloads scale or IPs change.

Beyond IP and port filtering, Cilium understands application context such as DNS names and HTTP methods and paths. This makes it possible to express policies in human terms — for example, “allow only GET requests on /health from pods with role=frontend” or “allow egress only to api.partner.com.”

Together, these capabilities create a single, consistent model for enforcing and observing network behavior across all workloads. This post walks through that model step by step, with practical YAML examples you can apply to your own environment.

For further reading, see the official Cilium policy overview for the complete language reference and selector options.

Mental model

Every policy answers four questions: where to enforce, which direction to guard, who may talk, and whether to apply checks at the application layer.

These are the top things to keep in mind when defining a Cilium Network Policy.

  • Subject, choose pods with endpointSelector or nodes with nodeSelector
  • Direction, if a selected subject has an ingress list then ingress becomes default deny for that subject, the same idea applies for egress
  • Peers, choose with fromEndpoints, toEndpoints, toEntities, toCIDRSet, toFQDNs, toServices
  • Application layer, add an optional rules block under toPorts for HTTP or DNS

Language details are in the Cilium policy language.
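The four parts can be seen together in one small policy. The sketch below writes an example CiliumNetworkPolicy manifest that allows only GET requests on /health from pods labelled role=frontend. The policy name, the default namespace, the app=backend label, and port 8080 are assumptions for illustration, and write_health_policy is a hypothetical helper name.

```shell
# write_health_policy FILE
# Writes an example CiliumNetworkPolicy manifest to FILE.
write_health_policy() {
  cat > "$1" <<'EOF'
apiVersion: cilium.io/v2
kind: CiliumNetworkPolicy
metadata:
  name: allow-frontend-health   # hypothetical name
  namespace: default            # assumption
spec:
  # Subject: the pods this policy is enforced on.
  endpointSelector:
    matchLabels:
      app: backend              # assumption
  # Direction: an ingress list makes ingress default deny for the subject.
  ingress:
    # Peers: only pods labelled role=frontend.
    - fromEndpoints:
        - matchLabels:
            role: frontend
      toPorts:
        - ports:
            - port: "8080"      # assumption
              protocol: TCP
          # Application layer: only GET /health is allowed.
          rules:
            http:
              - method: "GET"
                path: "/health"
EOF
}
```

To try it, run write_health_policy policy.yaml and then kubectl apply -f policy.yaml against a test cluster, adjusting the labels and port to match your workloads.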

We typically refer to the security policies implemented in Cilium holistically as “Cilium Network Policy”. However, when you dive into using them in your platform, you will find there are in fact two types of policy configuration to be aware of. Most of the information in this post is true for both types, but keep the following in mind:

  • CiliumNetworkPolicy (CNP) is the namespaced policy object you apply to control traffic for pods within a single namespace.
  • CiliumClusterwideNetworkPolicy (CCNP) is the cluster-scoped version. It uses the same language and selectors but applies across all namespaces, which is useful for node policies, global DNS interception, or rules that span multiple teams.

What a Cilium endpoint is

Every pod (and any process that Cilium manages traffic for) is represented inside Cilium as an endpoint. An endpoint is essentially Cilium’s view of a workload: its labels, Security Identity, policies, and network state.

When you write a policy with an endpointSelector, you’re telling Cilium “apply this rule to the endpoints whose labels match this selector.” Cilium uses that to program the eBPF datapath on the node where each endpoint lives.

You can see endpoints on a node with:

kubectl -n kube-system exec -ti ds/cilium -- cilium endpoint list

Each row in the table is one Cilium endpoint. An endpoint represents a pod or other workload that Cilium is managing on that node. The columns tell you at a glance what Cilium knows about it and how policies apply.

  • ENDPOINT: This is Cilium’s internal endpoint ID on that node. In this case, 24. You’ll use this ID if you run cilium endpoint get 24 for detailed info.

  • POLICY (ingress / egress): Shows whether policy enforcement is active on this endpoint. “Disabled” means there are no Cilium policies selecting it yet for that direction, so all traffic is allowed. Once you create a CiliumNetworkPolicy with an ingress or egress section matching this endpoint, this field will flip to “Enabled”.

  • IDENTITY: The numeric Security Identity assigned to the set of identity-relevant labels for this endpoint (69014 here). Cilium uses this number in the datapath to represent the workload.

  • LABELS (source:key=value): The full list of labels that Cilium knows for this endpoint. The prefix shows where the label came from (k8s: means Kubernetes label). These are the labels you match on in your policies. In the example, it includes app=minio, the namespace, the service account, and some Helm-related labels.

  • IPv4 / IPv6: The IP addresses currently assigned to that pod. Notice you never use them directly in your policies; Cilium maps them to the Security Identity automatically. Note: a Cilium Network Policy can also filter on CIDR ranges, but this is not recommended for pod traffic inside the cluster.

  • STATUS: Shows the endpoint’s state from Cilium’s perspective (“ready” means it’s healthy and being managed).

ENDPOINT   POLICY (ingress)   POLICY (egress)   IDENTITY   LABELS (source:key[=value])                                                                IPv6   IPv4         STATUS   
           ENFORCEMENT        ENFORCEMENT                                                                                                                                 
24         Disabled           Disabled          69014      k8s:app.kubernetes.io/managed-by=Helm                                                             10.0.0.247   ready   
                                                           k8s:app=minio                                                                                                          
                                                           k8s:io.cilium.k8s.namespace.labels.kubernetes.io/metadata.name=minio                                                   
                                                           k8s:io.cilium.k8s.policy.cluster=kind                                                                                  
                                                           k8s:io.cilium.k8s.policy.serviceaccount=quickstart-sa                                                                  
                                                           k8s:io.kubernetes.pod.namespace=minio                                                                                  
                                                           k8s:v1.min.io/console=quickstart-console                                                                               
                                                           k8s:v1.min.io/pool=ss-0

This view is invaluable when troubleshooting, and we’ll cover this towards the end of the blog post.

Labels and Security Identity

Continue reading Cilium Network Policies, from first principles to production

Fixing “Kubernetes configuration file is group-readable or world-readable” warnings

The Issue

When using kubectl or oc you may see warnings that your Kubernetes configuration file is readable by group or by everyone.

WARNING: Kubernetes configuration file is group-readable. This is insecure. Location: /home/user/cluster/admin-kubeconfig
WARNING: Kubernetes configuration file is world-readable. This is insecure. Location: /home/user/cluster/admin-kubeconfig

The Cause

The kubeconfig file has permissions that allow access for group or others. The tools expect your kubeconfig to be readable and writable only by your user.

You can confirm this with a long listing. If you see read permission for group or others, the file is too open.

ls -l /home/user/cluster/admin-kubeconfig
-rw-r--r--  1 user  staff   12345  Sep  3 14:05 /home/user/cluster/admin-kubeconfig
# ^ group and others have read access

The Fix

  1. Restrict the file permissions so only your user can read and write it.
    chmod 600 /home/user/cluster/admin-kubeconfig
  2. Optionally restrict the directory that holds the file.
    chmod 700 /home/user/cluster
  3. Verify the new permissions. The output should show owner read and write only.
    ls -l /home/user/cluster/admin-kubeconfig
    -rw-------  1 user  staff   12345  Sep  3 14:05 /home/user/cluster/admin-kubeconfig
    
  4. Consider moving the kubeconfig into your home configuration folder for easier use, then point your tools at it.
    mkdir -p ~/.kube
    mv /home/user/cluster/admin-kubeconfig ~/.kube/admin-kubeconfig
    export KUBECONFIG=~/.kube/admin-kubeconfig
    

    If you work with several kubeconfigs, you can list them in the KUBECONFIG environment variable, separated by colons.

    export KUBECONFIG=~/.kube/admin-kubeconfig:~/.kube/other.kubeconfig
  5. Keep your kubeconfig private. Do not share it, and do not commit it to a source control system.
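If you manage several kubeconfig files, the check and fix can be scripted. A minimal sketch, assuming a colon-separated list of paths in the same form as KUBECONFIG; too_open and secure_kubeconfigs are hypothetical helper names.

```shell
# too_open FILE
# Succeeds if group or others have read permission on FILE.
too_open() {
  perms=$(ls -ld "$1" | cut -c1-10)            # for example -rw-r--r--
  group_read=$(printf '%s' "$perms" | cut -c5) # group read bit
  other_read=$(printf '%s' "$perms" | cut -c8) # others read bit
  [ "$group_read" = "r" ] || [ "$other_read" = "r" ]
}

# secure_kubeconfigs PATHS
# PATHS is a colon-separated list, in the same form as KUBECONFIG.
# Runs chmod 600 on each existing file that is too open and reports it.
secure_kubeconfigs() {
  old_ifs=$IFS
  IFS=:
  for f in $1; do
    if [ -f "$f" ] && too_open "$f"; then
      chmod 600 "$f"
      echo "fixed: $f"
    fi
  done
  IFS=$old_ifs
}
```

For example, secure_kubeconfigs "$KUBECONFIG" tightens every file currently listed in the variable and prints one line per file it changed. Files that are missing or already private are left alone.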

Regards


Follow me on Bluesky

Dean Lewis