Tag Archives: Hubble

Cilium Network Policies, from first principles to production

This post teaches the Cilium policy model with clear scenarios and annotated YAML. It matches the style of practical technical blogs, explanation first and code second, with links to the official docs where you will want deeper detail.

Why Cilium policy

Kubernetes’ built-in NetworkPolicy objects define which pods can communicate using label-based rules at the IP and port level. This provides basic isolation, but it stops short of deeper visibility or intent-based control.

Cilium builds on this foundation by introducing security identities derived from labels. These identities represent workloads consistently across nodes and are enforced directly in the kernel using eBPF. Because enforcement happens in the datapath, policies remain accurate and efficient even as workloads scale or IPs change.

Beyond IP and port filtering, Cilium understands application context such as DNS names and HTTP methods and paths. This makes it possible to express policies in human terms — for example, “allow only GET requests on /health from pods with role=frontend” or “allow egress only to api.partner.com.”

Together, these capabilities create a single, consistent model for enforcing and observing network behavior across all workloads. This post walks through that model step by step, with practical YAML examples you can apply to your own environment.

For further reading, see the official Cilium policy overview for the complete language reference and selector options.

Mental model

Every policy answers four questions: where to enforce, which direction to guard, who may talk, and whether to apply checks at the application layer.

These are the top things to keep in mind when defining a Cilium Network Policy.

  • Subject, choose pods with endpointSelector or nodes with nodeSelector
  • Direction, if a selected subject has an ingress list then ingress becomes default deny for that subject, the same idea applies for egress
  • Peers, choose with fromEndpoints, toEndpoints, toEntities, toCIDRSet, toFQDNs, toServices
  • Application layer, add optional rules: under toPorts for HTTP or DNS

Language details are in the Cilium policy language.
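
To make the four parts of the mental model concrete, here is a minimal sketch that builds the earlier “allow only GET requests on /health from pods with role=frontend” rule as a CiliumNetworkPolicy manifest. The policy name, namespace, app label, and port are hypothetical values chosen purely for illustration; adjust them to your own workloads.

```python
import json


def build_health_policy(namespace="demo", app="backend", port="8080"):
    """Return a CiliumNetworkPolicy manifest as a plain dict.

    The namespace, labels, policy name, and port are hypothetical values
    chosen for illustration only.
    """
    return {
        "apiVersion": "cilium.io/v2",
        "kind": "CiliumNetworkPolicy",
        "metadata": {"name": "allow-frontend-health", "namespace": namespace},
        "spec": {
            # Subject: the pods this policy is enforced on.
            "endpointSelector": {"matchLabels": {"app": app}},
            # Direction: an ingress list makes ingress default deny for the subject.
            "ingress": [
                {
                    # Peers: only pods labelled role=frontend may connect.
                    "fromEndpoints": [{"matchLabels": {"role": "frontend"}}],
                    "toPorts": [
                        {
                            "ports": [{"port": port, "protocol": "TCP"}],
                            # Application layer: only GET /health is allowed.
                            "rules": {"http": [{"method": "GET", "path": "/health"}]},
                        }
                    ],
                }
            ],
        },
    }


if __name__ == "__main__":
    print(json.dumps(build_health_policy(), indent=2))
```

Since kubectl accepts JSON manifests as well as YAML, the printed output can be saved to a file and applied with kubectl apply -f, which is a convenient way to generate similar policies for several workloads.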

We typically refer to the security policies implemented in Cilium holistically as “Cilium Network Policy”. However, when you dive into using them in your platform, you will find there are in fact two types of policy configuration to be aware of. Most of the information in this post is true for both types, but keep the following in mind:

  • CiliumNetworkPolicy (CNP) is the namespaced policy object you apply to control traffic for pods within a single namespace.
  • CiliumClusterwideNetworkPolicy (CCNP) is the cluster-scoped version. It uses the same language and selectors but applies across all namespaces, which is useful for node policies, global DNS interception, or rules that span multiple teams.

What a Cilium endpoint is

Every pod (and any process that Cilium manages traffic for) is represented inside Cilium as an endpoint. An endpoint is essentially Cilium’s view of a workload: its labels, Security Identity, policies, and network state.

When you write a policy with an endpointSelector, you’re telling Cilium “apply this rule to the endpoints whose labels match this selector.” Cilium uses that to program the eBPF datapath on the node where each endpoint lives.

You can see endpoints on a node with:

kubectl -n kube-system exec -ti ds/cilium -- cilium endpoint list

Each row in the table (see the example output further down) is one Cilium endpoint. An endpoint represents a pod or other workload that Cilium is managing on that node. The columns tell you at a glance what Cilium knows about it and how policies apply.

  • ENDPOINT: This is Cilium’s internal endpoint ID on that node, 24 in the example output below. You’ll use this ID if you run cilium endpoint get 24 for detailed info.

  • POLICY (ingress / egress): Shows whether policy enforcement is active on this endpoint. “Disabled” means there are no Cilium policies selecting it yet for that direction, so all traffic is allowed. Once you create a CiliumNetworkPolicy with an ingress or egress section matching this endpoint, this field will flip to “Enabled”.

  • IDENTITY: The numeric Security Identity assigned to the set of identity-relevant labels for this endpoint (69014 here). Cilium uses this number in the datapath to represent the workload.

  • LABELS (source:key=value): The full list of labels that Cilium knows for this endpoint. The prefix shows where the label came from (k8s: means Kubernetes label). These are the labels you match on in your policies. In the example, it includes app=minio, the namespace, the service account, and some Helm-related labels.

  • IPv4 / IPv6: The IP addresses currently assigned to that pod. Notice you never use them directly in your policies; Cilium maps them to the Security Identity automatically. Note: a Cilium Network Policy can also filter on CIDR ranges, but this is not recommended for pod traffic inside the cluster.

  • STATUS: Shows the endpoint’s state from Cilium’s perspective (“ready” means it’s healthy and being managed).

ENDPOINT   POLICY (ingress)   POLICY (egress)   IDENTITY   LABELS (source:key[=value])                                                                IPv6   IPv4         STATUS   
           ENFORCEMENT        ENFORCEMENT                                                                                                                                 
24         Disabled           Disabled          69014      k8s:app.kubernetes.io/managed-by=Helm                                                             10.0.0.247   ready   
                                                           k8s:app=minio                                                                                                          
                                                           k8s:io.cilium.k8s.namespace.labels.kubernetes.io/metadata.name=minio                                                   
                                                           k8s:io.cilium.k8s.policy.cluster=kind                                                                                  
                                                           k8s:io.cilium.k8s.policy.serviceaccount=quickstart-sa                                                                  
                                                           k8s:io.kubernetes.pod.namespace=minio                                                                                  
                                                           k8s:v1.min.io/console=quickstart-console                                                                               
                                                           k8s:v1.min.io/pool=ss-0

This view is invaluable when troubleshooting, and we’ll cover this towards the end of the blog post.

Labels and Security Identity

Continue reading Cilium Network Policies, from first principles to production

Understanding cilium_event_type when using Cilium & Hubble

The Issue

In a platform deployed with Cilium, when you use Hubble either to view the full JSON output or to configure which events are captured using the allowlist or denylist, you may have seen a field called event_type, which holds an integer.

Below is an example allowlist using “event_type” to define which flows are captured. When I first saw this, I was confused: where do these numbers come from, and how do I map them back to a friendly name that I understand?

allowlist:
- '{"source_pod":["kube-system/"],"event_type":[{"type":1}]}'
- '{"destination_pod":["kube-system/"],"event_type":[{"type":1}]}'
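
Each allowlist entry is a JSON object serialized into a single string, which is easy to get wrong when typing by hand. As a rough sketch, the entries above could be generated like this; the helper names hubble_filter and event_types are hypothetical and not part of any Hubble tooling.

```python
import json


def hubble_filter(**fields):
    """Encode one flow filter as the compact JSON string used in the lists above."""
    return json.dumps(fields, separators=(",", ":"))


def event_types(*type_numbers):
    """Build the event_type value: a list of {"type": N} objects."""
    return [{"type": n} for n in type_numbers]


if __name__ == "__main__":
    allowlist = [
        hubble_filter(source_pod=["kube-system/"], event_type=event_types(1)),
        hubble_filter(destination_pod=["kube-system/"], event_type=event_types(1)),
    ]
    # Print the entries in the same YAML list form as the example above.
    for entry in allowlist:
        print(f"- '{entry}'")
```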

Example Hubble Dynamic Exporter configuration:

hubble:
  export:
    dynamic:
      enabled: true
      config:
        enabled: true
        content:
        - name: "test001"
          filePath: "/var/run/cilium/hubble/test001.log"
          fieldMask: []
          includeFilters: []
          excludeFilters: []
          end: "2023-10-09T23:59:59-07:00"
        - name: "test002"
          filePath: "/var/run/cilium/hubble/test002.log"
          fieldMask: ["source.namespace", "source.pod_name", "destination.namespace", "destination.pod_name", "verdict"]
          includeFilters:
          - source_pod: ["default/"]
            event_type:
            - type: 1
          - destination_pod: ["frontend/webserver-975996d4c-7hhgt"]

And finally, a Hubble flow in full JSON output, with the event_type showing towards the end of the output:

{
  "flow": {
    "time": "2024-07-08T10:09:24.173232166Z",
    "uuid": "755b0203-d456-452d-b399-4fa136cdb4fd",
    "verdict": "FORWARDED",
    "ethernet": {
      "source": "06:29:73:4e:0a:c5",
      "destination": "26:50:d8:4a:94:d2"
    },
    "IP": {
      "source": "10.0.2.163",
      "destination": "130.211.198.204",
      "ipVersion": "IPv4"
    },
    "l4": {
      "TCP": {
        "source_port": 37736,
        "destination_port": 443,
        "flags": {
          "PSH": true,
          "ACK": true
        }
      }
    },
    "source": {
      "ID": 2045,
      "identity": 14398,
      "namespace": "endor",
      "labels": [
        "k8s:app.kubernetes.io/name=tiefighter"
      ],
      "pod_name": "tiefighter-6b56bdc869-2t6wn",
      "workloads": [
        {
          "name": "tiefighter",
          "kind": "Deployment"
        }
      ]
    },
    "destination": {
      "identity": 16777217,
      "labels": [
        "cidr:130.211.198.204/32",
        "reserved:world"
      ]
    },
    "Type": "L3_L4",
    "node_name": "kind-worker",
    "destination_names": [
      "disney.com"
    ],
    "event_type": {
      "type": 4,
      "sub_type": 3
    },
    "traffic_direction": "EGRESS",
    "trace_observation_point": "TO_STACK",
    "is_reply": false,
    "Summary": "TCP Flags: ACK, PSH"
  },
  "node_name": "kind-worker",
  "time": "2024-07-08T10:09:24.173232166Z"
}

The Explanation

Cilium event types are defined in this Go package. The first constant is assigned iota, which starts at 0 and increments by one for each subsequent constant, so drop = 1, debug = 2, etc.

const (
	// 0-128 are reserved for BPF datapath events
	MessageTypeUnspec = iota

	// MessageTypeDrop is a BPF datapath notification carrying a DropNotify
	// which corresponds to drop_notify defined in bpf/lib/drop.h
	MessageTypeDrop

	// MessageTypeDebug is a BPF datapath notification carrying a DebugMsg
	// which corresponds to debug_msg defined in bpf/lib/dbg.h
	MessageTypeDebug

	// MessageTypeCapture is a BPF datapath notification carrying a DebugCapture
	// which corresponds to debug_capture_msg defined in bpf/lib/dbg.h
	MessageTypeCapture

	// MessageTypeTrace is a BPF datapath notification carrying a TraceNotify
	// which corresponds to trace_notify defined in bpf/lib/trace.h
	MessageTypeTrace

	// MessageTypePolicyVerdict is a BPF datapath notification carrying a PolicyVerdictNotify
	// which corresponds to policy_verdict_notify defined in bpf/lib/policy_log.h
	MessageTypePolicyVerdict

	// MessageTypeRecCapture is a BPF datapath notification carrying a RecorderCapture
	// which corresponds to capture_msg defined in bpf/lib/pcap.h
	MessageTypeRecCapture

	// MessageTypeTraceSock is a BPF datapath notification carrying a TraceNotifySock
	// which corresponds to trace_sock_notify defined in bpf/lib/trace_sock.h
	MessageTypeTraceSock

	// 129-255 are reserved for agent level events

	// MessageTypeAccessLog contains a pkg/proxy/accesslog.LogRecord
	MessageTypeAccessLog = 129

	// MessageTypeAgent is an agent notification carrying a AgentNotify
	MessageTypeAgent = 130
)

const (
	MessageTypeNameDrop          = "drop"
	MessageTypeNameDebug         = "debug"
	MessageTypeNameCapture       = "capture"
	MessageTypeNameTrace         = "trace"
	MessageTypeNameL7            = "l7"
	MessageTypeNameAgent         = "agent"
	MessageTypeNamePolicyVerdict = "policy-verdict"
	MessageTypeNameRecCapture    = "recorder"
	MessageTypeNameTraceSock     = "trace-sock"
)

Therefore, in the JSON output above (the last example), event type 4 is trace. This particular event type also has a sub_type, as you can see in the Hubble CLI help output below. You can see the definitions in the Go package here.

  -t, --type filter                         Filter by event types TYPE[:SUBTYPE]. Available types and subtypes:
                                            TYPE             SUBTYPE
                                            capture          n/a
                                            drop             n/a
                                            l7               n/a
                                            policy-verdict   n/a
                                            trace            from-endpoint
                                                             from-host
                                                             from-network
                                                             from-overlay
                                                             from-proxy
                                                             from-stack
                                                             to-endpoint
                                                             to-host
                                                             to-network
                                                             to-overlay
                                                             to-proxy
                                                             to-stack
                                            trace-sock       n/a
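
If you would rather not count constants by hand, a small lookup table does the job. The sketch below mirrors the numbering from the Go constants above and applies it to a trimmed-down flow. Pairing 129 with the name "l7" is my assumption (that is the access log event type), and the function names are hypothetical.

```python
import json

# Numeric values follow the iota block above; the pairing of 129 with "l7"
# is an assumption (MessageTypeAccessLog carries the L7 access log records).
EVENT_TYPE_NAMES = {
    1: "drop",
    2: "debug",
    3: "capture",
    4: "trace",
    5: "policy-verdict",
    6: "recorder",
    7: "trace-sock",
    129: "l7",
    130: "agent",
}


def event_type_name(type_number):
    """Return the friendly name for a numeric event type."""
    return EVENT_TYPE_NAMES.get(type_number, f"unknown({type_number})")


def describe_flow(flow_json):
    """Summarise the event type of one flow taken from Hubble's JSON output."""
    flow = json.loads(flow_json)["flow"]
    event = flow.get("event_type", {})
    name = event_type_name(event.get("type", 0))
    # For trace events the observation point is already spelled out in the flow.
    point = flow.get("trace_observation_point")
    return f"{name} ({point})" if point else name


if __name__ == "__main__":
    sample = (
        '{"flow": {"event_type": {"type": 4, "sub_type": 3}, '
        '"trace_observation_point": "TO_STACK"}}'
    )
    print(describe_flow(sample))  # prints: trace (TO_STACK)
```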

I hope this helps!

Regards

Dean Lewis

Cilium Hubble CLI – Configure Auto Completion

One of my little nits is when I start typing a command in the terminal, press the Tab key, and autocomplete doesn’t work. It feels like it should be the default out of the box.

You can configure autocompletion for the Cilium Hubble CLI, but it’s not documented in the docs.cilium.io pages yet, so I thought I’d throw up a quick post adding it here!

This command pushes the auto-complete config into my zsh config on macOS.

hubble completion zsh > "$(brew --prefix)/share/zsh/site-functions/_hubble"

For other platforms, you can see the examples provided for Cilium Agent, and apply the logic to your own environment.

Cilium Hubble CLI - Autocompletion

Regards

Dean Lewis

Cilium Hubble CLI – Using a local configuration file

Did you know that the Cilium Hubble CLI supports using a configuration file?

Below is an example command where Isovalent Enterprise for Cilium is deployed and Hubble RBAC is configured. Therefore, I must provide additional details such as the server location and certificates to authenticate using the CLI. The steps in this blog post also work with Cilium OSS, which is especially handy when setting allow and deny lists to prune the information returned.

Typing all of these flags for every command you want to run can become cumbersome.

❯ hubble observe \
--server tls://localhost:4245 \
--tls-ca-cert-files ca-cert.pem \
--tls-server-name 'cli.hubble-relay.cilium.io' \
--namespace kube-system
Mar 20 13:09:38.061: tenant-jobs/resumes-58c6678bc8-5nkcg:36459 (ID:88195) -> kube-system/coredns-77fcb74c4c-wfw4f:53 (ID:102385) policy-verdict:L3-L4 EGRESS ALLOWED (UDP)
Mar 20 13:09:38.061: tenant-jobs/resumes-58c6678bc8-5nkcg:36459 (ID:88195) -> kube-system/coredns-77fcb74c4c-wfw4f:53 (ID:102385) to-proxy FORWARDED (UDP)
Mar 20 13:09:38.062: tenant-jobs/resumes-58c6678bc8-5nkcg (ID:88195) <> kube-system/coredns-77fcb74c4c-wfw4f:53 (ID:102385) post-xlate-fwd TRANSLATED (UDP)
Mar 20 13:09:38.062: tenant-jobs/resumes-58c6678bc8-5nkcg:36459 (ID:88195) -> kube-system/coredns-77fcb74c4c-wfw4f:53 (ID:102385) dns-request proxy FORWARDED (DNS Query coreapi.tenant-jobs.svc.cluster.local. A)

Below you can see the various configuration options that the Hubble CLI supports. The above example is using flags as part of the command.

hubble config -h
Config allows to modify or view the hubble configuration. Global hubble options
can be set via flags, environment variables or a configuration file. The
following precedence order is used:

1. Flag
2. Environment variable
3. Configuration file
4. Default value

The "config view" subcommand provides a merged view of the configuration. The
"config set" and "config reset" subcommand modify values in the configuration
file.

Environment variable names start with HUBBLE_ followed by the flag name
capitalized where eventual dashes ('-') are replaced by underscores ('_').
For example, the environment variable that corresponds to the "--server" flag
is HUBBLE_SERVER. The environment variable for "--tls-allow-insecure" is
HUBBLE_TLS_ALLOW_INSECURE and so on.

Usage:
  hubble config [flags]
  hubble config [command]

Available Commands:
  get         Get an individual value in the hubble config file
  reset       Reset all or an individual value in the hubble config file
  set         Set an individual value in the hubble config file
  view        Display merged configuration settings
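
The naming rule and precedence order described in the help text can be sketched in a few lines. This is only an illustration of the documented behaviour, not how the Hubble CLI is actually implemented; the function names and the placeholder values are hypothetical.

```python
def flag_to_env_var(flag):
    """Map a CLI flag such as "--tls-allow-insecure" to its environment variable."""
    return "HUBBLE_" + flag.lstrip("-").upper().replace("-", "_")


def resolve_setting(flag, cli_flags, environ, config_file, defaults):
    """Pick a value in the order: flag, environment variable, config file, default."""
    key = flag.lstrip("-")
    if key in cli_flags:
        return cli_flags[key]
    env_name = flag_to_env_var(flag)
    if env_name in environ:
        return environ[env_name]
    if key in config_file:
        return config_file[key]
    return defaults.get(key)


if __name__ == "__main__":
    print(flag_to_env_var("--server"))
    print(flag_to_env_var("--tls-allow-insecure"))
    # Placeholder values: the environment variable wins over the config file
    # because no flag was passed on the command line.
    print(resolve_setting(
        "--server",
        cli_flags={},
        environ={"HUBBLE_SERVER": "tls://from-env:4245"},
        config_file={"server": "tls://from-config:4245"},
        defaults={"server": "placeholder-default:4245"},
    ))
```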

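Using the commands below, I can set the flags as values in the configuration file. The key is the flag name without the leading dashes, for example server for --server. The HUBBLE_ prefix applies only to environment variables; as the first command shows, using it as a config key returns an unknown key error.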

❯ hubble config set HUBBLE_SERVER tls://localhost:4245
unknown key: HUBBLE_SERVER
❯ hubble config set server tls://localhost:4245
❯ hubble config set tls-ca-cert-files ca-cert.pem

❯ hubble config set tls-server-name 'cli.hubble-relay.cilium.io'

Now we can use the Hubble CLI without the additional flags.

❯ hubble observe -n tenant-jobs
Mar 20 13:13:30.004: tenant-jobs/coreapi-6748664db6-rmr2j:42935 (ID:111121) <- kube-system/coredns-77fcb74c4c-wfw4f:53 (ID:102385) to-endpoint FORWARDED (UDP)
Mar 20 13:13:30.496: tenant-jobs/crawler-6dbf4f8b5d-vr7gr:47804 (ID:71705) -> tenant-jobs/loader-68544b8b87-zrxwt:50051 (ID:115137) http-request FORWARDED (HTTP/2 POST http://loader:50051/loader.Loader/LoadCv)
Mar 20 13:13:30.505: tenant-jobs/crawler-6dbf4f8b5d-vr7gr:47804 (ID:71705) <- tenant-jobs/loader-68544b8b87-zrxwt:50051 (ID:115137) http-response FORWARDED (HTTP/2 200 9ms (POST http://loader:50051/loader.Loader/LoadCv))

We can validate the configuration in use by running the below command, which also confirms the location of the config file itself, which you can edit directly.

❯ hubble config view
allowlist: []
client-id: ""
client-secret: ""
config: /Users/veducate/Library/Application Support/hubble/config.yaml
debug: false
denylist: []
grant-type: auto
issuer: ""
issuer-ca: ""
refresh: false
scopes: []
server: tls://localhost:4245
timeout: 5s
tls: false
tls-allow-insecure: false
tls-ca-cert-files:
- ca-cert.pem
tls-client-cert-file: ""
tls-client-key-file: ""
tls-server-name: cli.hubble-relay.cilium.io
token-file

Regards

Dean Lewis

How to Deploy a Tanzu Kubernetes Grid cluster using the Cilium CNI

In this blog post I’m going to dive into how you can create a Tanzu Kubernetes Grid cluster and specify your own container network interface, for example, Cilium. Expanding on the installation, I’ll also cover installing a load balancer service, deploying a demo app, and showing some of the observability features as well.

What is Cilium?
Cilium is an open source software for providing, securing and observing network connectivity between container workloads - cloud native, and fueled by the revolutionary Kernel technology eBPF

Let’s unpack that marketing tagline from the official website.

Cilium is a container network interface for Kubernetes and other container platforms (apparently there are others still out there!), which provides the cluster networking functionality. It goes one step further than other CNIs commonly used, by using a Linux Kernel software technology called eBPF, and allows for the insertion of security, visibility, and networking control logic into the Linux kernel of your container nodes.

Below is a high-level overview of the features.

TKG Cilium - Features

And a high-level architecture overview.

Cilium Architecture

Is it supported to run Cilium in a Tanzu Kubernetes cluster?

Tanzu Kubernetes Grid allows you to bring your own Kubernetes CNI to the cluster as part of the Cluster bring-up. You will be required to take extra steps to build a cluster during this type of deployment, as described below in this blog post.

As for support for a CNI outside of Calico and Antrea, you as the customer/consumer must provide that. If you are using Cilium for example, then you can gain enterprise level support for the CNI, from the likes of Isovalent.

Recording

How to deploy a Tanzu Kubernetes Cluster with Cilium

Before we get started, we need to download the Cilium CLI tool, which is used to install Cilium into our cluster.

The below command downloads and installs the latest stable version to your /usr/local/bin location. You can find more options here.

Continue reading How to Deploy a Tanzu Kubernetes Grid cluster using the Cilium CNI