Kubernetes Scheduling: nodeSelector vs nodeAffinity

When deploying workloads in Kubernetes, controlling where your pods land is crucial. Two primary mechanisms facilitate this: nodeSelector and nodeAffinity. While they might seem similar at first glance, they serve different purposes and offer varying degrees of flexibility.

The Basics: nodeSelector

The nodeSelector is the simplest way to constrain pods to specific nodes. It matches pods to nodes based on key-value pairs. For instance:

spec:
  nodeSelector:
    disktype: ssd

This configuration ensures that the pod is scheduled only on nodes labeled with disktype=ssd.
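For context, here is a sketch of how that fragment fits into a complete Pod manifest (the pod name and image are illustrative). The node itself must carry the label first, which you can apply with `kubectl label nodes <node-name> disktype=ssd`:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: ssd-pod        # illustrative name
spec:
  containers:
  - name: app
    image: nginx       # illustrative image
  nodeSelector:
    disktype: ssd      # pod is only scheduled on nodes labeled disktype=ssd
```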

However, nodeSelector has its limitations. It supports only exact equality matches: there are no operators, and you cannot express multiple acceptable values (an OR) for a single key. If you attempt to specify multiple values for the same key, like so:

nodeSelector:
  topology.kubernetes.io/zone: us-east-1a
  topology.kubernetes.io/zone: us-east-1b

Only the last key-value pair survives, effectively ignoring the previous ones. This behavior stems from the fact that YAML maps require unique keys, and most parsers silently keep the final value rather than merging or rejecting the duplicates.
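Note that nodeSelector does support multiple different keys; they are simply ANDed together. A sketch combining two distinct labels, both of which a node must carry:

```yaml
nodeSelector:
  disktype: ssd                              # node must have this label
  topology.kubernetes.io/zone: us-east-1a    # AND this one
```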

Enter nodeAffinity

For more granular control, nodeAffinity comes into play. It allows you to define rules using operators such as In, NotIn, Exists, DoesNotExist, Gt, and Lt. This flexibility enables you to match pods to nodes based on a range of criteria.

Suppose you want to schedule a pod on nodes in either us-east-1a or us-east-1b. Here’s how you’d achieve that with nodeAffinity:

affinity:
  nodeAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
      nodeSelectorTerms:
      - matchExpressions:
        - key: topology.kubernetes.io/zone
          operator: In
          values:
          - us-east-1a
          - us-east-1b

This configuration ensures that the pod is scheduled only on nodes in the specified zones.
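The other operators compose the same way. As an illustrative sketch (the gpu label key is hypothetical), the following requires nodes that carry a gpu label while avoiding one zone; matchExpressions within a single term are ANDed, while multiple nodeSelectorTerms would be ORed:

```yaml
affinity:
  nodeAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
      nodeSelectorTerms:
      - matchExpressions:
        - key: gpu                            # hypothetical label key
          operator: Exists                    # label must be present, any value
        - key: topology.kubernetes.io/zone
          operator: NotIn                     # exclude this zone
          values:
          - us-east-1c
```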

Moreover, nodeAffinity supports both hard and soft constraints:

  • Hard constraints (requiredDuringSchedulingIgnoredDuringExecution): The scheduler must satisfy these rules, or the pod is not scheduled.
  • Soft constraints (preferredDuringSchedulingIgnoredDuringExecution): The scheduler tries to satisfy these rules but can relax them if no matching node is available.

In both cases, the IgnoredDuringExecution suffix means the rules are evaluated only at scheduling time: if a node's labels change after a pod is running, the pod is not evicted.

Examples of Hard and Soft Constraints

Hard Constraint Example (requiredDuringSchedulingIgnoredDuringExecution):

The required block shown earlier is a hard constraint: the pod must be scheduled on a node in either us-east-1a or us-east-1b. If no node matches, the pod remains in the Pending state until one does.

Soft Constraint Example (preferredDuringSchedulingIgnoredDuringExecution):

This configuration expresses a preference for nodes in either us-east-1a or us-east-1b, but it is not mandatory: the pod is still scheduled even if no matching nodes are available. The weight field (1 to 100) determines how strongly each preference counts when the scheduler ranks candidate nodes.

affinity:
  nodeAffinity:
    preferredDuringSchedulingIgnoredDuringExecution:
    - weight: 1
      preference:
        matchExpressions:
        - key: topology.kubernetes.io/zone
          operator: In
          values:
          - us-east-1a
          - us-east-1b

Use hard constraints when placement is critical for functionality or compliance. Use soft constraints when you want to guide the scheduler but still allow flexibility in pod placement.
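The two can also be combined in a single spec. As a sketch, the following requires one of the two zones and, among the nodes that qualify, prefers those labeled with ssd disks:

```yaml
affinity:
  nodeAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:   # hard: zone is mandatory
      nodeSelectorTerms:
      - matchExpressions:
        - key: topology.kubernetes.io/zone
          operator: In
          values:
          - us-east-1a
          - us-east-1b
    preferredDuringSchedulingIgnoredDuringExecution:  # soft: ssd is a tie-breaker
    - weight: 50                                      # weights range from 1 to 100
      preference:
        matchExpressions:
        - key: disktype
          operator: In
          values:
          - ssd
```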

When to Use What?

Scenario | nodeSelector | nodeAffinity
--- | --- | ---
Simple, exact match | ✓ | ✓
Multiple values for a single key | ✗ | ✓
Complex expressions (e.g., NotIn, Exists) | ✗ | ✓
Soft preferences | ✗ | ✓

In summary, use nodeSelector for straightforward scenarios. For more complex scheduling requirements, nodeAffinity provides the necessary flexibility.

Regards



Dean Lewis
