This release of NRI Reference Plugins introduces a new NRI plugin, new features in the resource policy plugins, a number of bug fixes, end-to-end tests, and new use-case recipes in the documentation.
## What's New
### Balloons Policy
- Composite balloons enable allocating a diverse set of CPUs for containers with complex CPU requirements, for example "allocate an equal number of CPUs from both NUMA nodes on CPU socket 0". This allows efficient parallelism inside an AI inference engine container that runs inference on CPUs, while still isolating inference engines from each other:
```yaml
balloonTypes:
  - name: balance-pkg0-nodes
    components:
      - balloonType: node0
      - balloonType: node1
  - name: node0
    preferCloseToDevices:
      - /sys/devices/system/node/node0
  - name: node1
    preferCloseToDevices:
      - /sys/devices/system/node/node1
```
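Containers can then be placed into a balloon of this type with the balloons policy's balloon annotation. The pod, container, and image names below are illustrative; the balloon type name matches the sample configuration above:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: inference-engine          # illustrative pod name
  annotations:
    # Assign container "inference" to a balloon of type "balance-pkg0-nodes".
    balloon.balloons.resource-policy.nri.io/container.inference: balance-pkg0-nodes
spec:
  containers:
  - name: inference
    image: inference-image        # illustrative image name
    resources:
      requests:
        cpu: 4
      limits:
        cpu: 4
```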
- The documentation includes recipes for preventing the creation of certain containers on a worker node, and for resetting the CPU and memory pinning of all containers in a cluster.
### Topology Aware Policy
- **Pick CPU and Memory by Topology Hints**

Normally, topology hints are used only to pick the pool assigned to a workload. Once a pool is selected, the available resources within the pool are considered equally good for satisfying the topology hints: when the policy allocates exclusive CPUs and picks pinned memory for the workload, only other potential criteria and attributes are considered for picking the individual resources.

When multiple devices are allocated to a single container, this default assumption that all resources within the pool are topologically equal may not hold; the container may be allocated misaligned devices, in other words, devices with different memory or CPU locality. To overcome this, containers can now be annotated to prefer hint-based selection and pinning of CPU and memory resources using the `pick-resources-by-hints.resource-policy.nri.io` annotation. For example:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: data-pump
  annotations:
    k8s.v1.cni.cncf.io/networks: sriov-net1
    prefer-isolated-cpus.resource-policy.nri.io/container.ctr0: "true"
    pick-resources-by-hints.resource-policy.nri.io/container.ctr0: "true"
spec:
  containers:
  - name: ctr0
    image: dpdk-pump
    imagePullPolicy: Always
    resources:
      requests:
        cpu: 2
        memory: 100M
        vendor.com/sriov_netdevice_A: '1'
        vendor.com/sriov_netdevice_B: '1'
      limits:
        vendor.com/sriov_netdevice_A: '1'
        vendor.com/sriov_netdevice_B: '1'
        cpu: 2
        memory: 100M
```
When annotated like this, the policy tries to pick one exclusive isolated CPU with locality to one device and another with locality to the other device. It also tries to pick and pin memory aligned with these devices.
Common Policy Improvements
These are improvements to common infrastructure and as such are available for the balloons
and topology-aware
policy plugins, as well as for the wireframe template
policy plugin.
- **Cache Allocation**

Plugins can be configured to exercise class-based control over the L2 and L3 cache allocated to containers' processes. In practice, containers are assigned to classes, and each class has a corresponding cache allocation configuration. This configuration is applied to all containers in the class and subsequently to all processes started in those containers. To enable cache control, use the `control.rdt.enable` option, which defaults to `false`.

Plugins can be configured to assign containers by default to a cache class named after the Pod QoS class of the container: one of `BestEffort`, `Burstable`, and `Guaranteed`. The configuration setting controlling this behavior is `control.rdt.usePodQoSAsDefaultClass` and it defaults to `false`.

Additionally, containers can be explicitly annotated to be assigned to a class using the `rdtclass.resource-policy.nri.io` annotation key. For instance:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: test-pod
  annotations:
    rdtclass.resource-policy.nri.io/pod: poddefaultclass
    rdtclass.resource-policy.nri.io/container.special-container: specialclass
...
```
This assigns the container named `special-container` within the pod to the `specialclass` RDT class, and any other container within the pod to the `poddefaultclass` RDT class. Effectively, these containers' processes are assigned to the RDT CLOSes corresponding to those classes.

**Cache Class/Partitioning Configuration**
RDT configuration is supplied as part of the `control.rdt` configuration block. Here is a sample snippet, as a Helm chart value, which assigns 33%, 66%, and 100% of cache lines to `BestEffort`, `Burstable`, and `Guaranteed` Pod QoS class containers, respectively:

```yaml
config:
  control:
    rdt:
      enable: true
      usePodQoSAsDefaultClass: true
      options:
        l2:
          optional: true
        l3:
          optional: true
        mb:
          optional: true
      partitions:
        fullCache:
          l2Allocation:
            all:
              unified: 100%
          l3Allocation:
            all:
              unified: 100%
          classes:
            BestEffort:
              l2Allocation:
                all:
                  unified: 33%
              l3Allocation:
                all:
                  unified: 33%
            Burstable:
              l2Allocation:
                all:
                  unified: 66%
              l3Allocation:
                all:
                  unified: 66%
            Guaranteed:
              l2Allocation:
                all:
                  unified: 100%
              l3Allocation:
                all:
                  unified: 100%
```
**Cache Allocation Prerequisites**

Note that for cache allocation control to work, you must have

- a hardware platform which supports cache allocation,
- the resctrl pseudo-filesystem enabled in your kernel, and loaded if it is built as a module,
- the resctrl filesystem mounted (possibly with extra options for your platform).
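As a rough sketch, these prerequisites can be checked and the filesystem mounted along the following lines on an x86 host; the exact CPU feature flags and mount options vary by vendor and platform:

```sh
# Check for cache allocation feature flags (Intel CAT / AMD equivalents).
grep -o -E 'cat_l3|cat_l2' /proc/cpuinfo | sort -u

# Verify the kernel was built with resctrl support.
grep RESCTRL "/boot/config-$(uname -r)"

# Mount the resctrl filesystem if it is not already mounted.
sudo mount -t resctrl resctrl /sys/fs/resctrl
```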
### New plugin: nri-memory-policy
- The NRI memory policy plugin sets the Linux memory policy for new containers.
- The plugin can, for instance, advise the kernel to interleave memory pages of a container across all NUMA nodes in the system, or across all NUMA nodes in the same socket where the container's allowed CPUs are located.
- The plugin works both stand-alone and together with NRI resource policy plugins and Kubernetes resource managers. It recognizes CPU and memory pinning set by resource management components. The memory policy plugin should be placed after the resource policy plugins in the NRI plugin chain.
- The memory policy for a container is defined in pod annotations.
- At the time of this release, the latest released containerd and CRI-O do not support NRI Linux memory policy adjustments, nor the NRI container command line adjustments needed as a workaround. Using this plugin requires a container runtime built with an NRI version that includes command line adjustments (NRI version > 0.9.0).
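As an illustrative sketch only, a pod annotation requesting interleaved memory might look like the following; the annotation key and class name here are assumptions, so consult the nri-memory-policy documentation for the authoritative syntax:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: interleaved-pod           # illustrative pod name
  annotations:
    # Assumed annotation key and value: request a memory policy class that
    # interleaves pages across all NUMA nodes for every container in the pod.
    class.memory-policy.nri.io/pod: interleave-all
spec:
  containers:
  - name: ctr0
    image: busybox
```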
## What's Changed
- resmgr,config: allow configuring cache allocation via goresctrl. by @klihub in #541
- resmgr: expose RDT metrics. by @klihub in #543
- Balloons with components by @askervin in #526
- topology-aware: try picking resources by hints first by @klihub in #545
- memory-policy: NRI plugin for setting memory policy by @askervin in #517
- mempolicy: go interface for set_mempolicy and get_mempolicy syscalls by @askervin in #514
- mpolset: get/set memory policy and exec a command by @askervin in #515
- topology-aware: fix format of container-exported memsets. by @klihub in #532
- resmgr: update container-exported resource data. by @klihub in #537
- sysfs: add a helper for gathering whatever IDs related to CPUs by @askervin in #513
- sysfs: fix CPU.GetCaches() to not return empty slice. by @klihub in #533
- sysfs: export CPUFreq.{Min,Max}. by @klihub in #534
- helm: add Chart for memory-policy deployment by @askervin in #519
- go.{mod,sum}: use new goresctrl tag v0.9.0. by @klihub in #544
- Drop tools.go in favor of native tool directive support in go 1.24 by @fmuyassarov in #535
- golang: bump go version to 1.24[.3]. by @klihub in #528
Full Changelog: v0.9.4...v0.10.0