Kubernetes v1.36 Revamps Memory QoS: Tiered Protection and Opt-In Reservation Bring Precision to Container Memory Management

From Wwwspill, the free encyclopedia of technology

Kubernetes v1.36 Introduces Tiered Memory Protection with Opt-In Reservation

On behalf of SIG Node, the Kubernetes community has announced significant updates to the Memory QoS feature in version 1.36, now in alpha. The enhancement introduces tiered memory protection based on pod QoS classes and a separate opt-in reservation policy, giving administrators finer control over how memory is allocated and reclaimed under pressure.


"Memory QoS uses the cgroup v2 memory controller to provide the kernel better guidance on container memory treatment," said a SIG Node representative. "Version 1.36 separates throttling from reservation, allowing clusters to adopt memory.high throttling first without the risks of hard reservations."

What Changed in v1.36: Separation of Concerns

Previously, enabling the MemoryQoS feature gate immediately set memory.min for every container with a memory request—a hard reservation the kernel would never reclaim. This could lock up large portions of node memory, leaving little headroom for system daemons or BestEffort workloads.

"In earlier versions, if Burstable pods requested 7 GiB on a node with 8 GiB RAM, that 7 GiB was locked as memory.min, increasing OOM kill risk," explained the representative. "Now, with tiered reservation, only Guaranteed pods use memory.min; Burstable pods get soft protection via memory.low, which the kernel can reclaim under extreme pressure."

Opt-In Memory Reservation Controlled by memoryReservationPolicy

Throttling via memory.high (default factor 0.9) is enabled by the feature gate, but reservation is now configured separately through the kubelet field memoryReservationPolicy:

  • None (default): No memory.min or memory.low written. Throttling still works.
  • TieredReservation: Writes tiered memory protection based on QoS class:
    • Guaranteed pods get hard protection via memory.min. Example: for a 512 MiB request, /sys/.../memory.min contains 536870912. The kernel will not reclaim this memory; if it cannot honor the reservation, it invokes the OOM killer on other processes.
    • Burstable pods get soft protection via memory.low. Under normal pressure the kernel avoids reclaiming this memory; under extreme pressure it may reclaim it to prevent a system-wide OOM.
    • BestEffort pods get neither, remaining fully reclaimable.
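Putting this together, here is a hedged KubeletConfiguration sketch for opting in. The field name and values come from the announcement; its exact placement and the memoryThrottlingFactor field are assumptions based on the existing kubelet configuration API:

  apiVersion: kubelet.config.k8s.io/v1beta1
  kind: KubeletConfiguration
  featureGates:
    MemoryQoS: true                   # enables memory.high throttling
  memoryThrottlingFactor: 0.9         # default factor used to derive memory.high
  memoryReservationPolicy: TieredReservation  # opt in to memory.min/memory.low tiering

Leaving memoryReservationPolicy unset (or set to None) keeps throttling active while writing no reservations.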

"With memoryReservationPolicy, you can enable throttling first, observe workload behavior, and opt into reservation when your node has enough headroom," said the SIG Node team.

New Observability Metrics

Two alpha-stability metrics are now exposed on the kubelet /metrics endpoint:

  • kubelet_memory_qos_node_memory_min_bytes: Total memory.min set across all pods.
  • kubelet_memory_qos_node_memory_low_bytes: Total memory.low set across all pods.

These metrics allow administrators to monitor the aggregate hard and soft reservations on each node, aiding capacity planning and debugging.
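As a sketch, these metrics could feed a Prometheus alerting rule such as the one below. The rule name, threshold, and the join against kube-state-metrics' kube_node_status_allocatable are illustrative assumptions, and the expression assumes a scrape configuration that attaches a matching node label to the kubelet metric:

  groups:
    - name: memory-qos
      rules:
        - alert: MemoryQoSHardReservationHigh
          # Fires when hard memory.min reservations exceed 80% of a node's
          # allocatable memory, leaving little reclaimable headroom.
          expr: |
            kubelet_memory_qos_node_memory_min_bytes
              / on (node) kube_node_status_allocatable{resource="memory"}
              > 0.8
          for: 15m
          labels:
            severity: warning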

Kernel Version Warning for memory.high

Version 1.36 adds a warning when the host kernel does not support the memory.high interface (part of the cgroup v2 memory controller), helping prevent silent misconfigurations.

Background: Evolution of Memory QoS in Kubernetes

The Memory QoS feature was first introduced as alpha in Kubernetes v1.22 and updated in v1.27. The v1.27 implementation applied memory.min to every container with a memory request, regardless of QoS class—a hard guarantee that could starve other workloads.

"The community recognized that a one-size-fits-all reservation approach was too rigid," stated a Kubernetes contributor. "The v1.36 redesign aligns kernel memory protection with the intended QoS hierarchy: Guaranteed pods get hard guarantees, Burstable get soft protection, and BestEffort get none."

What This Means for Administrators

This update gives cluster operators granular control over memory behavior. By enabling the feature gate while leaving memoryReservationPolicy at its default of None, they can evaluate how memory.high throttling affects workload performance without committing to hard reservations. Once confident, they can opt into TieredReservation to protect critical pods, as sketched below.
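A staged rollout might look like the following kubelet configuration sketch (field placement assumed as in the earlier example):

  apiVersion: kubelet.config.k8s.io/v1beta1
  kind: KubeletConfiguration
  featureGates:
    MemoryQoS: true
  # Stage 1: throttling only; memory.high is written, no reservations.
  memoryReservationPolicy: None
  # Stage 2: once the new node-level metrics show sufficient headroom,
  # switch to:
  # memoryReservationPolicy: TieredReservation

Because None is the default, stage 1 requires only the feature gate; stating the policy explicitly makes the later change easier to audit.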

"The tiered approach reduces the risk of unintentional OOM kills in mixed-workload clusters," said a site reliability engineer. "For example, a node running both latency-sensitive Guaranteed apps and bursty batch jobs can now have more predictable memory pressure handling."

Administrators should test the new configuration in non-production environments first. The feature remains alpha, meaning flags and APIs may change before graduation.

Looking Ahead

Future releases may expand observability and add support for more granular control. The community encourages feedback on the Kubernetes SIG Node mailing list and issue tracker.

"We believe these changes make Memory QoS safer and more practical for production clusters," concluded the SIG Node representative. "The separation of throttling and reservation is a key step toward robust memory management."