Why v1.35 reads like an AI-infrastructure release

Kubernetes has become the place where teams coordinate mixed production workloads: services, batch jobs, data pipelines, and ML training. The Kubernetes v1.35 (“Timbernetes”) release reinforces that trajectory with changes that reduce operational friction in scheduling, resource control, and configuration workflows.

What stands out in v1.35 is practical: fewer restarts for resizing, new primitives for coordinated placement, and safer defaults for how teams generate and review manifests at scale.

Taken together, these updates point to a Kubernetes control plane that’s adapting to bursty jobs, tightly coupled training runs, and continuously tuned inference services. Teams operating mixed clusters tend to feel the pressure first in placement efficiency, resize churn, and configuration review hygiene. The rest of this piece focuses on the v1.35 changes that ease those pressures and make AI/ML operations more predictable at scale.

The changes that matter for AI/ML operations

Workload-aware scheduling arrives (alpha)

Kubernetes v1.35 introduces the workload API and workload-aware scheduling, along with an initial implementation of gang scheduling for “all-or-nothing” placement across a group of Pods. This helps distributed training and tightly coupled jobs avoid partial placement patterns that waste capacity and stall progress.
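As a sketch only, a gang-scheduled group might be declared along these lines. The alpha API is still evolving, and the group/version, kind, and field names below are illustrative assumptions rather than the exact v1.35 schema; consult the KEP for the current shape:

```yaml
# Illustrative only: field names are assumptions, not the exact alpha schema.
apiVersion: scheduling.k8s.io/v1alpha1
kind: Workload
metadata:
  name: trainer
spec:
  podGroups:
  - name: workers
    policy:
      gang:
        minCount: 8   # place all eight worker Pods together, or none of them
```

The point of the primitive is the "all eight or none" semantics: a half-placed training job holds accelerators without making progress, which is exactly the pattern gang scheduling removes.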

If you want the deeper design context, the upstream proposal lives in the gang scheduling KEP.

In-place Pod resize is stable

v1.35 graduates in-place Pod resource resize to Stable. CPU and memory adjustments can happen without restarting containers, which reduces churn in inference services that need fast tuning under load and improves recovery options for long-running workloads.
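For example, containers can declare how they tolerate resizes via `resizePolicy`, and resources are then adjusted through the `resize` subresource. The pod, container, and image names here are placeholders:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: inference
spec:
  containers:
  - name: server
    image: example.com/inference:latest   # placeholder image
    resizePolicy:
    - resourceName: cpu
      restartPolicy: NotRequired          # resize CPU without restarting the container
    - resourceName: memory
      restartPolicy: NotRequired
    resources:
      requests:
        cpu: "1"
        memory: 2Gi
```

The resize itself goes through the subresource, e.g. `kubectl patch pod inference --subresource resize --patch '{"spec":{"containers":[{"name":"server","resources":{"requests":{"cpu":"2"}}}]}}'`, leaving the container running throughout.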

The upstream spec is captured in the in-place update of Pod resources KEP.

Device allocation keeps moving toward a baseline capability

Dynamic Resource Allocation (DRA) is treated as a core building block for device-aware orchestration, and v1.35 continues to refine it (see the DRA section in the v1.35 release notes). For AI/ML teams, the payoff is more predictable device claims and a cleaner path to richer scheduling semantics over accelerators.
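The DRA workflow pairs a claim with a Pod that consumes it. In this sketch the device class name and image are placeholders, and exact request fields can vary by API version, so treat the shape as indicative rather than copy-paste ready:

```yaml
apiVersion: resource.k8s.io/v1
kind: ResourceClaimTemplate
metadata:
  name: single-gpu
spec:
  spec:
    devices:
      requests:
      - name: gpu
        exactly:
          deviceClassName: gpu.example.com   # placeholder; published by a DRA driver
---
apiVersion: v1
kind: Pod
metadata:
  name: trainer
spec:
  resourceClaims:
  - name: gpu
    resourceClaimTemplateName: single-gpu
  containers:
  - name: train
    image: example.com/train:latest          # placeholder image
    resources:
      claims:
      - name: gpu                            # bind the claim to this container
```

Compared with opaque extended resources, the claim carries structure the scheduler can reason about, which is what makes richer accelerator semantics possible.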

KYAML becomes the default kubectl output format

Kubernetes is also tightening the “last mile” of configuration workflows. With v1.35, kubectl output defaults to KYAML, a stricter subset designed to reduce ambiguous YAML behaviors and common formatting hazards. The canonical description is in the KYAML reference.

If you need compatibility testing or controlled rollouts, the upstream design and toggles are documented in the KYAML KEP (including KUBECTL_KYAML=false).
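To preview the new default, render an object with `-o kyaml` and compare it with your existing tooling's expectations. The output below is illustrative of KYAML's style, with explicit braces and double-quoted strings, rather than byte-exact:

```
$ kubectl get configmap app-config -o kyaml   # app-config is a placeholder name
{
  apiVersion: "v1",
  kind: "ConfigMap",
  metadata: {
    name: "app-config",
  },
  data: {
    logLevel: "info",
  },
}
```

The stricter form removes the indentation and implicit-typing hazards (the classic "Norway problem" of `no` parsing as a boolean) that plague hand-reviewed YAML.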

Why AI keeps pushing teams toward a shared operating layer

Production AI systems combine workloads with fundamentally different operating profiles: bursty training jobs, steady inference services, and pipelines that feed both. The common requirement is a consistent operational surface for scheduling, scaling, governance, and policy enforcement across teams and environments.

That pressure is increasing. Gartner projects that over 40% of agentic AI projects will be canceled by the end of 2027, driven by escalating costs, unclear business value, and inadequate risk controls. Teams that want durable outcomes need repeatable paths to production that make cost, reliability, and governance measurable.

Platform engineering implications

The teams that scale AI programs usually standardize "how AI ships" across the organization. Internal platforms and golden paths help encode guardrails without blocking iteration.

v1.35’s direction—workload-aware scheduling primitives, stable in-place resize, and safer configuration output—supports platform teams that want to reduce bespoke infrastructure work while keeping Kubernetes as the consistent substrate.

Ecosystem note: Ingress NGINX retirement timeline

Ingress NGINX is in best-effort maintenance until March 2026, after which it will be retired with no further releases, bug fixes, or security updates. The retirement announcement and operational implications are captured in Ingress NGINX Retirement: What You Need to Know and reinforced by the Kubernetes Steering and Security statement.

For operators, this is a planning item: inventory current usage, define a migration path, validate in staging, and document rollback expectations.
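The inventory step can start with two commands (this assumes cluster access; IngressClass names vary by installation):

```shell
# List IngressClasses and the controllers backing them
kubectl get ingressclass -o wide

# List every Ingress across namespaces with the class it references
kubectl get ingress -A -o jsonpath='{range .items[*]}{.metadata.namespace}{"\t"}{.metadata.name}{"\t"}{.spec.ingressClassName}{"\n"}{end}'
```

Any rows pointing at an ingress-nginx-backed class are candidates for the migration plan.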

Practical evaluation steps for v1.35

If you run AI/ML workloads on Kubernetes:

- Trial the alpha workload API and gang scheduling in a non-production cluster to see whether all-or-nothing placement fits your training jobs.
- Exercise in-place Pod resize on a representative inference service and measure how much restart churn it removes.
- Test your manifest tooling and review pipelines against KYAML output, using the documented toggles (such as KUBECTL_KYAML=false) where you need a controlled rollout.
- If you still run Ingress NGINX, begin the migration inventory now rather than waiting for the March 2026 retirement.

Kubernetes v1.35 improves the parts of the platform that tend to break first under AI load: coordinated placement, resource control with less disruption, and safer configuration output. Teams that treat Kubernetes as the shared operating layer for AI gain a simpler path to scale because the platform absorbs more of the operational complexity over time.

About the author

Angel Ramirez

Angel Ramirez is a CNCF Ambassador and Kubestronaut with over 17 years of experience in cloud-native architecture and platform engineering. He focuses on operationalizing Kubernetes for AI and enterprise workloads and contributes to community discussions around platform standardization and governance.