Introduction

April in London has never felt so electric. From the first footstep in the ExCeL halls to the hallway conversations, KubeCon + CloudNativeCon Europe 2025 was a whirlwind of new ideas, familiar faces, and those “aha” moments we all live for. As a first-time in-person speaker—and at my very first conference—I arrived with equal parts excitement and nerves. Below, I unpack these experiences, the talks I attended, and the broader trends shaping our ecosystem.

My Talk: “Empowering ML Workloads With Kubeflow”

Setting the Stage

I co-presented with Hezhi Xie at the Kubeflow Summit on “Empowering ML Workloads with Kubeflow: JAX Distributed Training and LLM Hyperparameter Optimization.”

We dived into two major extensions to Kubeflow’s capabilities:

Distributed JAX Training
We showcased how the Kubeflow Training Operator now supports distributed training workloads using JAX on Kubernetes, enabling seamless scaling of high-performance computations.

Automated LLM Hyperparameter Optimization
We demoed a high-level API for tuning hyperparameters of LLMs that automates the process of hyperparameter optimization in Kubernetes.
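To make the idea of automated hyperparameter optimization a bit more concrete, here is a minimal random-search sketch in plain Python. It is not the Kubeflow API we demoed: the search space, the `toy_objective` function (a stand-in for launching a real fine-tuning trial), and every other name here are hypothetical and only meant to show the shape of the loop that such tooling automates.

```python
import math
import random

# Hypothetical search space: a log-uniform learning rate and a categorical batch size.
SEARCH_SPACE = {
    "learning_rate": ("loguniform", 1e-5, 1e-2),
    "batch_size": ("choice", [8, 16, 32, 64]),
}


def sample(space, rng):
    """Draw one set of hyperparameters from the search space."""
    params = {}
    for name, spec in space.items():
        kind = spec[0]
        if kind == "loguniform":
            low, high = math.log(spec[1]), math.log(spec[2])
            params[name] = math.exp(rng.uniform(low, high))
        elif kind == "choice":
            params[name] = rng.choice(spec[1])
        else:
            raise ValueError(f"unknown parameter kind: {kind}")
    return params


def toy_objective(params):
    """Stand-in for a real training run; returns a made-up loss (lower is better)."""
    lr_penalty = (math.log10(params["learning_rate"]) + 3.5) ** 2
    batch_penalty = abs(params["batch_size"] - 32) / 32
    return lr_penalty + batch_penalty


def random_search(objective, space, max_trials, seed=0):
    """Run max_trials independent trials and keep the best one."""
    rng = random.Random(seed)
    best_params, best_loss = None, float("inf")
    for _ in range(max_trials):
        params = sample(space, rng)
        loss = objective(params)
        if loss < best_loss:
            best_params, best_loss = params, loss
    return best_params, best_loss


if __name__ == "__main__":
    params, loss = random_search(toy_objective, SEARCH_SPACE, max_trials=20)
    print(params, loss)
```

On a cluster, each call to the objective would be a separate training job, and the search algorithm is usually something smarter than random sampling; the loop of "sample, run a trial, compare, keep the best" stays the same.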

Thank you to the DevZero team for amplifying this effort and for your support and encouragement every step of the way. ❤️

Talks I Attended

Throughout the event, I curated my schedule around sessions that pushed Kubernetes, MLOps, and infrastructure efficiency forward. Here are a few highlights:

1. Techniques and Insights To Test Kubernetes Limits With Kind

Antonio Ojea and Katarzyna Lach from Google showed how Kind can emulate large-scale clusters to root-cause DNS and kube-proxy performance bugs locally. Their emphasis on breaking problems down into narrow API interactions and scripting reproducible tests was eye-opening, and in their benchmarks nftables outperformed iptables by orders of magnitude.

2. Scale Smarter Not Harder: How Extending Cluster Autoscaler Saves Millions 

Speakers: Rahul Rangith & Ben Hinthorne, Datadog

Datadog’s talk detailed how they extended the Cluster Autoscaler with a gRPC expander to evaluate cost, performance, and reliability when choosing instance types. The Node Group Set and Instance Score Controller patterns they showcased enabled them to save millions by optimizing bin-packing across dozens of clusters.
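The idea of weighing cost, performance, and reliability when choosing a node group can be sketched as a simple scoring function. This is not Datadog's implementation and not the Cluster Autoscaler's gRPC expander interface; the `NodeGroupOption` fields, the weights, and the numbers in the usage note are all hypothetical, chosen only to illustrate the trade-off.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NodeGroupOption:
    """One candidate node group for a scale-up (all fields are illustrative)."""
    name: str
    hourly_cost: float   # cost per node-hour
    nodes_needed: int    # nodes required to fit the pending pods
    perf_score: float    # 0..1, higher is better
    reliability: float   # 0..1, higher is better


def pick_best(options, w_cost=0.5, w_perf=0.3, w_rel=0.2):
    """Return the option with the highest weighted score."""
    if not options:
        raise ValueError("no node group options to choose from")
    # Normalize total cost against the most expensive option so that
    # cheaper options get a cost score closer to 1.
    max_total = max(o.hourly_cost * o.nodes_needed for o in options)
    if max_total <= 0:
        raise ValueError("total cost must be positive")

    def score(o):
        cost_score = 1.0 - (o.hourly_cost * o.nodes_needed) / max_total
        return w_cost * cost_score + w_perf * o.perf_score + w_rel * o.reliability

    return max(options, key=score)
```

With made-up options such as a cheap but less reliable group and a pricier, faster one, shifting the weights from cost toward reliability changes which group wins, which is the kind of policy knob an external expander makes possible.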

3. The Next Generation of DaemonSet Autoscaling

Adam Bernot (Google Cloud) and Bryan Boreham (Grafana Labs) proposed a Vertical Pod Autoscaler enhancement to tune resource requests on a per-node basis for DaemonSets. Their demo of scoped VPAs adapting CPU allocations without manual tuning underscored how Kubernetes is evolving to handle heterogeneous clusters more efficiently.
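To show why per-node tuning matters for DaemonSets, here is a small sketch that derives a separate CPU request for each node from observed usage. It is not the proposal from the talk or the VPA's actual recommender: the percentile-plus-headroom rule, the default values, and the node names in the usage note are assumptions made for illustration.

```python
import math
from collections import defaultdict


def nearest_rank_percentile(samples, pct):
    """Nearest-rank percentile of a non-empty list, with 0 < pct <= 100."""
    if not samples:
        raise ValueError("samples must not be empty")
    if not 0 < pct <= 100:
        raise ValueError("pct must be in (0, 100]")
    ordered = sorted(samples)
    rank = math.ceil(pct * len(ordered) / 100)
    return ordered[rank - 1]


def per_node_cpu_requests(observations, pct=90, headroom_pct=20):
    """Recommend a CPU request (millicores) per node for one DaemonSet.

    observations: iterable of (node_name, cpu_millicores) usage samples.
    The request is the pct-th percentile of usage plus headroom_pct percent.
    """
    by_node = defaultdict(list)
    for node, millicores in observations:
        by_node[node].append(millicores)
    return {
        node: math.ceil(nearest_rank_percentile(vals, pct) * (100 + headroom_pct) / 100)
        for node, vals in by_node.items()
    }
```

A logging agent on a small node and the same agent on a large, busy node end up with very different recommendations here, whereas a single DaemonSet-wide request would either starve one or waste capacity on the other.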

4. Hot Takes: Kubernetes “Paintainers” Bring the Heat

Ian Coldwater, Marly Salazar, Jeffrey Sica, Kat Cosgrove, and Xander Grzywinski took the stage Hot Ones–style, spurring candid insights on governance, overrated buzzwords (AI/LLMs and GitOps UX topped the list), and burnout prevention. Their honest advice on setting boundaries and advocating for well-being was a powerful reminder of the human side of open source.

5. Learning Lounge: CNCF Kubernetes Certifications

Speaker: Chad M. Crowell (KubeSkills)

Emerging Trends

Cloud-Native MLOps Maturity

Kubeflow sessions clustered around production readiness: multi-tenant isolation, cost-aware scheduling, and full-lifecycle metadata tracking. It’s clear that ML in the cloud-native world is no longer experimental; it’s becoming an enterprise standard.

Infrastructure Observability & Testing

From Kind-driven chaos tests to autoscaler expanders, the community is doubling down on built-in observability and pre-production validation. “Test early, monitor everywhere” was the mantra I heard most.

Dynamic Resource Allocation (DRA) has become a major theme this year, featuring in at least ten conference sessions either as their primary focus or as a key reference point.

Community & Culture Focus

Panels and social events like KubeClash reinforced that human connection remains the heartbeat of open source, even in an era of hybrid work.

DevZero Booth & Community Vibes

Between sessions, I spent time at the DevZero booth, where our live demos on cloud cost optimization drew steady crowds.

Conversations at the booth made it clear: teams are actively looking for better ways to reduce cloud waste without compromising performance. That’s exactly what we’re building at DevZero—a Kubernetes cost optimization platform that automatically rightsizes CPU and memory based on real workload behavior.

From hallway chats to afterparties (KubeClash was epic!), the week was a reminder to always set aside time for meaningful connections.

Final Thoughts

KubeCon + CloudNativeCon EU 2025 was more than a conference—it was a reminder of how vibrant and fast-moving the cloud native ecosystem is, especially at the intersection with ML. I leave London energized and with a deeper appreciation for the community that powers open source.