Scaling Managed Kubernetes Service with Cilium and eBPF at OVHcloud
Challenge
OVHcloud is a global cloud provider operating 44 data centers across four continents. The company designs its own infrastructure, from servers to cooling systems, enabling efficient, secure, and sustainable cloud services spanning public and private cloud, bare metal, storage, and managed platforms.
Among its offerings, the Managed Kubernetes Service (MKS) helps customers adopt cloud native technologies with minimal operational overhead. Launched in 2019, MKS now supports thousands of production clusters across 20 public cloud regions, running tens of thousands of nodes. This architecture demands reliable, high-performance networking for a wide spectrum of customers, from small development teams to enterprises running mission-critical workloads. The service relies on a fully open source stack including ArgoCD, Cilium, Cluster API, Kamaji, and Kubernetes.
OVHcloud’s decision to adopt Cilium was driven by customer demand and the need for a future-ready networking layer.
“Cilium gives us the right foundation to build the next generation of our Managed Kubernetes Service. One that’s secure by design, highly observable, and ready for multi-cluster networking.”
Joël Le Corre, Cloud Architect, Containers & Orchestration at OVHcloud
Highlights
- Cilium powers networking for MKS “Standard” plan clusters
- Thousands of production clusters running on OVHcloud’s infrastructure
- Hubble provides native observability for traffic flows and network policy enforcement
Solution: eBPF-Powered Networking for Multi-Tenant Kubernetes
In its first-generation MKS architecture, OVHcloud used Canal (Calico + Flannel) as the CNI. This stack worked well: Calico provided network policy enforcement while Flannel handled VXLAN-based networking. But customer expectations evolved.
“Many users want the ability to choose their CNI during cluster creation, and Cilium is the clear leader in this space,” said Le Corre. “Its feature set also allows us to provide observability, metrics, and monitoring via Hubble, meeting customers’ expectations for transparency and operational insight.”
Cilium’s eBPF-based architecture offers key advantages, including:
- Advanced network policies: Fine-grained control strengthens tenant isolation and security.
- Reduced management cluster load: Cilium’s efficient handling of endpoints through CiliumEndpointSlices reduces load on multi-tenant management clusters.
- Built-in observability: Hubble provides deep insight into traffic flows and system behavior.
- Future-ready architecture: Enables features like cluster meshing, service meshing, and CNI customization.
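The fine-grained, identity-aware policy model behind the first advantage can be sketched with a CiliumNetworkPolicy that restricts ingress to pods within the same tenant namespace. This is an illustrative example, not OVHcloud's actual policy; the namespace, labels, and port are hypothetical.

```yaml
apiVersion: cilium.io/v2
kind: CiliumNetworkPolicy
metadata:
  name: tenant-a-isolation        # hypothetical policy name
  namespace: tenant-a             # hypothetical tenant namespace
spec:
  # Applies to all "api" pods in the tenant namespace.
  endpointSelector:
    matchLabels:
      app: api
  ingress:
    # Only allow traffic from pods in the same namespace,
    # and only on the application port.
    - fromEndpoints:
        - matchLabels:
            io.kubernetes.pod.namespace: tenant-a
      toPorts:
        - ports:
            - port: "8080"
              protocol: TCP
```

Because Cilium enforces such policies via eBPF using workload identities rather than IP addresses, the rules remain stable as pods are rescheduled across nodes.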
Deploying Cilium Across the MKS Stack
OVHcloud’s infrastructure relies on Kamaji for hosted control planes, Cluster API for lifecycle automation, and ArgoCD for GitOps-based deployments. Careful coordination was required to integrate Cilium as the layer tying networking together across the stack.
“Leveraging our new GitOps stack built around ArgoCD has enabled the rollout of Cilium to new customers using the Standard version of MKS in both 1AZ and 3AZ regions,” said Le Corre. “Deployments are managed declaratively through an ArgoCD Application, which handles add-on installation across managed clusters in a consistent and scalable way.”
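Declarative management of a CNI add-on through ArgoCD typically looks like the sketch below: an Application that installs the Cilium Helm chart into a managed cluster. This is an assumed shape, not OVHcloud's actual manifest; the names, repo URL target, chart version, and values are illustrative.

```yaml
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: cilium                      # hypothetical Application name
  namespace: argocd
spec:
  project: default
  source:
    repoURL: https://helm.cilium.io # official Cilium Helm repository
    chart: cilium
    targetRevision: 1.16.1          # example chart version
    helm:
      values: |
        hubble:
          enabled: true             # ship Hubble observability with the CNI
  destination:
    server: https://kubernetes.default.svc
    namespace: kube-system
  syncPolicy:
    automated:
      prune: true                   # keep the cluster in sync with Git
      selfHeal: true
```

With `syncPolicy.automated`, ArgoCD continuously reconciles the add-on across clusters, which is what makes the rollout "consistent and scalable" in the GitOps model described above.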
The MKS Standard offering was the first product to adopt Cilium as its default CNI. Work is also underway to make this new CNI available to customers on the Free product tier.
Operational Impact: Efficiency, Isolation, and Visibility
Adopting Cilium has delivered notable benefits across OVHcloud’s Managed Kubernetes Service platform. eBPF streamlines traffic processing at the kernel level, reducing load on management clusters that serve thousands of tenants. Advanced network policies enforce strict tenant separation, strengthening security across multi-tenant workloads without adding operational complexity.
“Hubble has been particularly impactful, giving internal teams visibility into network behavior without requiring additional tooling or instrumentation,” said Le Corre. “We are also working on a plan to extend these capabilities to all customers by using our API and the CiliumNodeConfig resource.”
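CiliumNodeConfig allows per-node-pool configuration overrides, which is one way such a gradual rollout of Hubble features could be expressed. The sketch below is an assumption about how this might look, not OVHcloud's implementation; the resource name, node label, and metric set are hypothetical. (On Cilium releases before 1.16, the resource lives under `cilium.io/v2alpha1`.)

```yaml
apiVersion: cilium.io/v2
kind: CiliumNodeConfig
metadata:
  name: enable-hubble-metrics          # hypothetical name
  namespace: kube-system
spec:
  # Only nodes carrying this (hypothetical) label receive the override.
  nodeSelector:
    matchLabels:
      example.ovhcloud/hubble: "enabled"
  # Keys here override the cluster-wide cilium-config for matching nodes.
  defaults:
    hubble-metrics: "drop,tcp,flow"    # example Hubble metric selection
```

Selecting nodes by label lets an API-driven control plane opt individual customers or node pools into Hubble metrics without restarting the entire fleet.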
In addition, tools such as `cilium monitor` and `cilium connectivity test` have proven valuable in diagnosing unexpected behavior related to load balancing.
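A typical diagnostic session with these tools might look like the following sketch. The commands are standard Cilium CLI and agent tooling, but the pod name and filters are illustrative and require access to a running cluster.

```shell
# Run the built-in end-to-end connectivity checks against the cluster
# (deploys test workloads and validates pod-to-pod, policy, and LB paths).
cilium connectivity test

# Stream datapath events from a Cilium agent pod to inspect
# how individual packets are being handled (pod name is hypothetical).
kubectl -n kube-system exec cilium-abcde -- cilium monitor --type drop
```

Filtering `cilium monitor` by event type (for example `drop` or `trace`) narrows the stream to the packets relevant to a load-balancing investigation.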
Standardizing on Cilium across the platform has also simplified maintenance. The same networking foundation now underpins multiple OVHcloud offerings, including the Managed Rancher Service and Managed Private Registry, enabling consistent feature delivery across product tiers.
Roadmap
OVHcloud’s roadmap with Cilium focuses on expanding capabilities for customers. Users will soon be able to select their preferred CNI at cluster creation, giving teams flexibility to match their networking requirements. The team is also researching Cluster Mesh for cross-cluster connectivity to enable multi-region workloads to communicate seamlessly.
Hubble-powered metrics and monitoring will become available to all customers, extending the observability benefits currently used internally. On the backend, OVHcloud plans to strengthen its own management clusters with eBPF-based network policies while continuing to optimize throughput and latency for secure multi-tenant workloads at scale.
Both MKS Free and MKS Standard plans will soon use Cilium as the default networking layer.
The team has also begun contributing upstream, addressing issues like #42334, reflecting its commitment to improving the Cilium ecosystem and ensuring it can meet the demands of large-scale cloud operations.