Case Study

SNCF

Reaching for Strategic Autonomy by Building On-Premises Kubernetes with Cluster API and ArgoCD

Challenge: Closing the Agility Gap Between Public Cloud and Datacenters

As France’s national railway operator, SNCF runs critical transportation systems serving millions of passengers daily. Ensuring sovereignty, reliability, and operational resilience for the platforms powering these systems is a strategic priority.

After years of successfully operating Kubernetes in public clouds (Azure and AWS), SNCF faced a major hurdle: providing that same “Managed Kubernetes” experience within its own datacenters. Their first attempt at an on-premises platform was functional but “naïve”—it relied on manual processes and custom tooling.

The pain points were clear:

  • Sluggish provisioning: Delivering a new cluster took up to a month.
  • Stagnant growth: Only 14 clusters were deployed in four years due to operational overhead.
  • Day 2 friction: Upgrades, patching, and scaling were constant struggles, leading to “snowflake” clusters and configuration drift.
  • Feature gap: Essential cloud features like Node Autoscaling were missing on-premises.

SNCF needed a platform that could compete with Azure AKS or AWS EKS while maintaining full control over critical business data.

Published: March 25, 2026

Solution: Modern problems require modern infrastructure

SNCF decided to stop applying incremental fixes to a legacy foundation and instead rebuild a state-of-the-art platform from the ground up.

“We soon concluded that applying incremental fixes on unsteady foundations would prove time-consuming without any guarantee of fixing the underlying issues. We needed to start again entirely.”

Yann Rotilio, Senior Staff Engineer – Kubernetes Specialist, SNCF

The Technical Stack: “Non-Toxic” and Interchangeable

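The new production platform was designed around a modular, cloud-native stack: Cluster API (with its OpenStack provider) for cluster lifecycle, ArgoCD for GitOps delivery, and ORAS for distributing Cluster API providers as OCI artifacts.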

The Game Changer: Node Autoscaling On-Premises

Cluster API was the central architectural decision. Beyond eliminating configuration drift, it allowed SNCF to activate node autoscaling in their own datacenters—a capability previously exclusive to public cloud providers. By using the CAPI OpenStack provider, SNCF achieved a unified workflow across hybrid environments.
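
To make the autoscaling mechanism concrete, here is a minimal sketch of how a worker pool could be declared so that the Kubernetes Cluster Autoscaler (via its Cluster API provider) is allowed to resize it. It is not SNCF's actual configuration. The two annotation keys are the ones documented for the Cluster API autoscaler provider. The cluster and pool names, the template naming convention, the kubeadm bootstrap provider, and the `v1beta1` API versions are assumptions made for illustration.

```python
import json

# Annotation keys read by the Cluster Autoscaler's Cluster API provider.
MIN_SIZE = "cluster.x-k8s.io/cluster-api-autoscaler-node-group-min-size"
MAX_SIZE = "cluster.x-k8s.io/cluster-api-autoscaler-node-group-max-size"


def autoscaled_machine_deployment(cluster, name, min_nodes, max_nodes, k8s_version):
    """Build a MachineDeployment manifest (as a dict) with autoscaling bounds."""
    if not 0 <= min_nodes <= max_nodes:
        raise ValueError("expected 0 <= min_nodes <= max_nodes")
    template = f"{name}-template"  # hypothetical naming convention
    return {
        "apiVersion": "cluster.x-k8s.io/v1beta1",  # assumed API version
        "kind": "MachineDeployment",
        "metadata": {
            "name": name,
            "annotations": {MIN_SIZE: str(min_nodes), MAX_SIZE: str(max_nodes)},
        },
        "spec": {
            "clusterName": cluster,
            # "replicas" is left unset on purpose in this sketch, so that a
            # GitOps sync does not overwrite the count chosen by the autoscaler.
            "selector": {"matchLabels": {}},
            "template": {
                "spec": {
                    "clusterName": cluster,
                    "version": k8s_version,
                    "bootstrap": {
                        "configRef": {
                            # Assumption: kubeadm is the bootstrap provider.
                            "apiVersion": "bootstrap.cluster.x-k8s.io/v1beta1",
                            "kind": "KubeadmConfigTemplate",
                            "name": template,
                        }
                    },
                    "infrastructureRef": {
                        # OpenStack provider template; API version is assumed.
                        "apiVersion": "infrastructure.cluster.x-k8s.io/v1beta1",
                        "kind": "OpenStackMachineTemplate",
                        "name": template,
                    },
                }
            },
        },
    }


# Hypothetical pool: between 3 and 10 worker nodes.
manifest = autoscaled_machine_deployment("rail-prod", "workers", 3, 10, "v1.30.0")
print(json.dumps(manifest, indent=2))  # JSON is also valid YAML for kubectl
```

Because the bounds live in annotations on a declarative object, the same manifest can be stored in Git and applied to any Cluster API infrastructure provider, which is what makes the workflow portable between datacenter and cloud.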

“Cluster API was a game-changer. It gave us node autoscaling in our datacenters—something we thought was only possible with AKS or EKS.”

Yann Rotilio, Senior Staff Engineer – Kubernetes Specialist, SNCF

Extending GitOps Across the Hybrid Fleet

SNCF already operated hundreds of clusters in the public cloud via ArgoCD. By extending this existing GitOps implementation to the new datacenter clusters, they created a unified management layer. ORAS was integrated into the supply chain to distribute Cluster API’s providers as OCI artifacts, bringing the lifecycle of every high-level primitive, the providers included, under the same GitOps workflow.
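
As an illustration of what extending GitOps to a datacenter cluster can look like, the sketch below builds an ArgoCD `Application` that keeps one cluster's Cluster API manifests in sync with a Git path. The field names follow the ArgoCD `Application` resource. The repository URL, directory layout, namespaces, and the assumption that ArgoCD runs on the management cluster are hypothetical and not taken from SNCF's setup.

```python
import json


def argocd_application(cluster_name, repo_url, path, revision="main"):
    """Build an ArgoCD Application (as a dict) for one cluster definition."""
    return {
        "apiVersion": "argoproj.io/v1alpha1",
        "kind": "Application",
        "metadata": {"name": f"cluster-{cluster_name}", "namespace": "argocd"},
        "spec": {
            "project": "default",
            "source": {
                "repoURL": repo_url,
                "path": path,
                "targetRevision": revision,
            },
            "destination": {
                # Assumption: ArgoCD is installed on the management cluster
                # that hosts the Cluster API objects.
                "server": "https://kubernetes.default.svc",
                "namespace": cluster_name,
            },
            "syncPolicy": {
                # Automated sync with self-healing reverts manual changes,
                # which is one way to keep live state matching Git.
                "automated": {"prune": True, "selfHeal": True},
                "syncOptions": ["CreateNamespace=true"],
            },
        },
    }


# Hypothetical repository and path, for illustration only.
app = argocd_application(
    "rail-prod",
    "https://git.example.com/platform/clusters.git",
    "clusters/rail-prod",
)
print(json.dumps(app, indent=2))
```

With one such `Application` per cluster, adding a cluster to the fleet becomes a Git commit rather than a manual procedure.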

Impact: Transformational Velocity

The shift from manual management to a declarative, CNCF-aligned stack produced a dramatic increase in efficiency: provisioning a cluster now takes 30 minutes instead of up to a month.

Notable Metrics:

Metric               | Legacy On-Premises     | New Cloud-Native Platform
Cluster Provisioning | 1 Month                | 30 Minutes
Fleet Growth         | 14 clusters in 4 years | 10 clusters in 6 months
Day 2 Operations     | Manual & Challenging   | On-demand & Automated
Configuration Drift  | High                   | Zero-drift guarantee
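
The two quantitative rows can be turned into rough ratios. The short calculation below is only a back-of-the-envelope reading of the table: it assumes a 30-day month and treats "1 Month" as the typical legacy provisioning time, although the case study describes it as an upper bound.

```python
MINUTES_PER_DAY = 24 * 60

# Provisioning time: 1 month (assumed to be 30 days) versus 30 minutes.
legacy_minutes = 30 * MINUTES_PER_DAY
new_minutes = 30
provisioning_speedup = legacy_minutes / new_minutes

# Fleet growth expressed in clusters per month.
legacy_rate = 14 / (4 * 12)  # 14 clusters in 4 years
new_rate = 10 / 6            # 10 clusters in 6 months
growth_ratio = new_rate / legacy_rate

print(f"Provisioning speedup: about {provisioning_speedup:.0f}x")
print(f"Legacy growth: {legacy_rate:.2f} clusters/month")
print(f"New growth: {new_rate:.2f} clusters/month")
print(f"Growth rate ratio: about {growth_ratio:.1f}x")
```

Under those assumptions, provisioning is roughly 1,440 times faster and the fleet grows about 5.7 times faster per month.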

Every Kubernetes cluster in the fleet is now updated monthly. This “zero-drift” posture ensures that the production environment is always in sync with the desired state, providing the reliability required for national rail operations.

“We set out to build a platform that could compete with AKS or EKS, but running in our own datacenters. With Cluster API, ArgoCD, and the CNCF ecosystem, we achieved exactly that—and the metrics speak for themselves.”

Thomas Comtet, Head of Container and Cloud Native Platforms, SNCF

Alignment and Future Horizons

SNCF is now fully aligned with the Linux Foundation and Open Infrastructure Foundation ecosystems. By choosing projects with transparent governance, they have secured their long-term technological sovereignty.

What’s Next?

The journey doesn’t end here. SNCF continues to expand its on-premises Kubernetes fleet and to explore what comes next for the platform.

SNCF has proven that with the right CNCF building blocks, the “managed cloud experience” isn’t a location—it’s an operational standard that can be achieved anywhere.