KServe becomes a CNCF incubating project

Posted on November 11, 2025 by KServe Project Maintainers

CNCF projects highlighted in this post

The CNCF Technical Oversight Committee (TOC) has voted to accept KServe as a CNCF incubating project. KServe joins a growing ecosystem of technologies tackling real-world challenges at the edge of cloud native infrastructure.

What is KServe?

KServe is a standardized distributed generative and predictive AI inference platform for scalable, multi-framework deployment on Kubernetes. It is designed as a single, unified platform for both generative and predictive AI inference on Kubernetes. While simple enough for quick deployments, KServe is powerful enough to handle enterprise-scale AI workloads with advanced features.

Key Milestones and Ecosystem Growth

KServe originated in 2019 as a collaborative effort by Google, IBM, Bloomberg, NVIDIA, and Seldon under the Kubeflow project. It was later donated to the LF AI & Data Foundation in February 2022. In September 2022, the project rebranded from KFServing to the standalone KServe, graduating from Kubeflow. KServe then moved to CNCF as an incubator in September 2025.

KServe has demonstrated consistent and growing adoption across diverse industries and geographies, with production deployments ranging from large-scale multi-cloud enterprise platforms to specialized internal AI infrastructure. The project is used by major organizations, including Bloomberg, Red Hat, Cloudera, CyberAgent, Nutanix, SAP, NVIDIA, and others, spanning sectors such as enterprise software, cloud infrastructure, online media, gaming, and financial services. Deployments support both generative and predictive AI workloads at scale. Use cases span multi-cloud enterprise AI platforms, large GPU clusters, internal developer platforms, and hyperscaler-agnostic services, serving thousands of tenants in some cases.

Integrations Across the Cloud Native Landscape

KServe connects seamlessly with many CNCF projects, including:

Kubernetes: KServe is fundamentally a platform for scalable, multi-framework deployment on Kubernetes. It simplifies ML model deployment using a Kubernetes Custom Resource Definition (CRD).
Envoy: KServe v0.15 introduced initial support for the Envoy AI Gateway, a CNCF open source project built on top of Envoy.
Kubeflow: KServe originated under the Kubeflow project in 2019 through a collaboration between Google, IBM, Bloomberg, NVIDIA, and Seldon. Although it graduated from Kubeflow and became the standalone KServe project in September 2022, it remains connected to the ecosystem.
vLLM: The vLLM backend has been significantly enhanced in KServe v0.15 to better serve generative AI models.
llm-d: KServe provides the integration with llm-d to support disaggregated serving, pre-fix caching, intelligent scheduling, variant autoscaling, and more through the new LLMInferenceService CRD.
LMCache: KServe integrates the LMCache library for Distributed Key-Value (KV) Cache.
Kubernetes Gateway API: KServe supports the Kubernetes Gateway API, alongside its integration with Envoy Gateway.
Knative: KServe supports serverless deployment through its Knative integration to provide autoscaling based on request volume and supports scale down to and from zero as well as canary rollout.
Istio: KServe integrates with Istio to be able to leverage service mesh functionalities.
Wide range of serving runtimes: KServe is a multi-framework platform, providing support for popular predictive AI frameworks, including TensorFlow, PyTorch, scikit-learn, XGBoost, ONNX, and more.

Technical Components

KServe: The main Kubernetes controller that reconciles KServe CRDs and other Kubernetes objects.
ModelMesh: A mature, general-purpose model serving the management and routing layer, designed for high-scale, high-density, and frequently changing model use cases.
Open Inference Protocol: Defines a standard protocol for performing machine learning model inference across serving runtimes for different ML frameworks.

Community Highlights

4.6k+ GitHub Stars
2400+ pull requests
2.1k issues
300+ contributors
40+ Releases
19 maintainers
30+ company adopters varying from vendors to end users

Maintainer Perspective

“Our journey to bring the KServe project to CNCF incubation is a testament to the community’s dedication to model serving on Kubernetes. We started with a vision to create an open and standardized inference platform, and now, with the rapid proliferation in generative AI adoption, that vision is more crucial than ever. Uniquely positioned to address the complex challenges of serving large language models (LLMs), KServe provides the community with a robust, scalable, and cost-effective solution. Its move to the CNCF will provide the project with a vendor-neutral home and a clear path for collaboration as we continue to build the future of AI inference together.”

—Dan Sun, KServe Co-founder and Engineering Team Lead, Cloud Native Compute Services – AI Inference Engineering at Bloomberg

“This marks an important milestone for the KServe community and a significant step forward in cloud-native model serving. It’s inspiring to see KServe adopted by so many organizations and to witness the ongoing collaboration across AI and cloud-native projects. We’re excited to continue building on this momentum and deepen our collaboration within the broader cloud-native ecosystem. A big thank-you to everyone who contributed to this accomplishment!”

—Yuan Tang, KServe Project Lead, Senior Principal Software Engineer at Red Hat

From the TOC

“KServe stands out as a project that truly embodies the cloud native spirit—technically strong, community-driven, and deeply collaborative. Throughout the incubation due diligence process, I was struck by just how mature the architecture, governance, and adoption patterns already were. The maintainers were consistently responsive and open to feedback—that kind of trust and engagement is rare and invaluable.

On the technical front, KServe’s rich integration with Envoy, Knative, and the Gateway API anchors it powerfully within the CNCF ecosystem. Its support for multi-node inference, autoscaling (including scale-to-zero), inference pipelines, routing, and model explainability demonstrates that this project is built for enterprise-scale AI workloads. The community’s welcoming nature has made it easy for new contributors and adopters to get involved, which speaks volumes about its health and inclusiveness.

It’s also been inspiring to see a growing number of end users—from startups to major enterprises—actively adopting KServe and contributing back to its evolution. I’m honored to welcome KServe into incubation and excited to see how this vibrant community continues to shape the future of AI inference in cloud native environments.”

— Faseela K, CNCF TOC Sponsor

“The rising complexity of modern AI workloads drives an urgent need for robust, standardized model serving platforms on Kubernetes. KServe’s acceptance into the CNCF as an Incubating project highlights this demand. Its focus on scalability, particularly multi-node inference for large language models, is key to providing efficient serving and deployment solutions for cloud native AI infrastructure. We look forward to this milestone catalyzing further innovations within the CNCF ecosystem, advancing how cloud native technologies empower intelligent workloads across industries.”

— Kevin Wang, TOC Sponsor

Looking Ahead

KServe will continuously improve its existing features for predictive and generative inference. To meet the increasing demand for GenAI applications, KServe’s goal is to evolve to a fully abstracted, elastic inference platform where users solely focus on models and pre/post-processing while KServe handles the orchestration, scaling, resource management, and deployment.

As a CNCF-hosted project, KServe is part of a neutral foundation aligned with its technical interests, as well as the larger Linux Foundation, which provides governance, marketing support, and community outreach. KServe joins incubating technologies Backstage, Buildpacks, cert-manager, Chaos Mesh, CloudEvents, Container Network Interface (CNI), Contour, Cortex, CubeFS, Dapr, Dragonfly, Emissary-Ingress, Falco, gRPC, in-toto, Keptn, Keycloak, Knative, KubeEdge, Kubeflow, KubeVela, KubeVirt, Kyverno, Litmus, Longhorn, NATS, Notary, OpenFeature, OpenKruise, OpenMetrics, OpenTelemetry, Operator Framework, Thanos, and Volcano. For more information on maturity requirements for each level, please visit the CNCF Graduation Criteria.

Mumbai, India