Building and running modern applications begins with selecting a Kubernetes distribution as a baseline. Once a platform team has chosen its orchestration layer, one of the next architectural decisions is where that cluster will run: containers can be deployed directly on bare metal servers or on virtual machines. This article examines the characteristics, tradeoffs, and community learnings around deploying containers on bare metal compared to virtual machines.
Historically, running workloads in containers on bare metal appealed to organizations that prioritized maximum performance and minimal infrastructure overhead. By bypassing the hypervisor layer, containers could directly access compute and storage resources.
However, advancements in hypervisor technology have significantly improved the performance and efficiency of virtualized environments, making containers on virtual machines (VMs) viable for production workloads with added operational benefits and flexibility.
As IT requirements have grown in recent years, platform teams now face expanded responsibilities, including enforcing stricter security policies, reducing fault domains for higher application availability, supporting multiple versions of conformant Kubernetes, and meeting tighter service-level agreements (SLAs). These are themes frequently discussed in CNCF TAG Runtime and TAG Security, particularly around multi-tenancy models, workload isolation, and lifecycle management.
These modern demands and technological enhancements have renewed community discussions about where clusters should run and how platform teams can meet SLA, security, and multi-version requirements. IT practitioners should consider the following factors when making this decision:
Performance
Historically, bare metal offered a performance edge. Direct hardware access reduced latency and overhead, giving it an advantage in CPU- or GPU-intensive workloads. Recent benchmark studies indicate that the historical performance gap between bare metal and virtualized environments is now negligible. According to MLPerf benchmark tests, containers running on VM platforms can retain up to 99% of bare-metal performance for AI/ML workloads using vGPU. In CNCF projects focused on AI/ML—such as KServe, Kubeflow, and Volcano—platform teams commonly operate across both bare-metal and virtualized environments depending on workload type and scheduling requirements.
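For illustration, the sketch below shows how a GPU-intensive workload requests an accelerator; assuming the NVIDIA device plugin is installed, the pod spec looks the same whether the node is bare metal or a VM backed by vGPU, which is why the remaining benchmark gap comes down to hypervisor and device-passthrough overhead. The pod name, namespace, and image are hypothetical.

```yaml
# Hedged sketch: a training pod requesting one GPU via the extended resource
# exposed by the NVIDIA device plugin (nvidia.com/gpu). The same manifest
# schedules onto a bare-metal GPU node or a VM node backed by vGPU.
# Pod name, namespace, and image are hypothetical.
apiVersion: v1
kind: Pod
metadata:
  name: gpu-training-job
  namespace: ml-team
spec:
  restartPolicy: Never
  containers:
  - name: trainer
    image: registry.example.com/ml/trainer:latest   # hypothetical image
    resources:
      limits:
        nvidia.com/gpu: 1    # requested the same way on either platform
```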
Multiple Conformant Kubernetes Versions
Bare metal environments typically support only one Kubernetes version per host. In contrast, containers on VMs allow multiple Kubernetes versions per host, improving host utilization, enabling efficient capacity planning, and supporting more flexible upgrade paths. Operational benefits include the ability to update the Kubernetes version for one application cluster without having to update every other application cluster sharing the same physical hosts.
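As one possible illustration (the article does not prescribe a specific tool), a Cluster API topology lets a platform team pin each workload cluster to its own conformant Kubernetes version on shared virtualized infrastructure and upgrade them independently. The ClusterClass name and cluster names below are hypothetical.

```yaml
# Hedged sketch: two Cluster API workload clusters on the same virtualized
# infrastructure, each pinned to a different conformant Kubernetes version.
# The ClusterClass "shared-vm-class" and the cluster names are hypothetical.
apiVersion: cluster.x-k8s.io/v1beta1
kind: Cluster
metadata:
  name: team-a
spec:
  topology:
    class: shared-vm-class
    version: v1.29.6        # team A stays on an older minor for now
    controlPlane:
      replicas: 3
---
apiVersion: cluster.x-k8s.io/v1beta1
kind: Cluster
metadata:
  name: team-b
spec:
  topology:
    class: shared-vm-class
    version: v1.31.2        # team B upgrades on its own schedule
    controlPlane:
      replicas: 3
```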
Security and Isolation
Namespaces are not designed to serve as security boundaries. On bare metal, all containers within a Kubernetes cluster, even across different namespaces, ultimately share the same host kernel, so a compromise in one container can potentially affect others through the shared underlying operating system. Stronger isolation is crucial for security (preventing lateral threat movement), compliance (meeting regulations), and multi-tenancy (safely hosting diverse workloads and tenants). VM-based environments provide enhanced isolation because each virtual machine operates with its own kernel; running containers on VMs strengthens workload isolation and reduces the cross-tenant blast radius. This topic is a recurring focus in CNCF TAG Security discussions, especially as organizations adopt multi-tenant platform engineering models.
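A quick way to see the shared-kernel issue on bare metal is to run the same one-shot pod in two namespaces and compare the reported kernel release; assuming a standard runtime such as containerd with runc, both pods print the host's kernel version. Namespace and pod names below are hypothetical.

```yaml
# Hedged sketch: two one-shot pods in different namespaces report the same
# host kernel release, illustrating that namespaces are not a kernel-level
# isolation boundary. Namespaces and pod names are hypothetical.
apiVersion: v1
kind: Pod
metadata:
  name: kernel-check
  namespace: tenant-a
spec:
  restartPolicy: Never
  containers:
  - name: check
    image: busybox:1.36
    command: ["uname", "-r"]   # prints the shared host kernel version
---
apiVersion: v1
kind: Pod
metadata:
  name: kernel-check
  namespace: tenant-b
spec:
  restartPolicy: Never
  containers:
  - name: check
    image: busybox:1.36
    command: ["uname", "-r"]   # same output as the pod in tenant-a
```

Running kubectl logs for each pod returns the same kernel string on a bare-metal node; in a VM-per-cluster or VM-per-tenant layout, each tenant instead sees its own guest kernel.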
Resource Guarantees and SLAs
In bare-metal configurations, namespaces enforce “soft” resource limits: CPU and memory are assigned to containers, but their availability is not guaranteed when other workloads on the same host demand those resources. Containers in VM-based deployments enforce “hard” resource limits at the hypervisor level, with assigned resources reserved and unavailable to other workloads regardless of system demand, which also addresses “noisy neighbour” problems. This model offers more predictable scheduling and makes it easier for platform teams to meet internal SLAs.
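As a minimal sketch of the “soft” model, the Burstable pod below is scheduled on its requests only; the headroom up to its limits is shared node capacity that may not be there when neighbouring workloads get busy. Pod, namespace, and image names are hypothetical.

```yaml
# Hedged sketch: a Burstable pod on a bare-metal node. Only 'requests' are
# reserved at scheduling time; the burst headroom up to 'limits' is shared
# with every other pod on the node, so it can vanish under contention.
# Pod, namespace, and image names are hypothetical.
apiVersion: v1
kind: Pod
metadata:
  name: analytics-worker
  namespace: tenant-a
spec:
  containers:
  - name: worker
    image: registry.example.com/analytics:latest   # hypothetical image
    resources:
      requests:
        cpu: "2"          # counted by the scheduler
        memory: 4Gi
      limits:
        cpu: "4"          # soft headroom, contended with neighbours
        memory: 8Gi
```

By contrast, when an application cluster's nodes are VMs, the hypervisor can reserve the VM's vCPU and memory up front, so the capacity backing those limits is not shared with other tenants on the host.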
Choosing the Right Architecture
Modern hypervisors bring isolation, version flexibility, and near-native performance, which has led many organizations to standardize on VM-based Kubernetes for multi-tenant platforms. Bare metal deployments will continue to have a place for highly specialized or latency-sensitive use cases.
As infrastructure strategies evolve, platform teams are evaluating which combination of virtualized, bare-metal, and GPU-rich environments aligns best with their operational, performance, and governance requirements.
Leading public cloud providers, which host billions of containers globally for thousands of customers, have made a deliberate choice to run their managed Kubernetes services on VM-based infrastructure rather than bare metal. This decision reflects a balance between performance parity for most workloads and the ability to meet strict isolation, security, and SLA requirements while supporting multiple Kubernetes versions across tenants at scale.
Within the CNCF ecosystem, we see a wide variety of architectures: edge deployments that rely heavily on bare metal, AI/ML platforms using GPU nodes on either model, and enterprise platforms using VMs to simplify lifecycle management. This diversity reflects CNCF’s stance: Kubernetes runs everywhere, and architectural choices depend on workload and governance needs.
As business requirements continue to evolve, the question of which model best fits your workloads and operational goals becomes increasingly relevant for architects and practitioners. It ultimately comes down to what matters most for your organization: absolute performance, or stronger isolation, operational flexibility, and predictable SLAs?