Blog

Featured
When Kubernetes restarts your pod — And when it doesn’t
Project Maintainer Post When Kubernetes restarts your pod — And when it doesn’t
A production internals guide verified against Kubernetes 1.35 GA. Companion repository: github.com/opscart/k8s-pod-restart-mechanics. The terminology problem: Engineers say “the pod restarted” when they mean four different things. Getting this wrong leads to flawed runbooks and bad on-call decisions....
March 17, 2026 | Shamsher Khan, Project Maintainer


Solving secret sprawl in multi-account Kubernetes with External Secrets Operator
Member Post Solving secret sprawl in multi-account Kubernetes with External Secrets Operator
Infrastructure provisioning in Kubernetes has become increasingly automated, but secret management often remains a challenge as environments grow. Organizations commonly separate development, staging, and production workloads across clusters, namespaces, or cloud accounts to improve security and...
June 9, 2026 | Viktoria Bisova, DevOps Engineer, Itigix

Breaking free of a single datacenter: Practical geo-distributed AI operations with the k0smos platforms 
Member Post Breaking free of a single datacenter: Practical geo-distributed AI operations with the k0smos platforms 
Breaking the single datacenter assumption: Modern AI architectures are built on the assumption of centralized, homogeneous data centers. In reality, infrastructure is messy. For most organizations, compute resources are fragmented across private clouds, research environments, and...
June 8, 2026 | Prithvi Raj (Mirantis), Alexander Acker (Logsight.ai), and Soeren Becker (Logsight.ai)

Benchmarking KubeVirt performance with virtbench
Member Post Benchmarking KubeVirt performance with virtbench
Organizations migrating VM estates from traditional hypervisors to KubeVirt often discover that many Kubernetes observability tools were originally designed around container workloads rather than VM-centric operational metrics. While KubeVirt schedules VMs as pods, the performance variables...
June 8, 2026 | Bob Glithero, Senior Technical Product Marketing Manager, Portworx by Everpure

Securing CI/CD for an open source project: Controlling who runs what
Member Post Securing CI/CD for an open source project: Controlling who runs what
Part one: The last twelve months have been rough on the open source supply chain. Axios was compromised on npm and shipped a remote access trojan inside otherwise normal-looking releases. LiteLLM’s PyPI package was hijacked to...
June 4, 2026 | André Martins, Cilium maintainer and Software Engineer, Isovalent at Cisco, and Feroz Salam, Cilium Security Team and Security Engineer, Isovalent at Cisco

Dynamic configuration for cloud native Swift services
Member Post Dynamic configuration for cloud native Swift services
Modern Swift services increasingly run alongside the same cloud native infrastructure stacks that power much of today’s Kubernetes ecosystem — including ConfigMaps, containerized workloads, declarative deployments, and service lifecycle management. Projects such as Prometheus and OpenTelemetry...
June 1, 2026 | Joe Heck, Swift Documentation Workgroup Member, Apple

Zero-Downtime migration from ingress NGINX to Envoy Gateway
Member Post Zero-Downtime migration from ingress NGINX to Envoy Gateway
Teams running Ingress NGINX in production are increasingly evaluating migration paths as Kubernetes networking evolves toward Gateway API. For many organizations, the challenge is not just selecting a Gateway API implementation, but designing a migration strategy...
May 25, 2026 | Andrew Katsikas, Pelotech

Introducing Prempti: Policy and visibility for AI coding agents
Member Post Introducing Prempti: Policy and visibility for AI coding agents
AI coding agents have become a real part of the developer workflow. Tools like Claude Code sit in your terminal, read your files, run shell commands, make network requests, and write code, all on your behalf....
May 20, 2026 | Leonardo Grasso, Falco Maintainer

Building a cloud native platform from the ground up with Kairos, k0rdent, and bindy
Member Post Building a cloud native platform from the ground up with Kairos, k0rdent, and bindy
As we shared in our earlier post on FluxCD, RBC Capital Markets has been on a deliberate journey to modernize our Kubernetes platform. GitOps with FluxCD gave us a solid deployment foundation. But as our platform grew,...
May 13, 2026 | Erick Bourgeois, Director & Head of Kubernetes Platform Engineering, RBC Capital Markets

How to get engineering time back from Kubernetes upgrades
Member Post How to get engineering time back from Kubernetes upgrades
Kubernetes powers your products, but with that power and flexibility comes organizational challenges around managing complexity and maintenance. It can be tough for an organization to keep up with the speed of open source, especially at...
May 11, 2026 | Munib Ali, Director of Engineering, SRE, Fairwinds

AI sandboxing is having its Kubernetes moment
Member Post AI sandboxing is having its Kubernetes moment
Recently, Anthropic announced that its new model, Mythos, had autonomously found and exploited zero-day vulnerabilities in every major operating system and web browser – including a 27-year-old bug that had survived decades of human review and...
April 30, 2026 | Jed Salazar, Field CTO, Edera

From Ingress NGINX to Higress: migrating 60+ resources in 30 minutes with AI
Member Post From Ingress NGINX to Higress: migrating 60+ resources in 30 minutes with AI
With the official retirement of Ingress NGINX that took place in March 2026, enterprise platform teams are facing an urgent security and compliance mandate. Remaining on a retired controller leaves critical infrastructure vulnerable to unpatched security...
April 23, 2026 | Tianyi Zhang, Alibaba

Auto-diagnosing Kubernetes alerts with HolmesGPT and CNCF tools
Member Post Auto-diagnosing Kubernetes alerts with HolmesGPT and CNCF tools
What a two-person SRE team learned building an AI investigation pipeline. Spoiler: the runbooks mattered more than the model. Why we built this: At STCLab, our SRE team supports multiple Amazon EKS clusters running high-traffic production...
April 21, 2026 | Grace Park and Ihyeok Song, DevOps Engineer, STCLab SRE Team

GitOps policy-as-code: Securing Kubernetes with Argo CD and Kyverno
Member Post GitOps policy-as-code: Securing Kubernetes with Argo CD and Kyverno
A hands-on guide to deploying Kyverno with Argo CD and enforcing custom policies. As Kubernetes environments develop, GitOps with Argo CD has become the standard for declarative, self-healing infrastructure. Yet without guardrails for your deployments, misconfigured,...
April 2, 2026 | Ivan Roussev, Igtix

LLMs on Kubernetes Part 1: Understanding the threat model
Member Post LLMs on Kubernetes Part 1: Understanding the threat model
Let’s say you’ve got an LLM running on Kubernetes. Pods are healthy, logs are clean, users are chatting. Everything looks fine. But here’s the thing: Kubernetes is great at scheduling workloads and keeping them isolated. It...
March 30, 2026 | Nigel Douglas, CloudSmith

Announcing a Kotlin Multiplatform API and SDK for OpenTelemetry
Member Post Announcing a Kotlin Multiplatform API and SDK for OpenTelemetry
OpenTelemetry has become the de facto standard for collecting and exporting telemetry data across cloud native systems. Its success has been driven by strong community collaboration, a clear specification, and a growing ecosystem of language-specific SDKs...
March 24, 2026 | Jamie Lynch, Senior Software Engineer, Embrace (CNCF member company)

Understanding Kubernetes metrics: Best practices for effective monitoring
Member Post Understanding Kubernetes metrics: Best practices for effective monitoring
Kubernetes metrics show cluster activity. You need them to manage Kubernetes clusters, nodes, and applications. Without them, it is harder to find problems and improve performance. This post will explain what Kubernetes metrics are,...
March 18, 2026 | Sam Suthar, Middleware

Registry Mirror Authentication with Kubernetes Secrets
Member Post Registry Mirror Authentication with Kubernetes Secrets
Part II: A Platform Integration Example. In Part I, we explored the architecture of the CRI-O credential provider and walked through a manual setup. In this part, we’ll see how platforms like OpenShift and its upstream...
March 16, 2026 | Sascha Grunert, Red Hat

Making etcd incidents easier to debug in production Kubernetes
Member Post Making etcd incidents easier to debug in production Kubernetes
Diagnosing and Recovering etcd: Practical tools for Kubernetes Operators. When Kubernetes clusters experience serious issues, the symptoms are often vague but the impact is immediate. Control plane requests slow down. API calls begin to time out....
March 12, 2026 | Natalie Fisher and Benjamin Wang, Broadcom

Registry mirror authentication with Kubernetes secrets
Member Post Registry mirror authentication with Kubernetes secrets
Part I: Architecture and Implementation. In production Kubernetes clusters, pulling container images from private registries happens thousands of times per day. Kubernetes distributions from major cloud vendors provide credential providers for their respective registries like AWS...
March 9, 2026 | Sascha Grunert, Red Hat

The great migration: Why every AI platform is converging on Kubernetes
Member Post The great migration: Why every AI platform is converging on Kubernetes
When Kubernetes launched a decade ago, its promise was clear: make deploying microservices as simple as running a container. Fast forward to 2026, and Kubernetes is no longer “just” for stateless web services. In the CNCF...
March 5, 2026 | Vara Bonthu, Amazon Web Services Inc.