Guest post by Tamao Nakahara, VP of Developer Experience, Weaveworks 

The Benefits of Stability and Reliability 

If you’ve come to this blog post, you’re probably looking for ways to trust that everything will go well when you make a change to your app or clusters. Maybe you’re hoping that if things do go wrong, you can quickly find the info you need to troubleshoot and minimize downtime. Maybe you manage or support teams that want to spend more time innovating and making a difference to the business instead of fixing problems (and, worse, sometimes on weekends!).

Kubernetes and many technologies in the Kubernetes ecosystem already help many enterprise companies solve these challenges today. It’s inspiring to hear from established and risk-averse companies about how they are taking advantage of many CNCF tools. The benefits of Kubernetes aren’t only for start-ups or early adopters. Financial institutions, telecoms, online retailers, and many more types of companies have become advanced users of Kubernetes, and especially of its declarative approaches and capabilities.

The Deets on Declarative

The companies that are now enjoying greater reliability and stability have made the transition to declarative. A declarative approach is one in which you have a file (e.g., a YAML file) that declares how you want the cluster to be configured, rather than a sequence of commands that you run by hand. If this technique is not yet familiar to you, we hope that this post will provide some of “what you need to know” and make it clear why this is the new status quo in deployment methodologies. We hope that you can join the companies that have benefited from this natural evolution of Kubernetes and the powerful CNCF tools that work together to make it possible. For this particular approach, we will cover these open source CNCF projects: Kubernetes (with Kustomize), Helm, and Flux (plus a little bit of Prometheus).

So what does this all mean for you?

It means that you have three technologies that are evolving and being designed to work together. If a bad actor or an error changes the cluster, for example through a manual kubectl command, this system of mature, well-designed, and well-integrated tools works to return the cluster to the state that you declared in your config file. The result is a stable, dependable, and secure infrastructure that “self-heals” so that you and your team aren’t working late to put out major fires manually. Moreover, because this approach relies on version control systems such as Git, you can check the commit history to see what the declared state is and when it changed, which helps you track down and address an unexpected change quickly.

Small Steps

If you or your team are used to a more imperative approach, here are some concepts, terminology, and references to know as you take small steps toward working declaratively. For instance, an “object config file” is the file that defines the config for a Kubernetes object. These files are stored in a repo that, because of the declarative approach that we described, acts as the single source of truth. Kubernetes, Flux, and Helm work together to “check in” on this config file and to ensure that the cluster reflects the desired state. If it does not (because of manual and unapproved changes), the system will change the state back to what is defined in the object config file. That’s the self-healing in action.
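To make the “check in” idea concrete, here is a minimal sketch of a reconcile step in Python. It is an illustration only and not how Kubernetes or Flux are implemented: the state is modeled as a plain dictionary, and the names diff, reconcile, desired_state, and live_state are hypothetical ones that we made up for this example.

```python
from copy import deepcopy


def diff(desired: dict, actual: dict) -> dict:
    """Return the declared fields whose live value differs from the desired one."""
    return {key: value for key, value in desired.items() if actual.get(key) != value}


def reconcile(desired: dict, actual: dict) -> dict:
    """Return a copy of the live state with every drifted field restored.

    Fields that exist only in the live state are left alone in this sketch.
    """
    healed = deepcopy(actual)
    healed.update(deepcopy(diff(desired, actual)))
    return healed


# Desired state, standing in for the object config file stored in Git.
desired_state = {"image": "example/app:1.4.2", "replicas": 3}

# Live state after a manual, unapproved change scaled the app down.
live_state = {"image": "example/app:1.4.2", "replicas": 1}

drift = diff(desired_state, live_state)
healed_state = reconcile(desired_state, live_state)

print("drift:", drift)
print("healed:", healed_state)
```

Running the reconcile step again on the healed state changes nothing, which is the property that lets a real system repeat this check continuously without side effects.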

As Kubernetes co-creator Brendan Burns has noted, this trend is increasingly being seen as a natural evolution of Kubernetes itself. The CNCF project Flux is built to work seamlessly with these Kubernetes capabilities to provide continuous and progressive delivery based on this declarative strategy. You’ve probably heard the popular term for these capabilities: GitOps.

What is Continuous Delivery? In the case of Flux, Continuous Delivery is a deployment mechanism that works with your existing CI tools (e.g., Jenkins), so you don’t need to replace them. Flux does its part in the declarative process that we’ve explained by keeping Kubernetes clusters in sync with config sources. When there is new code to deploy, Flux also automates those updates to the config. Importantly, Flux is built from the ground up to use the Kubernetes controller-runtime and Kubernetes’ API extension system, so that you know that Flux and Kubernetes can work together reliably.

What is Progressive Delivery? In the case of Flux, this refers to its ability to provide canary and blue/green deployments using metrics from Prometheus (another established CNCF project). Flux does this through its subproject, Flagger. Flagger has successful enterprise users today, and it shows how the benefits of declarative approaches build on one another: from Kubernetes’ declarative capabilities, which rely on the config file that declares the state of the cluster, to Helm’s concept of packaging and releasing applications to run on Kubernetes, to Flux’s automated updates to those configs. Flagger follows these approaches and automates how traffic shifts from an old deployment to a new one. It takes in Prometheus metrics and automatically decides how to manage its canary or blue/green deployments based on those metrics, so that a bad release can be caught and rolled back before it causes downtime. (Flagger takes metrics from other sources as well, such as Datadog, New Relic, etc. See the references below.)
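The metric-driven traffic shifting described above can be sketched in a few lines of Python. This is a simplified illustration under our own assumptions, not Flagger’s API or its actual algorithm: the function run_canary, its parameters (threshold, step, max_weight), and the sample success rates are all hypothetical, and here a single failed check triggers a rollback.

```python
def run_canary(success_rates, threshold=0.99, step=10, max_weight=50):
    """Shift traffic to the canary in steps while each metric check passes.

    success_rates holds one request success rate per check (hypothetical data).
    Returns (outcome, weights), where weights lists the canary's traffic
    percentage after each check.
    """
    weight = 0
    weights = []
    for rate in success_rates:
        if rate < threshold:
            # The metric check failed: send all traffic back to the old version.
            weights.append(0)
            return "rolled back", weights
        weight = min(weight + step, max_weight)
        weights.append(weight)
        if weight >= max_weight:
            # The canary stayed healthy up to the maximum weight: promote it.
            return "promoted", weights
    return "in progress", weights


healthy = run_canary([0.999, 0.998, 0.997, 0.999, 0.999])
failing = run_canary([0.999, 0.95])

print(healthy)
print(failing)
```

In this sketch the healthy release climbs from 10% to 50% of traffic and is promoted, while the failing release is rolled back at its second check, before it ever receives more than 10% of traffic.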

If you’re using imperative tools like kubectl or the Helm CLI, you might consider a declarative approach for Helm. Helm is a package manager that helps you manage Kubernetes applications. A declarative approach to Helm means that you can have reliable unattended operations. It also adds a safety net when an imperative Helm CLI step fails, which can happen especially for dev teams that script their delivery flow, with or without Helm. While the Helm CLI lets you issue imperative commands at each step of the way, a declarative approach adds the assurance that your cluster can revert to the desired state automatically if something goes wrong. Flux helps make this happen by defining how users can declare the desired state of their Helm releases.

The reason that many users recommend using Helm with Flux is that Flux is designed to work natively with Helm’s SDK, making full use of the underlying Go library that the Helm CLI itself uses. Because of this, Flux integrates with Kubernetes’ core controller runtime to automate Helm releases and handle them consistently.

We know that this might be a big shift in thinking for you or your team. The many communities of professional contributors and maintainers within the CNCF have been dedicated to these projects so that your small steps in this direction can be less risky. 

We are here to help and are always in the CNCF Slack. DM me @tamao or talk to us in the #flux channel.

References

“object config file”: The Kubernetes docs have more info to help you get a firm understanding of Kubernetes object definitions and configurations.

Brendan Burns on “GitOps as an evolution of Kubernetes.”

Flagger docs for Datadog, New Relic, etc.