
Guest post originally published on DoiT International’s blog by Joshua Fox

Kubernetes makes it easy to orchestrate the pods that run your applications. But what about the clusters that the pods run on? These need much more hands-on care to configure correctly across the clouds, and in particular to clone a cluster’s configuration from one cloud to another.

In this article, I compare cluster models across Azure, Google, and Amazon and present a tool for cloning clusters within one cloud or cross-cloud.

Dolly, the first cloned mammal

With Kubernetes, managing your application is easy: The Kubernetes controller makes sure that your application has the resources needed, including memory, disk, load balancers, and redundant instances of your servers. Everything is standardized and automated, and you can switch infrastructure providers with hardly a change.

The Kubernetes abstraction insulates you from the infrastructure level, but the infrastructure is still there. A cluster’s nodes are just VMs, which are not standardized across cloud providers. For a newcomer, this is frustrating: why do we have to figure out the necessary resources and APIs, modeled differently on each cloud, when everything else about Kubernetes is so standard and automated?

Serverless container architectures like AWS Fargate, Google Cloud Run, and Azure Container Instances make it easier to deal with the infrastructure (at the cost of less control), but they do not give the cross-cloud standardization you expect from Kubernetes. If you are launching standard Kubernetes clusters, you need to understand the non-standard cluster infrastructure models.

This brings us to Cluster Cloner, an open-source tool that can copy clusters from cloud to cloud, or inside a single cloud. I will use it to compare the implementations of clusters in Google Kubernetes Engine, Azure Kubernetes Service, and AWS’s Elastic Kubernetes Service.

Meet Cluster Cloner

If you want to clone your cluster’s Kubernetes configuration, you can use Heptio Velero. The Kubernetes etcd makes this straightforward, as the full configuration is stored there. But before you do that, you need to copy the cluster’s virtual machines and the other resources on which Kubernetes runs. That is the job of Cluster Cloner, a command-line tool that I developed in Go, available both as a Docker image and as a standalone application.
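For the Kubernetes-object side that Velero handles, here is a minimal sketch, assuming Velero is already installed in both the source and target clusters and configured to share a backup storage location; the backup and namespace names are illustrative:

# On the source cluster: back up the Kubernetes objects for the workloads you care about
velero backup create app-backup --include-namespaces my-app

# On the target cluster, pointed at the same backup location: restore those objects
velero restore create --from-backup app-backup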

If you use Terraform or other infrastructure-as-code tools, copying clusters is not a problem: Just run the same code (with appropriate adjustments). But when a cluster has been created in other ways, you need a tool that can understand the cluster as modeled in one cloud, and translate it to another cloud’s model.

You can use Cluster Cloner in one of two ways: either by directly copying a cluster, or as a “dry run,” with an intermediate step that lets you tweak the process.

Option #1: Direct clone

To copy clusters, run clustercloner, specifying the input cloud and the “scope”. (The scope is the Google project or Azure Resource Group; it is not needed in AWS.)

Also, specify the input location. In AWS and Azure, that is the region. In Google, give the zone name for zonal clusters and the region name for multizonal regional clusters.

Likewise, specify the output cloud and, for Azure or GCP, the output scope.

Because every action can take many minutes (details below), creation is by default a dry run. Specify --nodryrun to actually create the clusters.

Cluster Cloner copies all the clusters in the given location, but you can restrict the copying by specifying --labelfilter with a comma-separated list of key-value pairs. In that case, all the specified labels must match for a cluster to be copied.

An example command line:

clustercloner --inputcloud GCP --inputscope my-google-project --inputlocation us-central1-c --outputcloud AWS --labelfilter key-a=value-a,key-b=value-b --nodryrun

Documentation about the command line is available through clustercloner --help. Documentation on developing, running the unit and integration tests, and building the binary and Docker image is in the README file.

Option #2: Customize the cloning

You can also customize the process. Cluster Cloner always outputs a description of the target cluster in JSON. So, do a dry run by omitting --nodryrun, and edit the output JSON as a specification of the cluster you want. Then run clustercloner again, providing the JSON filename as an argument for the --inputfile option. The clusters defined in the file will then be created.
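As a sketch of that two-step workflow, assuming the dry run’s JSON output can be captured into a file (the file name is illustrative):

# Step 1: dry run (the default, since --nodryrun is omitted); save the proposed target-cluster description
clustercloner --inputcloud GCP --inputscope my-google-project --inputlocation us-central1-c --outputcloud AWS > clusters.json

# Step 2: edit clusters.json as desired, then create the clusters it describes
clustercloner --inputfile clusters.json --nodryrun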

EKS, AKS, and GKE: Similarities & Differences

Cluster Cloner is built to bridge three very different models of cluster infrastructure. Next, let’s compare the basic elements that make up a cluster in Google, AWS, and Azure.

We will not, however, compare advanced features, since the three cloud providers are always racing to keep up with each other. Also, this article is based on Cluster Cloner, which is focused on transforming the compatible parts of the core models. Some aspects of the models are not copied because they are specific to the design of a single cloud, such as GKE’s master authorized networks and Alpha clusters.

Comparing the cluster models

Terminology

Let’s start with terminology, the easiest part of comparing the cluster models as expressed in their APIs.

In the three clouds, we have the following terms. Inside Cluster Cloner, GKE terminology is generally used, recognizing the origin of the Kubernetes project within Google.

Table showing terminology comparison between EKS, GKE and AKS

Cluster and node pool model

Kubernetes Versions

Regions

Zones

Machine types

Speeds

Cluster Cloner provides an excellent opportunity to measure speeds given that the same action is performed in each cloud. See the chart below.

Bar chart showing time for actions in the Cluster APIs

Waiting a few more minutes does not have to be a showstopper, but it does affect your ability to treat clusters as you are used to treating pods — as “cattle not pets,” to be deleted and recreated at will.

The ins and outs of EKS APIs

Cluster Cloner uses eksctl for creating and deleting clusters and node groups. Since eksctl is a command-line tool not designed for API use, Cluster Cloner wraps it at the command-line entry point. We use eksctl for creation because it is smart enough to generate all the underlying resources needed for a cluster. This includes a VPC, subnets across availability zones, and security groups. Just as importantly, eksctl can tear down these resources in the right order: first the EC2 VM instances and their network interfaces, and only then the subnets and the VPC.
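For reference, the wrapped eksctl invocations look roughly like the following when run by hand; the cluster name, node group name, instance type, and node count are illustrative, not necessarily what Cluster Cloner generates:

# Create a cluster and a node group; eksctl also provisions the VPC, subnets, and security groups
eksctl create cluster --name demo-cluster --region us-east-1 --nodegroup-name demo-nodes --node-type m5.large --nodes 3

# Delete the cluster; eksctl tears down the dependent resources it created, in dependency order
eksctl delete cluster --name demo-cluster --region us-east-1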

For reading the clusters, Cluster Cloner does not use eksctl, which cannot describe the full detailed model of clusters and node groups. This is surprising, given that eksctl is now AWS’s recommended tool for managing clusters. The limitation comes from the fact that in EKS, the infrastructure layer (EC2 with instances, subnets, and VPCs) is more clearly visible than in the other clouds, and less hidden by Kubernetes-related abstractions like nodes and node groups. Still, the information is available: Cluster Cloner uses the AWS SDK for Go to get all the data needed to characterize clusters and node groups.
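Cluster Cloner makes these calls through the AWS SDK for Go, but a rough command-line analogue of the data it reads is the following (the cluster and node group names are illustrative):

# List the clusters in a region, then describe one of them
aws eks list-clusters --region us-east-1
aws eks describe-cluster --region us-east-1 --name demo-cluster

# List and describe the node groups attached to that cluster
aws eks list-nodegroups --region us-east-1 --cluster-name demo-cluster
aws eks describe-nodegroup --region us-east-1 --cluster-name demo-cluster --nodegroup-name demo-nodes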

What lies ahead

The cluster model, in all three providers, is rich and provides configuration options that are not yet supported by Cluster Cloner.

The features list shows what Cluster Cloner knows how to copy, as well as aspects that are not yet implemented.

If you want a feature, feel free to open a GitHub issue or contribute a pull request.

Work with Joshua at DoiT International! Apply for Engineering openings on our careers site.