Beginner’s guide to Kuma service mesh

Posted on August 28, 2023

CNCF projects highlighted in this post

Guest post originally published on the InfraCloud blog by Sonali Srivastava

The concept of service mesh emerged as a response to the growing popularity of cloud native environments, microservices architecture, and Kubernetes. It has its roots in the three-tiered model of application architecture. While Kubernetes helped resolve deployment challenges, the communication between microservices remained a source of unreliability. Under a heavy load, the application could break if the traffic routing, load balancing, etc. were not optimized. This led to the growth of service mesh. With the existing service mesh hard to scale due to too many moving parts, configure and manage, Kong built a service mesh tool called Kuma.

In this blog post, we will be talking about the open source service mesh Kuma, its architecture, and its easy-to-implement policies like traffic control, metrics, circuit breaking, etc. We will also discuss how Kuma provides better observability of services.

What is Kuma?

Kuma is an open source service mesh for Kubernetes, VM, and bare metal environments. Along with horizontal scalability support, it also offers multi-zone, multi-cluster, and multi-cloud support. It is enterprise-ready and supports multiple individual meshes as shown in the image below that helps lower the operating costs of supporting the entire organization.

(Kuma Overview)

Kuma architecture

In this section, we are going to talk about high-level visualization and the components of a Kuma service mesh. It will help us to understand how to integrate the application’s services into the mesh. There are two main following components.

Data plane: To deal with network complexities and segment the business logic, each service is deployed with a sidecar proxy. The proxy handles the traffic of the mesh. Kuma is built on top of the Envoy proxy. A data plane proxy is configured for each replica of the service to handle failure issues and also to assign an identity to the proxy and correspondingly to the service.The data plane proxy (DPP) is composed of primarily two components:
- Dataplane: It is the entity that has the data plane configuration. It has the inbounds and outbounds defined along with the IP address that other DPP use to search each other. For Kubernetes-based infrastructure, the control plane automatically generates the Dataplane. For Universal setup, the user defines the Dataplane entity.
- kuma-dp: This is a binary that has spawned two subprocesses:
  - envoy: This binary invokes Envoy and abstracts certain complexities of a proxy that are not required to be dealt with.
  - core-dns: It resolves Kuma-specific DNS entries.
Control plane: With a production-grade deployment, there will be many data plane proxies to manage. You do not want to overburden your platform engineers with the manual tasks of updating policies, enabling monitoring and configurations on each data plane proxy. This is where the control plane comes into the picture. It helps with configuring the data plane proxies and managing the traffic going through the mesh.Kuma is the control plane and is shipped in a kuma-cp binary in a mesh. For the data plane proxies to receive configurations from the control plane, kuma-cp implements the Envoy xDS APIs. Kuma provides a CLI tool kumactl to interact with the control plane. Kuma maintains a data store based on the infrastructure model.

(Kuma Architecture)

Kuma supports both universal as well as Kubernetes-native infrastructure. Let us understand the architecture for each of the infrastructure modes.

Kubernetes mode: Kuma uses the Kubernetes API server as the data store to manage policies and configurations. In the kuma-cp, the sidecar injection is enabled to include service in the mesh.
Universal mode: Kuma service mesh can be enabled via the Kuma API. PostgreSQL serves as the data store for the control plane.

For any other infrastructure other than Kubernetes, refer to Kuma’s official documentation. Let us understand how Kuma works on a Kubernetes-native infrastructure.

How does Kuma Service Mesh work?

The kuma-dp has the sidecar proxy deployed with each service and its replica. It exists on the execution path of all the incoming and outgoing requests to the service. On the contrary, the kuma-cp configures the proxies but does not sit on the execution path of the services.

Flowchart showing Kuma Service Mesh workflow

(Working of Kuma)

The Kuma control plane is set up first using the kumactl CLI tool. Kuma supports both deployment topologies, standalone as well as multizone architecture or a hybrid of these. So based on the topology you have, you can use kumactl to create a control plane.

The control plane deployment has the kuma-cp executable running. Once the control plane is up and running, the kuma-injector can be enabled in all the Kubernetes resources by updating the namespaces. Once namespaces are updated and the application deployment starts, Kuma initiates the pod.

During each pod initialization, kuma-init sets up default transparent proxying to allow traffic only via proxy with predefined ports. This kuma-init container terminates on successful execution with Exit Code 0.

Kuma also provides a plugin called the Kuma CNI plugin that can be manually configured to enable proxying. We have used the default kuma-init in this blog for setup. Here is a snippet showing the successful execution of the kuma-init container for the carts service. You can simply describe the pod to view the init container details.

The application and kuma-sidecar containers come up once the pre-sidecar initialization is complete. The kuma-dp executable runs on each sidecar to the service. The kuma-dp initializes the envoy binary which connects the data plane with the control plane. The control plane then provides the real-time configurations and policies to the data plane.

It is the task of the data plane proxy to enforce them on the service-to-service communication in the network. With this, all the connectivity, security, and routing activities are easily handled by the proxy. To securely allow the outside traffic inside the mesh, Kuma provides the ability to enable a gateway for network traffic proxying. You can refer to the configuration from the Kuma Gateway.

How to implement a Service Mesh with Kuma?

Let us see how we can enable service mesh for a microservice application running on the Kubernetes cluster and understand the benefits of its features.

Pre-requisites

Kubernetes cluster: We are using Minikube
Microservice-based application: the sock shop

In order to ensure your application does not have any compatibility issues, you can refer to the Kuma requirements.

How to set up Kuma?

For setting up the control plane and the data plane, Kuma provides a demo application that you can make use of. Here is the overview of the deployment steps:

Select the deployment topology based on your infrastructure: standalone or multizone. We are using a standalone deployment topology.
Install kumactl. Please note when we worked on this blog, the latest version of Kuma was 2.2.x.
Install control plane using kumactl. When topology mode is not passed with kumactl, Kuma will start standalone mode by default. For multizone deployment, set –mode=global
Enable automatic sidecar injection by adding the label to the application’s pod or namespace as kuma.io/sidecar-injection: enabled to configure the data plane proxy. Apply the namespace changes and then apply all the other manifests in the namespace to kickstart initialization.

With this, the Kuma service mesh will be configured and any new resource when deployed in the cluster’s namespace will have the sidecar automatically injected. Let us see this with the help of the sock shop application.

Deploy The Sock Shop

Use the following commands to clone the repo:

git clone https://github.com/microservices-demo/microservices-demo

Enable automatic sidecar injection by adding the label to the namespace manifest.

cat deploy/kubernetes/manifests/00-sock-shop-ns.yaml
apiVersion: v1
kind: Namespace
metadata:
  name: sock-shop
  labels:
    kuma.io/sidecar-injection: enabled

Create a control plane for standalone deployment topology using kumactl.

kumactl install control-plane | kubectl apply -f -

Deploy the application by first creating the namespace.

kubectl create -f deploy/kubernetes/manifests/00-sock-shop-ns.yaml -f deploy/kubernetes/manifests

Verify deployment

$ kubectl get pods -A
NAMESPACE     NAME                                  READY   STATUS    RESTARTS      AGE
kube-system   coredns-565d847f94-57xkh              1/1     Running   1 (42m ago)   24h
kube-system   etcd-minikube                         1/1     Running   1 (42m ago)   24h
kube-system   kube-apiserver-minikube               1/1     Running   1 (42m ago)   24h
kube-system   kube-controller-manager-minikube      1/1     Running   1 (42m ago)   24h
kube-system   kube-proxy-j4rhb                      1/1     Running   1 (42m ago)   24h
kube-system   kube-scheduler-minikube               1/1     Running   1 (42m ago)   24h
kube-system   storage-provisioner                   1/1     Running   2 (41m ago)   24h
kuma-system   kuma-control-plane-67dd5d9496-d8jxw   1/1     Running   0               19m
sock-shop     carts-78d7c69cb8-d6872                2/2     Running   0               9m40s
sock-shop     carts-db-66c4569f54-pk5k9             2/2     Running   0               9m40s
sock-shop     catalogue-7dc9464f59-tdgkj            2/2     Running   0               9m40s
sock-shop     catalogue-db-669d5dbf48-m966d         2/2     Running   0               9m40s
sock-shop     front-end-7d89d49d6b-8kpsp            2/2     Running   0               9m40s
sock-shop     orders-6697b9d66d-htfqq               2/2     Running   0               9m40s
sock-shop     orders-db-7fd77d9556-swl5b            2/2     Running   0               9m40s
sock-shop     payment-ff86cd6f8-zxgxc               2/2     Running   0               9m40s
sock-shop     queue-master-747c9f9cf9-mr7cp         2/2     Running   0               9m40s
sock-shop     rabbitmq-6c7dfd98f6-5ptmp             3/3     Running   0               9m40s
sock-shop     session-db-6747f74f56-pr5mm           2/2     Running   0               9m40s
sock-shop     shipping-74586cc59d-td7cp             2/2     Running   0               9m40s
sock-shop     user-5b695f9cbd-tl8s5                 2/2     Running   0               9m40s
sock-shop     user-db-bcc86b99d-wr5c6               2/2     Running   0               9m40s

Hurray!! The application is deployed successfully. We can see here two new namespaces.

kuma-system: The Kuma control plane pod is running in this namespace.

sock-shop: The application’s pod has 2 or 3 containers in this namespace. Kuma creates a sidecar container in each pod after the automatic sidecar injection is enabled.

Verify applicationTo verify if the application is running, let us access the front-end service using the IP address provided. For minikube, we’ll use a minikube tunnel to enable access to the application. Get the URL using the following command.

$ minikube service list | grep front-end
| sock-shop   | front-end          |           80 | http://192.168.58.2:30001 |

Now navigate the URL provided: http://192.168.58.2:30001

Screenshot showing weavesocks front end page to verify application

Verify service mesh Kuma provides a GUI that helps in viewing the Kuma resources. To access the GUI, we need to first port-forward the API service with:

kubectl port-forward svc/kuma-control-plane -n kuma-system 5681:5681

And then navigate to http://127.0.0.1:5681/gui

If we click on Policies, we will see that Kuma has already enabled policies like CircuitBreaker, TrafficPermissions, TrafficRoute, Timeout, and Retry by default. We can view the data plane proxies for each service, its status, and the policies enabled. Currently, we see no default gateway is set up thus gateway count is 0. In the coming sections, we will explore some features of Kuma that help make the application more robust and also enable security at the entrance of the mesh using Kuma Gateway.

Kuma Gateway

For the services that need to receive traffic from the external network, Kuma Gateway routes those requests inside the mesh for the service to accept it via the proxy. In order to do that, Kuma sets up a pod dedicated to the gateway once configured. Kuma supports two types of gateway:

delegated: It allows the user to use any existing gateway like Kong.
builtin: It requires configuring the DPP to expose external listeners to drive traffic inside the mesh.

You can understand the comparison between delegated and builtin from the documentation.

We are going to create and configure the gateway with builtin type that listens for the traffic from the outside of mesh and forwards it to the sock-shop front-end. Here are the 3 things that we need to set up:

MeshGatewayInstance: Kuma comes with a built-in type MesGatewayInstance that manages and deploys the proxy to serve gateway traffic. Here is the configuration:

apiVersion: kuma.io/v1alpha1
kind: MeshGatewayInstance
metadata:
  name: sock-shop-gateway
  namespace: sock-shop
spec:
  replicas: 1
  serviceType: LoadBalancer
  tags:
    kuma.io/service: sock-shop_gateway

In this configuration, we have defined to setup an instance of the specified kind with 1 replica and load balancer type. With the instance configured, we need to enable listeners for the gateway.

MeshGateway: This helps us configure the listeners. Here is the configuration:

apiVersion: kuma.io/v1alpha1
kind: MeshGateway
mesh: default
metadata:
  name: sock-shop
spec:
  conf:
    listeners:
    - port: 80
      protocol: HTTP
  selectors:
  - match:
      kuma.io/service: sock-shop_gateway

In this configuration, we have defined the port and protocol for the gateway. Now we need to define the routes.

MeshGatewayRoute: This is used to define the destination route to route incoming traffic from the listener to other services.

apiVersion: kuma.io/v1alpha1
kind: MeshGatewayRoute
mesh: default
metadata:
  name: sock-shop
spec:
  conf:
  http:
    rules:
    - matches:
      - path:
          match: PREFIX
          value: /
      backends:
      - destination:
          kuma.io/service: front-end_sock-shop_svc_80
        weight: 1
  selectors:
  - match:
      kuma.io/service: sock-shop_gateway

Here we have defined the backend destination as the front-end service for the prefix match at /. Weight is the proportion of requests this backend will receive when a forwarding rule specifies multiple backends. We have set it to 1. Also, the listener is set to sock-shop_gateway to identify incoming requests.

We have kept all three above-defined configurations in the same file. Once this is applied, Kuma will launch a create and configure a builtin type gateway that will route the incoming requests on / to the front-end service.

Let us go ahead and apply the gateway configurations.

$ kubectl apply -f gateway.yaml
meshgateway.kuma.io/sock-shop created
meshgatewayroute.kuma.io/sock-shop created
meshgatewayinstance.kuma.io/sock-shop-gateway created

Verify if the pod is up:

$ kubectl get pods -A | grep gateway
sock-shop     sock-shop-gateway-84f7fb5687-phsvl    1/1     Running   0

As we can see, the pod is running. Let us get the IP of the gateway to send the external traffic to the service:

$ minikube service list | grep gateway
| sock-shop   | sock-shop-gateway  | 80/80        | http://192.168.49.2:30526 |

Finally, let’s verify in the browser:

Screenshot showing weavesocks front end page to verify in browser

The application is accessible via the gateway. This is how you can easily set up a gateway for your application using Kuma’s built-in gateway.

In the coming section, we will be discussing policies that can be enabled in Kuma. Kuma Gateway provides the policy support for some of the policies on the builtin gateway.

Policy features of Kuma

Kuma provides policies that are combined with the data plane configuration to generate the proxy configuration.

Mutual TLS

The communication across services is not encrypted by default in the Kuma service mesh. Mutual TLS policy helps to encrypt traffic for all the services in a mesh and assign an identity to every data plane. Kuma supports both builtin and provided CA backends.

Before enabling, let us check if plain text traffic which is not encrypted is allowed by accessing the front-end service from the catalogue service’s proxy servers using the following command.

kubectl exec $(kubectl get pod -l name=catalogue -n sock-shop -o jsonpath={.items..metadata.name}) -c kuma-sidecar -n sock-shop -- wget --spider -S http://front-end/
Connecting to front-end (10.102.50.169:80)
  HTTP/1.1 200 OK
  X-Powered-By: Express
  Accept-Ranges: bytes
  Cache-Control: public, max-age=0
  Last-Modified: Tue, 21 Mar 2017 11:31:47 GMT
  ETag: W/"21f0-15af0a320b8"
  Content-Type: text/html; charset=UTF-8
  Content-Length: 8688
  Date: Thu, 22 Jun 2023 11:59:52 GMT
  Connection: close
  remote file exists

When mTLS is enabled, all traffic is denied unless a TrafficPermission policy is configured to explicitly allow traffic across proxies with encryption. So we need to verify that the TrafficPermission policy is enabled. In the Kuma service mesh, this policy is enabled by default.

$ kumactl get traffic-permissions
MESH      NAME                AGE
default   allow-all-default   2h

Use the following yaml to enable mTLS with a built-in backend.

apiVersion: kuma.io/v1alpha1
kind: Mesh
metadata:
  name: default
spec:
  mtls:
    enabledBackend: ca-1
    backends:
      - name: ca-1
        type: builtin
        dpCert:
          rotation:
            expiration: 1d
        conf:
          caCert:
            RSAbits: 2048
            expiration: 10y

Kuma assigns a certificate to each data plane proxy. We have set the value as 1 day for the data plane cert expiration so that the cert gets rotated every day.

Apply this configuration:

$ kubectl apply -f mtls.yaml
mesh.kuma.io/default configured

Verify using the same command that we executed before enabling mTLS:

kubectl exec $(kubectl get pod -l name=catalogue -n sock-shop -o jsonpath={.items..metadata.name}) -c kuma-sidecar -n sock-shop -- wget --spider -S http://front-end/
Connecting to front-end (10.102.50.169:80)
wget: error getting response: Resource temporarily unavailable
command terminated with exit code 1

This is how with Kuma you can implement zero trust security in an automated way for all data plane proxies and secure all your services within the mesh.

Rate limiting

By default, there is no limit on the inbound traffic. Using this policy, a limit can be imposed on the service on how many requests are allowed in a specified time period and the response when the threshold is reached.

From Kuma’s doc, All HTTP/HTTP2 based requests are supported. So we need to ensure that the app supports the HTTP-based requests. You can set appProtocol as HTTP for the application deployment to allow HTTP requests. You can verify it from Kuma’s dashboard.

Screenshot showing verify Kuma dashboard

Use the following yaml to enable the rate limit from all services to the front-end service.

apiVersion: kuma.io/v1alpha1
kind: RateLimit
mesh: default
metadata:
  name: rate-limit-to-front-end
spec:
  sources:
    - match:
        kuma.io/service: '*'
  destinations:
    - match:
        kuma.io/service: 'front-end_sock-shop_svc_80'
  conf:
    http:
      requests: 1
      interval: 60s
      onRateLimit:
        status: 423
        headers:
          - key: "x-kuma-rate-limited"
            value: "true"
            append: true

We have added conf for HTTP-based requests to allow only 1 request to be processed by the front-end service in the duration of 60s intervals from all the services. Once the threshold is reached, the app will return the 423 status code and header as mentioned.

Before applying let us verify by making 2 requests in less than 60 secs from any of the service containers. We have logged in to the catalogue service container and sent an HTTP request to the front-end service.

$ kubectl -n sock-shop exec -it catalogue-7dc9464f59-tdgkj -- sh
/ $ wget http://front-end
Connecting to front-end (10.102.50.169:80)
wget: can't open 'index.html': Read-only file system
/ $ wget http://front-end
Connecting to front-end (10.102.50.169:80)
wget: can't open 'index.html': Read-only file system

Apply the configuration:

$ kubectl apply -f rate-limit.yaml 
ratelimit.kuma.io/rate-limit-to-front-end created

Verify with kumactl:

$ kumactl get rate-limits
MESH      NAME                      AGE
default   rate-limit-to-front-end   4m

Rate limit in action: Let us go and make 2 requests again from the catalogue container within 60 seconds.

If we look carefully at the above outputs, we can see that the first time the request was served successfully but the second time server returned an error with status code 423. This means that the rate limit is working as expected. This is how you can enable rate-limiting using Kuma.

Circuit Breaker

Instead of additional traffic, Kuma inspects existing live service traffic that is being exchanged between data plane proxies and marks unhealthy proxies based on detectors. Kuma makes use of detectors for deviation detection. Detectors include the conditions based on which Kuma decides when to open or close a circuit breaker.

In this policy, there are 5 types of detectors that can all be enabled at a time. Once any one of them is triggered because of deviations, no additional traffic can reach that service avoiding cascade failure. The duration is decided based on the value of baseEjectionTime which defaults to 30 seconds. The default circuit breaker policy is enabled by Kuma when you inject the sidecar.

You can view the existing circuit breakers using the following command:

$ kumactl get circuit-breakers -o json

For production deployments, it is advised to enable circuit breakers with detectors. We are going to enable a total error detector. Total errors help to identify the count of consecutive 5xx and locally generated errors. The default value is 5 for it.

Use the following yaml file to enable total errors and gateway errors detector with default values for the front-end service.

apiVersion: kuma.io/v1alpha1
kind: CircuitBreaker
mesh: default
metadata:
  name: circuit-breaker-example
spec:
  sources:
  - match:
      kuma.io/service: "*"
  destinations:
  - match:
      kuma.io/service: front-end_sock-shop_svc_80
  conf:
    interval: 5s
    baseEjectionTime: 30s
    maxEjectionPercent: 20
    splitExternalAndLocalErrors: false
    thresholds:
      maxConnections: 2
      maxPendingRequests: 2
      maxRequests: 2
      maxRetries: 2
    detectors:
      totalErrors:
        consecutive: 2

Once the detectors are enabled, any deviation will trigger the detectors. We have set the value for the maxConnections as 2 so if more than 2 parallel connection requests hit the frontend service, it will trip the circuit breaker. Also for the total errors, we have set the consecutive as 2. So any 2 consecutive 5xx errors will trip the circuit breaker. Once the circuit breaker is tripped, it marks the front-end DPP as unhealthy and will eject it for 30 seconds based on the baseEjectionTime value. Thus the service will be unavailable. Since the circuit breaker policy does not send additional traffic to check DPP health, it will make use of the live traffic. In every 5s interval it will check for the policy again. This is how Kuma avoids cascade failure. CircuitBreaker resources list can help you understand the details of each field.

Apply the configuration:

$ kubectl apply -f circuit-breaker.y
circuitbreaker.kuma.io/circuit-breaker-example created

Circuit breaker policy is successfully enabled with detectors in Kuma. Let us see the circuit breaker in action by tripping it. Here we have tripped the circuit breaker by sending 10 connection requests at a time to the service. Since the Kuma Gateway provides full support for the policy, let us hit the gateway IP.

$ seq 1 10 | xargs -n1 -P10 curl -I "http://192.168.49.2:30526"
HTTP/1.1 200 OK
x-powered-by: Express
accept-ranges: bytes
cache-control: public, max-age=0
last-modified: Tue, 21 Mar 2017 11:31:47 GMT
etag: W/"21f0-15af0a320b8"
content-type: text/html; charset=UTF-8
content-length: 8688
date: Wed, 28 Jun 2023 11:02:07 GMT
x-envoy-upstream-service-time: 1
server: Kuma Gateway

HTTP/1.1 200 OK
x-powered-by: Express
accept-ranges: bytes
cache-control: public, max-age=0
last-modified: Tue, 21 Mar 2017 11:31:47 GMT
etag: W/"21f0-15af0a320b8"
content-type: text/html; charset=UTF-8
content-length: 8688
date: Wed, 28 Jun 2023 11:02:07 GMT
x-envoy-upstream-service-time: 1
server: Kuma Gateway

HTTP/1.1 503 Service Unavailable
x-envoy-overloaded: true
content-length: 81
content-type: text/plain
vary: Accept-Encoding
date: Wed, 28 Jun 2023 11:02:07 GMT
server: Kuma Gateway

If we look at the output above, we can see first two requests were served and then we received 503 service-unavailable responses. This is how you can enable circuit breakers in Kuma and avoid cascade failure. The values of the errors can be adjusted as per your application’s requirements. You can refer to the policy field details of all the detectors, and their default values and can set accordingly.

Traffic Route

With the traffic route policy, you can enable L4 traffic routing using Kuma. This makes it easier to implement blue/green deployments and canary releases. This policy is enabled by default with a load balancer. The policy is applied to all the data planes in the mesh. You can view the policy manifest using the following command.

$ kumactl get traffic-routes -o json

In the default traffic route policy, Kuma uses a tag assigned to the service to apply the traffic route to all services. The routing is done in the round-robin way mentioned in the loadBalancer.

For L4 traffic split over two versions of the application, you can update the conf with weight and the destination for the matching source. Kuma also supports L7 traffic split, moderation, and rerouting. For more information on the fields to use, you can check the TrafficRoute resources list.

Fault Injection

Only setting up the guardrails is not enough unless we test the microservices against resilience. Fault injection allows us to mimic errors or faults in the infrastructure. This policy is available only for L7 HTTP traffic so we need to ensure that the destination for which we need to inject fault should have http, http2, or grpc protocol support. You can check the rate-limit section to ensure HTTP protocol support.

Kuma provides 3 HTTP faults that can be imitated:

Abort: When abort is injected, it returns a percentage of requests with only a pre-defined status code. So those are the two things you need to set.
- status code: HTTP status code that will be sent as a response to any request to the destination.
- percentage: of requests on which abort will be injected. Values are allowed from 0-100.
ResponseBandwidth limit: With this fault injected, a limit is set to the speed of responding to the request for a percentage of requests. Here are the two things you need to set:
- limit: speed of responding in gbps, mbps, kbps, or bps.
- percentage: of requests on which responseBandwidth fault will be injected. Values are allowed from 0-100.
Delay: With delay injected, the response will be delayed for the defined duration for a percentage of requests. Here are the two things you need to set.
- value: duration for which response will be delayed
- percentage: of requests on which delay will be injected. Values are allowed from 0-100.

Use the following yaml to enable fault injection policy with abort fault:

apiVersion: kuma.io/v1alpha1
kind: FaultInjection
mesh: default
metadata:
  name: fi1
spec:
  sources:
    - match:
        kuma.io/service: '*'
        kuma.io/protocol: http
  destinations:
    - match:
        kuma.io/service: front-end_sock-shop_svc_80
        kuma.io/protocol: http
  conf:        
    abort:
      httpStatus: 500
      percentage: 100

Here we have injected abort fault for all the requests by setting the percentage to 100 in case the request is sent to the front-end service from any source over HTTP protocol. The response status code will be 500.

Let us go and make a request to front-end service before enabling the fault injection policy to see what response are we getting.

$ kubectl -n sock-shop exec front-end-7d89d49d6b-8kpsp  -it  -- sh
/usr/src/app $ wget http://front-end
Connecting to front-end (10.102.50.169:80)
wget: can't open 'index.html': Read-only file system

So it is able to connect successfully.

Let us now enable the policy.

$ kubectl apply -f fi.yaml 
faultinjection.kuma.io/fi1 created

Verify with kumactl:

$ kumactl get fault-injections
MESH      NAME   AGE
default   fi1    3m

Abort fault injection in action:

$ kubectl -n sock-shop exec front-end-7d89d49d6b-8kpsp  -it  -- sh
/usr/src/app $ wget http://front-end
Connecting to front-end (10.102.50.169:80)
wget: server returned error: HTTP/1.1 500 Internal Server Error

We can see in the above output that the response is 500. It is the same as what we have set in the configuration.

This is how you can inject more errors in your microservice application and test it against resiliency.

These are just a few policies that we have enabled and shown from a beginner’s perspective. For the production-grade setup, you can configure more policies like fault injection, retry, timeout, etc. using Kuma. From version 2.0, Kuma has started beta support for targetref policies. For each source-destination policy, there is an equivalent targetref policy. The major reason to initiate the transition is the challenges with existing S&D, which makes it complex to implement.

With policies enabled, the application can be more robust without actually making any changes to the application codebase. Let us now understand how we can have monitored over a Kuma service mesh.

Observability features of Kuma: Monitoring Service Mesh

Kuma provides a built-in observability stack that helps with monitoring the service mesh. It comes with a package that can be used to enable components like metrics, logs, and traces using kumactl. Use the following command.

$ kumactl install observability | kubectl apply -f -

On successful execution, Kuma creates a namespace mesh-observability

$ kubectl -n sock-shop exec front-end-7d89d49d6b-8kpsp  -it  -- sh
/usr/src/app $ wget http://front-end
Connecting to front-end (10.102.50.169:80)
wget: server returned error: HTTP/1.1 500 Internal Server Error

The namespace spins up the following pods which can be used for monitoring traffic.

$ kubectl get svc -n mesh-observability
NAME                            TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)                                          AGE
grafana                         ClusterIP   10.100.220.39    <none>        80/TCP                                           9m
jaeger-agent                    ClusterIP   None             <none>        5775/UDP,6831/UDP,6832/UDP,5778/TCP              9m
jaeger-collector                ClusterIP   10.104.6.217     <none>        14267/TCP,14268/TCP,9411/TCP,4317/TCP,4318/TCP   9m
jaeger-query                    ClusterIP   10.111.72.152    <none>        80/TCP                                           9m
loki                            ClusterIP   10.96.109.104    <none>        3100/TCP                                         9m
loki-headless                   ClusterIP   None             <none>        3100/TCP                                         9m
prometheus-kube-state-metrics   ClusterIP   None             <none>        80/TCP,81/TCP                                    9m
prometheus-server               ClusterIP   10.103.244.232   <none>        80/TCP                                           9m
zipkin                          ClusterIP   None             <none>        9411/TCP                                         9m

Kuma provides integration with Prometheus, Grafana, Loki, and Jaeger. In this blog, we will be talking about traffic metrics and traffic trace. You can read more about the traffic logs policy supported by Kuma.

Integration with Prometheus and Grafana

Kuma provides full integration with Prometheus by exposing metrics in Prometheus format and an API to monitor every proxy in the mesh and the application. The collected metrics can be queried and displayed using Grafana. Kuma does a basic installation and setup of Grafana with the observability package that we installed. We need to expose the metrics to enable telemetry.

Use the following yaml to expose metrics from DPP to enable the collection of metrics for all our service mesh traffic:

apiVersion: kuma.io/v1alpha1
kind: Mesh
metadata:
  name: default
spec:
  metrics:
    enabledBackend: prometheus-1
    backends:
    - name: prometheus-1
      type: prometheus

Using this configuration, we are enabling the prometheus metrics backend on default Mesh Apply this configuration:

$ kubectl apply -f prom.yaml
mesh.kuma.io/default configured

Verify Prometheus dashboard using prometheus-server ClusterIP. Click on ‘Status’ and select ‘targets’ to view the endpoints. In the kuma-dataplanes, you can view the status of each data plane.

Screenshot showing verify Prometheus dashboard

Verify the Grafana dashboard using grafana ClusterIP. Enter the username as admin and password as admin. Configure the Data Sources with Prometheus type. Click on Save and test. Grafana identifies the data source and fetches the metrics. Kuma provides data sources and dashboards as Grafana extensions. We have talked about some of the extensions below:

Kuma Dataplane: helps in investigating a single data plane in the mesh. Select Kuma Dataplane in the Dashboards and in the dropdown, select the dataplane you want to view. We have selected the frontend data plane.

Screenshot showing Kuma Dataplane performance on Grafana

Kuma CP: This helps to view control plane statistics like the actual status of the service mesh, performance of the service mesh, etc. Select Kuma CP in the Dashboards to view.

Screenshot showing Kuma CP performance on Grafana

In the above graph, we can see the status of the control plane and data plane communication in terms of aggregated discovery services, health discovery services, etc.Similarly, you can view the aggregated statistics like the number of requests and error rates using Kuma Mesh and aggregated statistics of Kuma Service to Service.

Integration with Jaeger

Kuma also installs and does Jaeger’s basic setup with the observability package. Using Grafana we can visualize the traces from Jaeger. In Grafana, add Data Sources with Jaeger type and get started with creating dashboards.

Screenshot showing Response time on Grafana

Here we have selected the Jaeger data source and plotted a graph to visualize traces of the response time of each /api/service.

Similarly, we can also enable the metrics collection for the application and monitor the application traffic. With this Kuma makes it easier to enable observability and monitor the service mesh. We agree that this deployment and setup are not production ready but it helps in getting started. We can build on top easily once the basic setup is ready to use. Let us understand some more such benefits of Kuma over other service mesh.

Benefits of Kuma

In this blog, we have seen that with the help of easy-to-understand Kuma documentation, we have been able to set up Kuma, deploy an application, enforce policies that strengthen the network, and enable observability. Service mesh does not have to be complex. Since scalability is directly proportional to operational cost, too many moving parts mean a lack of clarity on how high the operational cost could be. Kuma ships most of the features via policies, provides builtin stack for observability and much more, it becomes easy to implement these features since it does not have too many moving parts.

For organizations that are still transitioning from bare metals to Kubernetes, the infrastructure could be a hybrid of both. Now with Kuma, the service mesh could still be implemented for the complete infrastructure. Also, for the different teams, multiple individual meshes can be configured with a single control plane. Not all these infrastructure components need to be geographically close. With Kuma’s multi-zone support, you can configure service mesh on multiple clouds and regions. The network can be improved by implementing easy-to-setup policies. With other service mesh, it is not so easy to achieve all this using a single solution.

Conclusion

Service mesh helps in strengthening the infrastructure without making any changes to the application. This also helps an organization shape better in terms of developer and platform engineer responsibilities. With the service mesh, there is no denying that the resource utilization increased but there are such service mesh that is easy to deploy, optimize and manage. Our service mesh experts can help you understand and choose the best solution for your production deployments.

Kuma’s easy to setup installer and kumactl CLI tool help to deploy service mesh with policies like a circuit breaker, rate limit, etc that help avoid issues in the runtime and prevent downtimes. Kuma provides observability over service mesh traffic and the application traffic that helps in reduced time to resolve production issues.

We have seen in the blog post how easy it is to get started with Kuma using an e-commerce application. In the case of production-grade deployment, it could be challenging to monitor traffic and gain fine control of the application which can be made easy with the right implementation of policies in Kuma.

I hope you found this post informative and engaging. For more posts like this one, do subscribe to our newsletter for a weekly dose of cloud native. I’d love to hear your thoughts on this post, let’s connect and start a conversation on Twitter or LinkedIn.

Hyderabad, India