Guest post by:

A one-stop-shop messaging bot for monitoring, notifying, and debugging anywhere, anytime.

Bots have been around for a while now and are used for a variety of purposes. The most common ones are notification receivers built on Incoming Webhooks, which are a legacy mechanism.

However, there is a need for a “Super Bot” that is designed exclusively for Kubernetes and serves as a one-stop shop for all of these requirements. It should be modern, based on open-source technologies (the Bot’s backend is), and not limited to notifications. Nor should it be a collection of separate Bots, each performing a single task in a different place (channel).

After considering a few options, we came across a Bot that is available as one of the apps on a well-known messaging platform. When we familiarized ourselves with its capabilities, it simply impressed us (we will see why), and hence we called it the “Super Bot”. The Bot can monitor events on Kubernetes cluster(s) and notify users in real time. It also allows debugging Kubernetes cluster(s) and running health checks (e.g., cluster health, connectivity) across multiple Kubernetes clusters, both public and private.

As SRE/DevOps personas, we examined what this Bot really means in practice.

Architecture 

Let us first have a quick look at its Architecture and what Kubernetes resources and commands it supports. 

Architecture diagram

As the high-level architecture diagram above shows, the Bot integrates technology for monitoring, notification, and execution, which is the very reason we call it the “Super Bot”. It can support multiple Kubernetes clusters, both public and private, at the same time.

The backend communicates with the Kubernetes API server to monitor Kubernetes events (note that it is events that are monitored) and forwards them to communication mediums such as Slack or Mattermost. It also reads messages (commands) from users and sends back the response (output) accordingly. The backend is installed on the Kubernetes cluster(s).
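
To make this flow concrete, here is a minimal sketch in Python of the two directions described above: Kubernetes events going out as chat messages, and user commands coming in and getting a response. This is not the Bot's actual code. The Notifier class, the "notifications stop/start" command names, and the read-only verb list are all assumptions made for illustration. The sketch only formats messages and collects them in a list; it does not contact a cluster or a chat platform.

```python
from dataclasses import dataclass, field

# Assumption: only read-only kubectl verbs are accepted from chat.
READ_ONLY_VERBS = {"get", "describe", "logs"}


@dataclass
class Notifier:
    """Hypothetical stand-in for the Bot's backend, for illustration only."""

    enabled: bool = True
    # Messages that would be posted to the chat channel.
    outbox: list = field(default_factory=list)

    def on_event(self, cluster: str, event: dict) -> None:
        """Turn one Kubernetes event (core/v1 Event fields) into a chat message."""
        if not self.enabled:
            return  # notifications are currently stopped
        obj = event.get("involvedObject", {})
        text = (
            f"[{cluster}] {event.get('type', 'Normal')} {event.get('reason', '')}: "
            f"{obj.get('kind', '?')}/{obj.get('name', '?')} "
            f"in {obj.get('namespace', 'default')} - {event.get('message', '')}"
        )
        self.outbox.append({"text": text})

    def on_command(self, command: str) -> str:
        """Handle a message typed by a user and return the reply text."""
        words = command.split()
        if words == ["notifications", "stop"]:
            self.enabled = False
            return "Notifications stopped."
        if words == ["notifications", "start"]:
            self.enabled = True
            return "Notifications started."
        if words and words[0] in READ_ONLY_VERBS:
            # A real backend would execute this against the cluster;
            # the sketch only reports what it would run.
            return "would run: kubectl " + " ".join(words)
        return "Command not supported."


if __name__ == "__main__":
    bot = Notifier()
    bot.on_event(
        "cluster-a",
        {
            "type": "Warning",
            "reason": "BackOff",
            "message": "Back-off restarting failed container",
            "involvedObject": {"kind": "Pod", "name": "web-1", "namespace": "shop"},
        },
    )
    print(bot.outbox[0]["text"])
    print(bot.on_command("get pods -n shop"))
```

Restricting chat commands to a small allowlist of read-only verbs is one plausible design choice for a bot that many users can reach from a channel; whether the Bot described here does exactly this is not confirmed by this post.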

Use Cases  

Let us now set up the Bot and examine some of the use cases to understand its capabilities.

Summary

We were delighted with our experience as DevOps and SRE personas. This Bot can run on multiple Kubernetes clusters (we tried it with two OpenShift clusters). The Bot’s backend is open source and community-driven, which is great.

It is true that we can debug anywhere, anytime with this Bot. It is indeed a one-stop shop for looking after Kubernetes clusters in several ways: event monitoring, notification, debugging, running health checks, and testing reliability, all in real time and in a fraction of the usual time. Its capabilities may also be extended to enable closed-loop automation for corrective actions. We could also stop and start notifications as we chose.

Most importantly, it enables new ways of working.

Disclaimer: The views expressed here are personal.