Cloud Native for your database as a service

Posted on July 21, 2025 by Ravish Rathod, Principal Technology Architect, Infosys & Ekambaram Pasham, Principal Technologist, Infosys


Relational databases have a long history in many organizations. They underpin existing applications that meet current business needs, they are supported by a rich set of tools, and there is a large pool of employees qualified to implement and maintain them.

But organizations are increasingly considering alternatives to legacy relational infrastructure. The motivations vary:

Technical: a need to handle new, multi-structured data types or to scale beyond the capacity constraints of existing systems.
Cost: a desire to identify viable alternatives to expensive proprietary database software and hardware.
Agility: speed of development, as companies look to adapt to the market more quickly and embrace agile development methodologies.

Many of the solutions to these new requirements originated at the well-known cloud service providers. A wide variety of relational and NoSQL database technologies were developed in response to the demands of building modern applications:

Developers are working with applications that create massive volumes of new, rapidly changing data types – structured, semi-structured, unstructured, and polymorphic data.
Small teams work in agile sprints, iterating quickly and pushing code every week or two, some even multiple times every day.
Applications that once served a finite audience are now delivered as services that must be always-on, accessible from many different devices, and scaled globally to millions of users.
Organizations are now turning to scale-out architectures using open-source software, commodity servers and cloud computing instead of large monolithic servers and storage infrastructure.

Interfaces and services offered

This database solution provides database engines and storage for applications that need a database. The interfaces provided are those offered by the corresponding database product. The sections below list the solution variants for each type of database.

DB Solution for PostgreSQL

The DB solution for PostgreSQL is defined in four variants:

PostgreSQL (Ephemeral)

This service provides a single PostgreSQL instance without persistent storage. It is ideal where:

there is no need for replication
there is no need for high availability
a database is needed only temporarily, e.g. during automated tests

PostgreSQL (persistent)

The only difference from the previous variant is that the database data is stored on persistent storage. This variant fits cases where:

there is no need for replication
there is no need for high availability
the database should be available for later reuse

PostgreSQLReplicated (Ephemeral)

This variant provides at least two instances of PostgreSQL. One is the master instance; the other runs in replication mode. The variant runs without persistent storage. It fits cases where:

replication is needed
high availability is needed
the database is not needed for later reuse

PostgreSQLReplicated (persistent)

This variant provides at least two instances of PostgreSQL. One is the master instance, the other runs in replication mode. Database data is stored on persistent storage.

This variant can be used to:

create and save a database for later reuse
reuse an existing database
increase the number of replicas when needed
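The four variants differ along two axes, replication and persistence. The following minimal sketch makes that mapping explicit; the function name and its two boolean parameters are hypothetical and are not part of the solution itself.

```python
def choose_postgres_variant(needs_replication: bool, needs_persistence: bool) -> str:
    """Map two requirements onto the four variant names used in this post."""
    base = "PostgreSQLReplicated" if needs_replication else "PostgreSQL"
    storage = "persistent" if needs_persistence else "Ephemeral"
    return f"{base} ({storage})"


# A throwaway database for automated tests needs neither replication nor persistence.
print(choose_postgres_variant(needs_replication=False, needs_persistence=False))
# A production database usually needs both.
print(choose_postgres_variant(needs_replication=True, needs_persistence=True))
```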

Handling NFR requirements for PostgreSQL variants

Access management for the PostgreSQL database is integrated with an LDAP directory.

All variants can be scaled up by adjusting the memory limit and CPU request; the persistent variants additionally allow the persistent storage to be resized. Replicated variants can be scaled out using the built-in functionality of Kubernetes.
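As an illustration of vertical scaling, the sketch below builds the body of a Kubernetes patch that changes a container's CPU request and memory limit. The container name and the values are assumptions made for this example; how the solution's templates actually expose these parameters is not described in this post.

```python
import json


def resource_patch(container: str, cpu_request: str, memory_limit: str) -> dict:
    """Build a patch body that sets the CPU request and memory limit of one container."""
    return {
        "spec": {
            "template": {
                "spec": {
                    "containers": [
                        {
                            "name": container,
                            "resources": {
                                "requests": {"cpu": cpu_request},
                                "limits": {"memory": memory_limit},
                            },
                        }
                    ]
                }
            }
        }
    }


# "postgresql" is a hypothetical container name.
patch = resource_patch("postgresql", cpu_request="500m", memory_limit="2Gi")
print(json.dumps(patch, indent=2))
```

A body like this could be applied to a Deployment or StatefulSet as a strategic merge patch, for example with kubectl patch.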

All connections to the database are configured with TLS encryption so that data is transferred securely between client and server. The certificates are created and managed with cert-manager, another CNCF project.
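To make the cert-manager step more concrete, here is a minimal sketch that builds a Certificate resource for a database service. The resource name, namespace, service name and issuer name are all hypothetical; the post does not say which issuer or naming scheme the solution uses.

```python
import json


def tls_certificate(name: str, namespace: str, service: str, issuer: str) -> dict:
    """Build a cert-manager Certificate manifest for an in-cluster service."""
    dns = f"{service}.{namespace}.svc"
    return {
        "apiVersion": "cert-manager.io/v1",
        "kind": "Certificate",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            # cert-manager writes the key pair into this Secret.
            "secretName": f"{name}-tls",
            "dnsNames": [dns, f"{dns}.cluster.local"],
            "issuerRef": {"name": issuer, "kind": "Issuer"},
        },
    }


# All four argument values are assumptions for illustration.
manifest = tls_certificate("postgresql", "databases", "postgresql", "internal-ca")
print(json.dumps(manifest, indent=2))
```

The database container would then mount the generated Secret and point its TLS settings at the certificate and key files.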

The PostgreSQL cluster is monitored using Prometheus and Grafana. A PostgreSQL exporter is deployed alongside each PostgreSQL instance as a sidecar container. The postgresql_exporter user connects to the PostgreSQL database container and collects metrics, which the exporter exposes on its /metrics endpoint. The Prometheus container then scrapes the /metrics endpoint of the postgres_exporter container.
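The /metrics endpoint returns plain text in the Prometheus exposition format. The sketch below parses a small sample of that format to show what Prometheus reads on each scrape. The sample lines are illustrative rather than captured from a real exporter, and the parser assumes samples carry no trailing timestamp.

```python
SAMPLE = """\
# HELP pg_up Whether the last scrape of PostgreSQL succeeded.
# TYPE pg_up gauge
pg_up 1
pg_stat_database_numbackends{datname="app"} 4
"""


def parse_metrics(text: str) -> dict[str, float]:
    """Parse exposition-format text into {series: value}, skipping comments."""
    samples = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # HELP/TYPE comments and blank lines carry no sample
        # The value is the last space-separated token; the rest is name plus labels.
        series, _, value = line.rpartition(" ")
        samples[series] = float(value)
    return samples


for series, value in parse_metrics(SAMPLE).items():
    print(series, value)
```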

PostgreSQL provides high availability through log-shipping standby servers, but this does not include automatic failover handling. Automatic failover can be achieved with the Patroni framework, which is used in the replicated variants provided by the database solution.
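The idea behind automatic failover can be sketched as follows: when the master is unhealthy, promote the healthy replica that is furthest along in replaying WAL. This is a deliberately simplified illustration and not Patroni's actual algorithm, which coordinates through a leader lock in a distributed configuration store. The member fields used here (name, role, healthy, lag_bytes) are hypothetical.

```python
from typing import Optional


def pick_failover_candidate(members: list[dict]) -> Optional[dict]:
    """Return the healthy replica with the smallest replication lag, or None."""
    candidates = [m for m in members if m["role"] == "replica" and m["healthy"]]
    return min(candidates, key=lambda m: m["lag_bytes"], default=None)


# Hypothetical cluster state after the master has failed.
cluster = [
    {"name": "pg-0", "role": "master", "healthy": False, "lag_bytes": 0},
    {"name": "pg-1", "role": "replica", "healthy": True, "lag_bytes": 2048},
    {"name": "pg-2", "role": "replica", "healthy": True, "lag_bytes": 512},
]

candidate = pick_failover_candidate(cluster)
print(candidate["name"] if candidate else "no candidate available")
```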

Backup and restore for the persistent variants are managed with the open-source WAL-G framework, which stores the online backups and the WAL file backups in Amazon S3.
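A typical WAL-G setup combines continuous WAL archiving with periodic base backups. The sketch below collects the settings and commands involved in one place. The bucket path is an assumption, and the solution's real configuration may differ.

```python
def walg_settings(s3_prefix: str) -> dict:
    """Collect the WAL-G related settings and commands for one PostgreSQL instance."""
    return {
        # WAL-G reads its storage location from this environment variable.
        "env": {"WALG_S3_PREFIX": s3_prefix},
        # PostgreSQL hands each finished WAL segment to WAL-G for archiving.
        "postgresql.conf": {
            "archive_mode": "on",
            "archive_command": "wal-g wal-push %p",
            "restore_command": "wal-g wal-fetch %f %p",
        },
        # Taken periodically, e.g. from a scheduled job.
        "base_backup": "wal-g backup-push $PGDATA",
        # Restores the most recent base backup into the data directory.
        "restore": "wal-g backup-fetch $PGDATA LATEST",
    }


# The bucket and path are hypothetical.
settings = walg_settings("s3://example-backups/postgresql")
for key, value in settings["postgresql.conf"].items():
    print(f"{key} = '{value}'")
```

During recovery, PostgreSQL uses restore_command to fetch archived WAL segments and replay them on top of the restored base backup.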

Architecture for PostgreSQL solution

Database solution for MongoDB

Solution for MongoDB is defined in four variants as below:

MongoDB (ephemeral)

This variant provides a single MongoDB instance with non-persistent storage and without replication. It uses the WiredTiger storage engine.

MongoDB (persistent)

This variant provides a single MongoDB instance with persistent storage and without replication.

MongoDBReplicated (persistent)

This variant provides several MongoDB instances with persistent storage, thus providing:

redundancy and data availability
replication
automatic failover
scaled read operations

The default parameters create three MongoDB nodes:

Diagram flow showing default parameters create three MongoDB nodes
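A three-node replica set is described by a configuration document that lists its members. The sketch below builds such a document in the shape accepted by MongoDB's replSetInitiate command. The replica set name and host names are assumptions and do not come from the solution's templates.

```python
import json


def replica_set_config(name: str, hosts: list[str]) -> dict:
    """Build a replica set configuration document with one member per host."""
    return {
        "_id": name,
        "members": [{"_id": i, "host": host} for i, host in enumerate(hosts)],
    }


# Hypothetical host names for three MongoDB pods.
hosts = [f"mongodb-{i}.mongodb:27017" for i in range(3)]
print(json.dumps(replica_set_config("rs0", hosts), indent=2))
```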

MongoDBReplicatedSharded (persistent)

The solution consists of three templates that work together to provide a replicated sharded MongoDB server.

MongoDBConfigServer (persistent)

This variant provides a replicated config server (see orange color below). This component is used internally by the other components.

ReplicaSet (persistent)

This variant provides MongoDB replicated servers with three replicas (see green color below).

ShardRouter

This variant provides a router for the shard servers with three replicas (see green color below).

Handling NFR requirements for MongoDB variants

MongoDB provides a role-based approach for authorizing users to access data or perform database functions. A role is a set of privileges granted to every user who holds that role. There is a set of built-in roles, and application-specific roles can also be defined.
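An application-specific role is defined by listing the actions it permits on a given resource. The sketch below builds the document for MongoDB's createRole command. The role, database and collection names are hypothetical.

```python
import json


def create_role_command(role: str, db: str, collection: str, actions: list[str]) -> dict:
    """Build a createRole command granting the given actions on one collection."""
    return {
        "createRole": role,
        "privileges": [
            {
                "resource": {"db": db, "collection": collection},
                "actions": list(actions),
            }
        ],
        # No inherited roles in this example.
        "roles": [],
    }


# A hypothetical role that may read and insert orders, but not update or delete them.
command = create_role_command("orderWriter", "shop", "orders", ["find", "insert"])
print(json.dumps(command, indent=2))
```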

Diagram flow showing connections depiction via cert-manager

All variants can be scaled by adjusting the memory limit and CPU request; the persistent variants additionally allow the persistent storage to be resized.

Replicated variants can be scaled using the built-in functionality of Kubernetes. Horizontal scaling is achieved by adding instances (replicas) to a component. A description of how to scale a component is provided in its usage description.

All data is stored in the MongoDB instances. The replicated templates add further instances so that the data remains available even if the primary instance is not. The number of replicas can be set in the component’s configuration, as described in the component’s usage description.
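The number of replicas determines how many failures a replica set can survive, because electing a new primary requires a majority of the voting members. The short calculation below is a worked example of that rule and assumes every member votes.

```python
def majority(voting_members: int) -> int:
    """Smallest number of members that forms a majority."""
    return voting_members // 2 + 1


def tolerated_failures(voting_members: int) -> int:
    """How many members can fail while a majority can still elect a primary."""
    return voting_members - majority(voting_members)


for n in (3, 4, 5):
    print(f"{n} members: majority {majority(n)}, tolerates {tolerated_failures(n)} failure(s)")
```

Going from three members to four does not increase fault tolerance, which is why odd member counts are the usual choice.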

Prometheus and Grafana are again used for monitoring. Prometheus is one of the most popular monitoring frameworks for Kubernetes environments. Each MongoDB pod in a replica set also runs a mongodb_exporter sidecar container, which acts as an interface between the MongoDB instance and the Prometheus metrics server. The metrics server runs in a separate pod that is reachable across the namespace. The exporter collects metrics from the MongoDB container at regular intervals and exposes them on its /metrics HTTP endpoint, which the Prometheus container scrapes. The following diagram shows the monitoring architecture at a high level.
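On the Prometheus side, scraping the exporter sidecars comes down to a scrape job that lists the exporter endpoints. The sketch below builds one such entry for the scrape_configs section of a Prometheus configuration. The job name, target host names, port and interval are assumptions for illustration.

```python
import json


def scrape_job(job_name: str, targets: list[str], interval: str = "30s") -> dict:
    """Build one scrape_configs entry that scrapes /metrics on each target."""
    return {
        "job_name": job_name,
        "scrape_interval": interval,
        "metrics_path": "/metrics",
        "static_configs": [{"targets": list(targets)}],
    }


# Hypothetical exporter endpoints, one per MongoDB pod; the port is assumed.
targets = [f"mongodb-{i}.mongodb:9216" for i in range(3)]
print(json.dumps(scrape_job("mongodb", targets), indent=2))
```

In a Kubernetes cluster, service discovery would normally replace the static target list, but the structure of the job stays the same.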

Backup and restore for the persistent variants are managed with Percona’s consistent backup solution, an open-source framework. It auto-discovers healthy members for backup by considering replication lag and replication ‘priority’ and by preferring ‘hidden’ members. It also creates cluster-consistent backups across many separate shards.
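The member selection described above can be pictured as a simple ranking. The sketch below is an illustration of the idea only and is not the tool's actual algorithm; the member fields and the lag threshold are hypothetical.

```python
from typing import Optional


def choose_backup_source(members: list[dict], max_lag_seconds: int = 10) -> Optional[dict]:
    """Pick a healthy, low-lag member, preferring hidden and low-priority ones."""
    eligible = [
        m for m in members
        if m["healthy"] and m["lag_seconds"] <= max_lag_seconds
    ]
    if not eligible:
        return None
    # Sort key: hidden members first, then lowest priority, then lowest lag.
    return min(eligible, key=lambda m: (not m["hidden"], m["priority"], m["lag_seconds"]))


# Hypothetical replica set state.
members = [
    {"name": "mongo-0", "healthy": True, "hidden": False, "priority": 1, "lag_seconds": 0},
    {"name": "mongo-1", "healthy": True, "hidden": True, "priority": 0, "lag_seconds": 2},
    {"name": "mongo-2", "healthy": True, "hidden": False, "priority": 1, "lag_seconds": 30},
]

source = choose_backup_source(members)
print(source["name"] if source else "no eligible member")
```

Preferring hidden, low-priority members keeps the backup load away from the nodes that serve application traffic.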

Database solution for Elastic

The solution for Elastic is defined in five variants:

Kibana (persistent)

This variant provides the GUI for Elasticsearch and functions as a window into the database.

Logstash

The Logstash image is designed to be customized with data and functionality from a repository on a Git server. To use it, you have to provide a source repository.

Elasticsearch (ephemeral)

This variant provides an Elasticsearch instance without persistent storage.

Elasticsearch (persistent)

This variant provides an Elasticsearch instance with persistent storage. Elasticsearch is an open source search and analytics engine based on the Apache Lucene search engine.

ElasticsearchReplicatedSharded (persistent)

This variant provides an Elasticsearch cluster with

Replication
Data + Admin + Client
Scaled Read Operations
Automatic Failover
Sharding
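In Elasticsearch, sharding and replication are set per index. The sketch below builds the settings body for creating an index and computes how many shard copies the cluster has to place. The shard and replica counts are example values, not the solution's defaults.

```python
import json


def index_settings(primary_shards: int, replicas_per_shard: int) -> dict:
    """Build the settings body used when creating an index."""
    return {
        "settings": {
            "index": {
                "number_of_shards": primary_shards,
                "number_of_replicas": replicas_per_shard,
            }
        }
    }


def total_shard_copies(primary_shards: int, replicas_per_shard: int) -> int:
    """Each primary shard exists once plus once per replica."""
    return primary_shards * (1 + replicas_per_shard)


print(json.dumps(index_settings(3, 1), indent=2))
print("shard copies to place:", total_shard_copies(3, 1))
```

Replica shards serve read requests and take over when the node holding a primary shard fails, which is how the cluster provides scaled reads and automatic failover.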

Handling NFR requirements for Elastic variants

All variants can be scaled by adjusting the memory limit and CPU request; the persistent variants additionally allow the persistent storage to be resized.

Replicated template versions can be scaled using the built-in functionality of Kubernetes. Horizontal scaling can be achieved by adding additional instances (replicas) for a component.

Authorization & Authentication is pluggable in Elastic and is configured using the config file.

Database solution for Cassandra

Solution for Cassandra is defined in four variants as below:

Cassandra (ephemeral)

This variant provides the following functionality:

ACID-Compliant
Stored Procedure
Complex Transaction

CassandraReplicatedSharded (Ephemeral)

This variant provides a native cluster with replication capabilities:

Replication
Sharding
No single points of failure
Constant uptime
Multiple Location

Cassandra (persistent)

This variant provides the same functionality as Cassandra (ephemeral); the only difference is that the database storage is persistent.

CassandraReplicatedSharded (persistent)

This variant provides a native cluster with

Replication
Sharding
No single points of failure
Constant uptime
Multiple Location
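Replication across multiple locations is configured per keyspace in Cassandra. The sketch below generates a CREATE KEYSPACE statement that uses NetworkTopologyStrategy with a replica count per datacenter. The keyspace and datacenter names are hypothetical.

```python
def create_keyspace_cql(keyspace: str, replicas_per_dc: dict[str, int]) -> str:
    """Build a CREATE KEYSPACE statement with one replica count per datacenter."""
    options = ", ".join(f"'{dc}': {count}" for dc, count in replicas_per_dc.items())
    return (
        f"CREATE KEYSPACE IF NOT EXISTS {keyspace} "
        f"WITH replication = {{'class': 'NetworkTopologyStrategy', {options}}};"
    )


# Three replicas in one hypothetical datacenter, two in another.
print(create_keyspace_cql("app", {"dc1": 3, "dc2": 2}))
```

Because every node can accept reads and writes for the data it owns, losing a single node or even a whole datacenter does not take the keyspace offline as long as enough replicas remain for the chosen consistency level.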

Handling NFR requirements for Cassandra variants

Authentication is pluggable in Cassandra and is configured using the authenticator setting in cassandra.yaml. Cassandra ships with two options included in the default distribution.

By default, Cassandra is configured with AllowAllAuthenticator, which performs no authentication checks and therefore requires no credentials. It is used to disable authentication completely. Note that authentication is a necessary condition of Cassandra’s permissions subsystem, so if authentication is disabled, effectively so are permissions.

The default distribution also includes PasswordAuthenticator, which stores encrypted credentials in a system table. This can be used to enable simple username/password authentication.

Authorization is pluggable in Cassandra and is configured using the authorizer setting in cassandra.yaml. Cassandra ships with two options included in the default distribution.
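The two settings work as a pair. The sketch below renders the relevant cassandra.yaml lines for a secured and an open configuration, using the authenticator and authorizer classes that ship with Cassandra. The function itself is a hypothetical helper; the post does not say which configuration the solution's templates use.

```python
def auth_settings(secured: bool) -> str:
    """Render the authenticator and authorizer lines for cassandra.yaml."""
    authenticator = "PasswordAuthenticator" if secured else "AllowAllAuthenticator"
    authorizer = "CassandraAuthorizer" if secured else "AllowAllAuthorizer"
    return f"authenticator: {authenticator}\nauthorizer: {authorizer}\n"


print(auth_settings(secured=True))
```

Enabling the authorizer without also enabling authentication has no practical effect, since permissions are checked against authenticated roles.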

All variants can be scaled by adjusting the memory limit and CPU request; the persistent variants additionally allow the persistent storage to be resized.

Replicated variants can be scaled using the built-in functionality of Kubernetes. Horizontal scaling is achieved by adding instances (replicas) to a component.

Conclusion

In conclusion, Database-as-a-Service (DBaaS) solutions, powered by CNCF technologies like Kubernetes and Prometheus, offer a scalable and efficient way to manage databases. By leveraging PostgreSQL’s reliability and MongoDB’s agility, organizations can simplify database administration and focus on application development. Infosys, as a global digital services leader, demonstrates how enterprises can harness DBaaS with CNCF to deliver high-availability, cost-effective, and cloud-native solutions. This approach ensures rapid scalability, operational efficiency, and flexibility in modern environments.
