Guest post originally published on Chronosphere’s blog by Rob Skillington 

With the transition from monolith to cloud-native environments, we are seeing an ongoing explosion of metrics data in terms of both volume and cardinality. This is because microservices and containerized applications generate metrics data an order of magnitude more than legacy environments. To achieve good observability in a cloud-native system, you will need to deal with large-scale data and take steps to understand and control cardinality.  

This blog explains what high cardinality in observability is, why high cardinality is a problem, and explains three ways to tame data growth and cardinality.

What is high cardinality in observability? 

Cardinality is the number of possible groupings depending on the dimensions the metrics have. Dimensions are the different properties of your data. 

When we talk about metric cardinality, we mean the number of unique time series that are produced by a combination of metric names and their associated labels [dimensions]. The total number of combinations with data that exists are cardinalities. The more combinations there are, the greater a metric’s cardinality is.

Why is high cardinality a problem? 

To get a sense of how quickly high cardinality explodes the scale of telemetry data, compare a legacy environment to a cloud-native environment and watch how quickly you end up going from 150,000 possible unique time series to 150 million!

How quickly does cardinality grow?

What is Low vs High Cardinality in metrics

You might be naturally inclined to ask at this point – what constitutes high cardinality for metrics? It turns out the answer here is somewhat relative. In the legacy environment we mentioned above, we saw that we could easily create a metric with 150,000 unique values, with individual dimensions having no more than a few hundred values. In comparison when looking at a cloud-native environment, it’s not unreasonable for us to see individual dimensions that have thousands of unique values (or more). As we saw, we were able to easily generate over 100 million unique series for a single metric! 

There is a tradeoff that takes place when more dimensions are added to metrics and cardinality goes up. The more important question to ask ourselves becomes: Is there an acceptable ROI for the dimensions we add to our metrics and the value provided by the additional cardinality?

To strike a balance between value and cardinality, we can classify our metrics and the dimensions they have into categories to help us think about the inherent tradeoffs.

Solving the high cardinality problem in three steps

It’s understandable how metrics data growth gets out of hand. There’s a lot of power in the extra level of granularity you get with the expansion of unique time series you’re storing. 

The key is understanding how high cardinality impacts the scale of the telemetry data you need to collect and finding ways to control metrics data. Here are three ways to solve the high cardinality problem:

1) Reduce tensions between too much and not enough information:

2) Don’t manage high cardinality metrics at a micro level: 

3)  Make observability a team effort: 

Monitoring and observability enable your company to operate without massive outages or huge levels of impact when you do have an outage. Your high-level of reliability is the thing that keeps your system dependable, and it’s the reason why people keep using your service. To achieve great observability, keep these three approaches to controlling high cardinality data in mind. 

Chronosphere controls high cardinality & metrics growth 

Chronosphere’s observability platform is the only purpose-built SaaS solution for scaling cloud-native environments. Chronosphere puts you back in control by taming rampant metric data growth and the high cardinality problem. Chronosphere allows customers to keep pace with the massive amounts of monitoring data generated by microservices, and it does so with more cost efficiency than legacy solutions. 

John Potocny contributed to this article.