Member blog post by Abhishek Singh, Christos Kalkanis, Alexander Wert, and Bahubali Shetti of Elastic

In March 2023, OpenTelemetry took a big step toward supporting profiling by merging a profiling data model OTEP and beginning work on a stable spec and implementation. Today, OpenTelemetry takes another big step in establishing profiling as its fourth telemetry signal with the donation of Elastic’s continuous profiling agent. SREs can now benefit from these capabilities: quickly identifying performance bottlenecks, maximizing resource utilization, reducing carbon footprint, and optimizing cloud spend.

What is continuous profiling?

Profiling is a technique used to understand the behavior of a software application by collecting information about its execution. This includes tracking the duration of function calls as well as the usage of memory, CPU, and other system resources.
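To make "tracking the duration of function calls" concrete, here is a minimal sketch of traditional, in-process profiling using Python's standard-library `cProfile` and `pstats` modules. The functions `busy_work`, `handle_request`, and `profile_call` are hypothetical names made up for this illustration. This is not how the eBPF-based agent described in this post works; it only shows the kind of data a profiler collects.

```python
import cProfile
import io
import pstats


def busy_work(n: int) -> int:
    """Hypothetical CPU-bound function, used only for illustration."""
    return sum(i * i for i in range(n))


def handle_request() -> int:
    """Hypothetical entry point that calls busy_work several times."""
    return sum(busy_work(50_000) for _ in range(5))


def profile_call(func):
    """Run func under cProfile and return (result, text report)."""
    profiler = cProfile.Profile()
    profiler.enable()
    try:
        result = func()
    finally:
        # Always stop collecting, even if func raises.
        profiler.disable()

    # Render the collected stats, sorted by cumulative time per function.
    buffer = io.StringIO()
    stats = pstats.Stats(profiler, stream=buffer)
    stats.sort_stats("cumulative").print_stats(10)
    return result, buffer.getvalue()


if __name__ == "__main__":
    _, report = profile_call(handle_request)
    print(report)
```

The report lists each function with its call count and the time spent in it. Note that this approach requires wrapping the code you want to measure and adds overhead to every function call while the profiler is enabled.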

However, traditional profiling solutions have significant drawbacks that limit their adoption in production environments.

Unlike traditional profiling, which is often done only in a specific development phase or under controlled test conditions, continuous profiling runs in the background with minimal overhead. This provides real-time, actionable insights without replicating issues in separate environments. SREs, DevOps, and developers can see how code affects performance and cost, making code and infrastructure improvements easier.

Contribution of production-grade features

Elastic’s continuous profiling agent, based on eBPF, is a whole-system, always-on solution that observes your code as well as third-party libraries, kernel operations, and other code you don’t own. It eliminates the need for code instrumentation (run-time/bytecode), recompilation, or service restarts, and it runs with low overhead in production environments: low CPU usage (~1%) and low memory usage.

The profiling agent helps identify non-optimal code paths, uncovers “unknown unknowns”, and provides comprehensive visibility into the runtime behavior of all applications. It supports a wide range of runtimes and languages, such as C/C++, Rust, Zig, Go, Java, Python, Ruby, PHP, Node.js, V8, Perl, and .NET.

Additionally, organizations can meet sustainability objectives by minimizing computational waste, in line with their strategic ESG goals.

Benefits to the OpenTelemetry community

The contribution boosts the standardization of continuous profiling for observability and accelerates the practical adoption of profiling as the fourth key signal in OTel. Customers now have a vendor-agnostic method for collecting profiling data and correlating it with existing signals, such as traces, metrics, and logs.

OTel-based continuous profiling unlocks the following possibilities for users:

With these benefits, customers can now manage their applications’ overall efficiency in the cloud while ensuring their engineering teams optimize it.

Moving forward, Elastic will continue collaborating closely with the OTel Profiling and Collector SIGs to ensure seamless integration of the profiling agent within the broader OTel ecosystem.