Guest post originally published on the Grafana Labs blog by Adam Quan

Spring Boot is a very popular microservice framework that significantly simplifies web application development by providing Java developers with a platform to get started with an auto-configurable, production-grade Spring application.

In this blog, we will walk through detailed steps on how you can observe a Spring Boot application, by instrumenting it with Prometheus and OpenTelementry and by collecting and correlating logs, metrics, and traces from the application in Grafana Cloud.

More specifically, we will:

  1. Use OpenTelemetry to instrument a simple Spring Boot application, Hello Observability, and ship the traces to Grafana Cloud using Grafana Agent.
  2. Automatically log every request to the application, and ship these logs to Grafana Cloud using Grafana Agent. Then correlate logs with traces and metrics from the application.
  3. Use Prometheus to instrument the application so that we can collect metrics, and correlate the metrics with traces leveraging Grafana’s exemplars.

Very exciting! Let me first introduce the wonderfully simple Spring Boot application, Hello Observability.

Intro to the Spring Boot application

The Hello Observability application is extremely simple. It mainly contains one Java class HelloObservabilityBootApp, which contains two methods that can serve HTTP requests.

Diagram of HTTP requests

Clone the repository to look at the code, then build the application and the application container:

git clone https://github.com/adamquan/hello-observability.git
cd hello-observability/hello-observability
./mvnw package
docker build -t hello-observability .

You can then run the application directly and access it here: http://localhost:8080/hello. Very simple application, indeed!

java -jar target/*.jar

Hello Observability UI

You can also run it inside a Docker container with the following command, and access the application here: http://localhost:8080/hello

docker run -d -p 8080:8080 --name hello-observability hello-observability

Stop the docker container using the following command:

docker stop hello-observability

We have not done any instrumentation to collect logs, metrics, and traces so far. To do so, you will have to start all services using docker-compose. All configurations needed to generate and collect logs, metrics, and traces are already configured.

Let’s now run the whole stack locally inside Docker to see it in action. The whole stack contains:

cd hello-observability/local
docker-compose up

After all the containers are up, you can access the application here: http://localhost:8080/hello,  and Grafana here: http://localhost:3000. A dashboard called Hello Observability is also pre-loaded.

Screenshot shows Grafana dashboard of Spring Boot application

Hope this gets you excited! We will do a quick introduction of Grafana Cloud and Grafana Agent next, before looking at the instrument details.

How to configure Grafana Cloud

Grafana Cloud is a fully-managed composable observability platform that integrates metrics, traces, and logs with Grafana. With Grafana Cloud, you can leverage the best open source observability software, including Prometheus, Grafana Loki, and Grafana Tempo, without the overhead of installing, maintaining, and scaling your observability stack. This allows you to focus on your data and the business insights they surface, not the infrastructure stack storing and serving the data.

Diagram of the Grafana Cloud stack

Grafana Cloud’s free plan makes it accessible for everybody. On top of being free, the plan also gives you access to advanced capabilities like Grafana OnCallSynthetic monitoring, and AlertingCreate your free account to follow along!

To send logs, metrics, and traces to Grafana Cloud, we need to gather our Grafana Cloud connection information for Loki, Prometheus, and Tempo, so that we can configure our Agent accordingly. That’s what we will do next.

Connect Grafana Cloud Traces 

Grafana Cloud Traces is the fully-managed, highly-scalable, and cost-effective distributed tracing system based on Grafana Tempo. That’s where our traces will go.

Tempo connection information includes endpoint URL and user credentials. Navigate to grafana.com and login. Make sure you select the right account from the account dropdown. My account name is aquan. Click on My Account, then Send Traces in the Tempo section. 

Screenshot shows Grafana Cloud UI to configure traces with Grafana Tempo

You should see something like the screenshot below, with information about the endpoint and user credentials. You can generate a new API Key by clicking on the Generate now link or navigate to the Security → API Keys section.

Spring Boot application in Grafana Cloud: Configure Grafana Tempo

Either way, the API Key creation screens look like below. (No worry about me exposing my API Key. It’s already deleted.)

Spring Boot application in Grafana Cloud: create API key
Spring Boot application in Grafana Cloud: API token created UI

In this case, here is the Tempo information for my Grafana Cloud account:

Connect Grafana Cloud Logs 

Grafana Cloud Logs is Grafana’s fully-managed logging solution based on Grafana Loki. Loki is a horizontally-scalable, highly-available, multi-tenant log aggregation solution inspired by Prometheus. It is designed to be cost-effective and easy to operate. It does not index the contents of the logs, but rather a set of labels for each log stream.

Similar as you have done for Tempo connection information, we can collect Loki connection information from the Loki section in my account page:

Connect Grafana Cloud Metrics

Finally, for Grafana Cloud Metrics, we can get all the connection information from the Prometheus section in my account page:

With all the connection information collected, we are ready to configure our Grafana Agent so that the agent can send our logs, metrics, and traces to Grafana Cloud.

How to configure the Grafana Agent

Grafana Agent is an all-in-one agent for collecting metrics, logs, and traces. It removes the need to install more than one piece of software by including common integrations for monitoring out of the box. Grafana Agent makes it easy to send telemetry data and is the preferred telemetry collector for sending metrics, logs, and trace data to the opinionated Grafana observability stack, both on-prem and in Grafana Cloud.

Spring Boot application in Grafana Cloud: diagram of how Grafana Agent works in the stack

All the agent configuration information is inside the agent.yaml file. Let’s see exactly how the agent is configured.

Configure traces

In the context of tracing, Grafana Agent is commonly used as a tracing pipeline, offloading traces from the application and forwarding them to a storage backend. The Grafana Agent supports receiving traces in multiple formats: OpenTelemetry (OTLP), Jaeger, Zipkin, and OpenCensus. 

Spring Boot application in Grafana Cloud: diagram of how Grafana Agent ingests traces

Here is the traces section from the Grafana Agent configuration file. Other than the Grafana Cloud Tempo URL and authentication information, we are basically saying the agent is expecting traces in OTLP format over the HTTP protocol. You can also configure it to use Jaeger format if you like, but we recommend the OTLP format.

traces:
  configs:
  - name: default
    remote_write:
      - endpoint: tempo-us-central1.grafana.net:443
        basic_auth:
          username: 160639
          password: <API_KEY>
    receivers:
      otlp:
        protocols:
          http:

Configure logs

The following is the clients section of the agent.yaml file, which contains all the connection information we collected before.

clients:
  - url: https://logs-prod3.grafana.net/loki/api/v1/push
    basic_auth:
      username: 164126
      password: <API_KEY>

Configure metrics

In the metrics section, remote_write contains all the connection information. The scrape_configs section defines the scrape jobs, which we will discuss later.

   remote_write:
    - basic_auth:
        username: 330312
        password: <API_KEY>
      url: https://prometheus-prod-10-prod-us-central-0.grafana.net/api/prom/push

Let’s move on and see how the instruments are done and how logs, metrics, and traces are collected. We will start with OpenTelemetry traces.

How to instrument Spring Boot with OpenTelemetry

The open source observability framework, OpenTelemetry, is gaining popularity very quickly. OpenTelemetry is the result of merging OpenTracing and OpenCensus. As of writing today, traces are the most mature component of OpenTelemetry, and logs are still experimental. We will focus on the tracing aspect of OpenTelemetry in this blog.

OpenTelemetry is a collection of APIs, SDKs, and tools that you can use to instrument, generate, collect, and export telemetry data that can in turn help you analyze your software’s performance and behavior. OpenTelemetry is both vendor and platform agnostic, which is one of the reasons it’s getting popular very quickly.

Spring Boot application in Grafana Cloud: diagram of OpenTelemetry architecture

In general, instrumentation can be done in three different places: your application code; libraries and frameworks your application depends on; and the underlying platform, such as Kubernetes, Envoy, or Istio. The nice thing about OpenTelemetry is that instrumentations are already done in many popular libraries and frameworks, such as Spring, Express, etc.

Something that is worth noting is the auto-instrumentation capability of OpenTelementry. Auto-instrumentation allows developers to collect telemetry data without making any code changes. We will be leveraging the auto-instrumentation capability of the Java Agent to instrument our Hello Observability application, with no code change necessary.

Auto-instrumentation with the Java Agent

The OpenTelemetry Java Agent can be attached to any Java 8+ application for auto-instrumentation. It dynamically injects bytecode to capture telemetry from many popular libraries and frameworks. Once attached, it automatically captures telemetry data at the edges of applications and services, such as inbound requests, outbound HTTP calls, database calls, and others. If you want to instrument application code in your applications or services in other custom ways, you will have to use manual instrumentation.

With the wonderful auto-instrumentation capability of the Java Agent, instrumenting our Spring Boot application is very straightforward. We simply need to include the -javagent:./opentelemetry-javaagent.jar option when running the application. The hello-observability application service inside cloud/docker-compose.yaml file looks like:

  hello-observability:
    image: hello-observability
    volumes:
      - ./logs/hello-observability.log:/tmp/hello-observability.log
      - ./logs/access_log.log:/tmp/access_log.log
    environment:
	JAVA_TOOL_OPTIONS: -javaagent:./opentelemetry-javaagent.jar
      OTEL_EXPORTER_OTLP_TRACES_ENDPOINT: http://agent:4317
      OTEL_SERVICE_NAME: hello-observability
      OTEL_TRACES_EXPORTER: otlp
    ports:
      - "8080:8080"

Notice that we are passing in the Java Agent jar file with the -javaagent option, as an environment variable for the docker container. The repo already contains an up-to-date agent, as of this writing. If you want, you can also download the latest version of the agent, and copy it into the hello-observability directory.

The other three environment variables are used to tell where and how to send traces. See OpenTelemetry Java Agent Configuration for details about the system properties and environment variables that you can use to configure the agent.

While the application is running, log into your Grafana Cloud account. Select your Tempo data source from Explore. In my case, it’s called grafanacloud-aquan-traces. See your traces by clicking on the Search button and selecting hello-observability for Service Name. Remember we set OTEL_SERVICE_NAME to hello-observability?

Spring Boot application in Grafana Cloud: UI for instrumenting Java Agent with OpenTelemetry

Click on one of them to see the waterfall visualization of the trace. Later, once we have logs and metrics shipped to Grafana Cloud too, you will be able to correlate logs, metrics, and traces together! The power of a unified observability platform!

Spring Boot application in Grafana Cloud: screenshot of trace visualization process

If you are running everything locally, you will also be able to see the bonus Node Graph, which is still in beta, as of writing.

Spring Boot application in Grafana Cloud: UI for Node Graph in Grafana Cloud

This is what the node graph for the trace above looks like:

Spring Boot application for Grafana Cloud: node graph of a trace in Grafana Cloud

Collecting logs and correlating with traces

With traces coming in, let’s also collect logs from the application and see how we can correlate logs with traces. 

Spring Boot has built-in support for automatic request logging. The CommonsRequestLoggingFilter class can be used to log incoming requests. You just need to configure it by adding a bean definition. We have the following Java class definition in our source code directory, in a file called RequestLoggingFilterConfig.java.

@Configuration
public class RequestLoggingFilterConfig {
	@Bean
	public CommonsRequestLoggingFilter logFilter() {
		CommonsRequestLoggingFilter filter = 
new CommonsRequestLoggingFilter();
		filter.setIncludeQueryString(true);
		filter.setIncludeHeaders(true);
		filter.setIncludeClientInfo(true);
		return filter;
	}
}

This logging filter also requires the log level to be set to DEBUG. Here is what the application.properties file looks like:

logging.file.name=/tmp/hello-observability.log
logging.level.org.springframework=INFO
logging.level.org.springframework.web.filter.CommonsRequestLoggingFilter=DEBUG
logging.pattern.file=%d{yyyy-MM-dd HH:mm:ss} - %msg traceID=%X{trace_id} %n

We are telling Spring Boot to write logs into the /tmp/hello-observability.log log file, with information about the date/time, the log message, and the trace ID. Of particular interest is the trace_id. It allows us to correlate the logs with traces we collected before automatically, as you will see later.

If you look at the hello-observability.log log file, you should see log lines like these. Notice the trace ID?

2022-03-11 18:15:56 - After request [GET /hello, client=192.168.80.2, headers=[host:"hello-observability:8080", user-agent:"curl/7.81.0-DEV", accept:"*/*"]] traceID=f179e69bd5f342233b9bc66728901f7f

Sending the logs to Grafana Cloud using the Grafana Agent is super simple. Please refer to the documentation Collect logs with Grafana Agent for details. Here is the scrape_configs section inside the agent.yaml file for logs:

    scrape_configs:
      - job_name: hello-observability
        static_configs:
          - targets: [localhost]
            labels:
              job: hello-observability
              __path__: /tmp/hello-observability.log
      - job_name: tomcat-access
        static_configs:
          - targets: [localhost]
            labels:
              job: tomcat-access
              __path__: /tmp/access_log.log

As you can see, the Agent is reading logs from the hello-observability.log file. We are also enabling and collecting Tomcat access logs from the application, so that we can create some interesting visualizations off it. Here is the section inside application.properties that configures Tomcat access logs. We are basically writing the access logs to /tmp/access_log.log file.  That’s also where the Grafana Agent picks it up. Both the application container and the agent container can see that same file through docker volume mounts.

server.tomcat.accesslog.enabled=true
server.tomcat.accesslog.rotate=false
server.tomcat.accesslog.suffix=.log
server.tomcat.accesslog.prefix=access_log
server.tomcat.accesslog.directory=/tmp
server.tomcat.accesslog.pattern=common

Go to Grafana Cloud, then Explore, and select the Loki data source. In my case, it’s called grafanacloud-aquan-logs. Use the Log browser to select the hello-observability job. Then click on Show logs.

Spring Boot application in Grafana Cloud: UI for instrumenting logs in Spring Boot app

You should see some log lines. Expand one of the log lines and click on grafanacloud-quan-traces, you should see something like this below. The magical correlation between the logs and traces all done automatically because of the trace ID!

Spring Boot application in Grafana Cloud: UI for correlating logs and traces

Let’s now complete our observability story by collecting metrics.

Instrumenting Spring Boot with Prometheus

Grafana Cloud has a Spring Boot integration that can send Spring Boot application metrics to Grafana Cloud along with an out-of-the-box dashboard for visualization.

Spring Boot also has built-in support for metrics collection through Micrometer. However, as of today, Micrometer does not support exemplars yet. Since we want to showcase the exemplar capability of Grafana, we will use Prometheus to instrument our application directly to collect performance metrics, instead of using the Spring Boot integration. (Grafana Labs product manager Jen Villa has a nice blog post about exemplars.)

Take a look at the HelloObservabilityBootApp Java class. We are basically instrumenting our simple application with a Counter, a Gauge, a Histogram, and a Summary, solely for demonstration purposes.

Make sure the application is running. Go to Grafana Cloud, and the Hello Observability dashboard. The Requests per Minute visualization shows Prometheus metrics with exemplars, which are the green dots.

Spring Boot application in Grafana Cloud: Grafana dashboard for Prometheus metrics

Mouse over one of them, and you will see a pop-up panel with a trace_id label that has a link to take you to the trace directly. The power of exemplars!

Spring Boot application in Grafana Cloud: exemplars in Grafana Cloud

If you are curious about the metrics with exemplars collected by Prometheus, send a request to http://localhost:9090/api/v1/query_exemplars?query={job=”hello-observability”} and check it out. It will look something like the following. The exemplars are set as metadata labels, as span_id and trace_id.

{"labels":{"span_id":"405e4b409d62cd38","trace_id":"7893cdf3ce9d2401ffcee7d7f0db569d"},"value":"1.090101375","timestamp":1647142355.737}

For the exemplar pop-up dialog to show a link to Tempo, you do have to configure the Label name to be trace_id in the Prometheus data source configuration. The default is traceID. Otherwise, you will not see the link. You cannot edit the pre-configured Prometheus data source in your cloud account. You will have to create another Prometheus data source with the same connection information, then add the exemplar configuration.

Spring Boot application in Grafana Cloud: configuring exemplars

Also, since exemplars are still new, you do have to request it to be enabled for your Grafana Cloud account. If you run Grafana locally, the latest release of Grafana already supports exemplars.

Correlating logs, metrics, and traces in Grafana

Observability is not only just about collecting all the telemetry data in the form of logs, metrics, and traces. More importantly, it’s about the ability to correlate this telemetry data and derive actionable insights from it.

Spring Boot application in Grafana Cloud: diagram of observability pillars

You can obviously connect and correlate any telemetry data visually by graphing them onto the same dashboard. Import the Hello Observability dashboard from the exported cloud/dashboards/hello-observability.json file and select the corresponding Loki and Prometheus data sources. You will see the same Hello Observability dashboard showing and connecting logs, metrics, and traces together.

On top of dashboards, Grafana enables you to connect and correlate logs, metrics, and traces in many different ways:

  1. Find relevant logs from your metrics using split screen. Loki’s logs are systematically labeled in the same way as your Prometheus metrics, using the same Service Discover mechanism. This enables you to find the logs for any given metrics visualization for faster troubleshooting from one UI with just a few clicks. For example, from the Requests per Minute panel, click on Explore then Split, and select the Loki data source. You will see relevant logs for the metrics in the same time range. The logs provide the exact context for the metrics we are visualizing and investigating.
Spring Boot application in Grafana Cloud: Grafana dashboard to correlate metrics and logs
  1. Extract metrics from logs using Loki. Loki uses LogQL, a Prometheus-inspired query language to extract metrics from logs. Extracting metrics from logs and visualizing log messages in time series visualizations makes it easy and fast to get rich insight into the behavior of applications. As an example, the HTTP status codes over time panel is a metrics visualization created based on Tomcat access logs, using the following query:

sum by (status) (count_over_time( {job="tomcat-access"} | pattern<> – – <> “ <_>” != "/metrics" [1m]))

Spring Boot application in Grafana Cloud: http status codes
  1. Find traces using logs with trace ID. As you have seen, you can pivot from logs to traces easily following the trace ID link. Using logs, you can search by path, status code, latency, user, IP, or anything else you can add onto the same log line as a trace ID. For example, you can easily filter the logs down to see only the traces that failed, exceeded latency SLAs, etc. This makes it much easier to troubleshoot issues when they do occur.
  2. Find all the logs for a given span. Spans contain links to associated logs. You can explore relevant logs associated with spans with just one click.
  3. Find traces from metrics using exemplars. When traces, logs, and metrics are decorated with consistent metadata, you can create correlations that were not previously possible. As you have seen from the Latency panel, Prometheus metrics are decorated with trace ID when OpenTelemetry is also used for tracing. When you jump from an exemplar to a trace, you are now able to go directly to the logs of the struggling service!

Conclusion

Wow, that was a lot! 

But hopefully, you saw how easy it is to instrument a Spring Boot application for logs, metrics, and traces, and how Grafana’s opinionated observability stack based on Prometheus, Grafana Loki, and Grafana Tempo can help you connect and correlate your  telemetry data cohesively in Grafana Cloud. Instrument your applications and have fun dashboarding!

Grafana Cloud is the easiest way to get started with metrics, logs, traces, and dashboards. We have a generous free forever tier and plans for every use case. Sign up for free now!