Tuesday, 10 June 2025

How to expose a Summary metric using the Prometheus Java Client?

A Summary is a Prometheus metric type that is used to:

 

·      Track individual observations (like request duration, response size, etc.)

·      Automatically calculate quantiles (e.g., 50th percentile, 90th percentile) over time

·      Track total sum and count of observations

 

Summary metric gives you:

·      count: number of times the event occurred

·      sum: total time/size spent, which you can use to calculate average

·      Optional: quantiles (computed separately by each application instance, so they cannot be meaningfully combined across instances in a distributed environment)
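The count and sum bookkeeping described above can be sketched in plain Java. This is only an illustration of the idea, not how the Prometheus client is implemented; the class name SimpleSummary and its methods are hypothetical.

```java
// Hypothetical helper for illustration only; not part of the Prometheus client.
class SimpleSummary {
    private long count;  // number of observations recorded so far
    private double sum;  // total of all observed values

    // Record one observation, for example a query duration in seconds.
    synchronized void observe(double value) {
        count++;
        sum += value;
    }

    synchronized long getCount() {
        return count;
    }

    synchronized double getSum() {
        return sum;
    }

    // Average = sum / count; returns 0 when nothing has been observed yet.
    synchronized double average() {
        return count == 0 ? 0.0 : sum / count;
    }

    public static void main(String[] args) {
        SimpleSummary summary = new SimpleSummary();
        summary.observe(0.1);
        summary.observe(0.2);
        summary.observe(0.3);
        System.out.println("count=" + summary.getCount() + ", average=" + summary.average());
    }
}
```

Keeping only a running count and sum is what makes the average cheap to compute later: no individual observation has to be stored.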

 

When to use Summary metrics?

Use Summary when you care about:

 

·      Average response times

·      Latency of certain operations

·      Time spent in GC

·      How long a query or DB operation takes on average

 

How to Define a Summary Object?

Summary queryExecutionDuration = Summary.build()
.name("query_execution_duration_seconds")
.help("Duration of query execution in seconds")
.quantile(0.5, 0.05)
.quantile(0.9, 0.01)
.quantile(0.99, 0.001)
.register();

The statement above creates a new Summary metric named "query_execution_duration_seconds" and registers it. The name follows the convention of stating the purpose and including the unit (seconds).

.quantile(0.5, 0.05)
.quantile(0.9, 0.01)
.quantile(0.99, 0.001)

The lines above enable quantile calculation. In each call, the first argument is the target quantile and the second is the tolerated error.
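One way to read the error argument, assuming it applies to the quantile rank rather than to the measured value (as the client's documentation describes it): quantile(0.9, 0.01) reports a value that lies somewhere between the 0.89 and 0.91 quantiles. The small sketch below just prints those bands for the three configured quantiles; QuantileErrorBand and band are hypothetical names, not part of the library.

```java
// Hypothetical helper for illustration only; not part of the Prometheus client.
class QuantileErrorBand {

    // Returns {lower, upper} rank bounds for a target quantile and tolerated error,
    // clamped to the valid range [0, 1].
    static double[] band(double quantile, double error) {
        double lower = Math.max(0.0, quantile - error);
        double upper = Math.min(1.0, quantile + error);
        return new double[] { lower, upper };
    }

    public static void main(String[] args) {
        // The same (quantile, error) pairs used in the Summary definition.
        double[][] configured = { { 0.5, 0.05 }, { 0.9, 0.01 }, { 0.99, 0.001 } };
        for (double[] pair : configured) {
            double[] b = band(pair[0], pair[1]);
            System.out.printf("quantile %.2f reports a value between rank %.3f and %.3f%n",
                    pair[0], b[0], b[1]);
        }
    }
}
```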

A percentile is a measure used in statistics to indicate the value below which a given percentage of observations in a group of observations falls. For example, the 50th percentile (also known as the median) is the value below which 50% of the observations may be found.

 

Let’s say over time, the app records these query durations (in seconds):

0.1, 0.2, 0.15, 0.25, 0.5, 0.8, 1.2, 0.3, 0.05, 0.4

 

Quantile calculations:

Sort the values in ascending order

0.05, 0.1, 0.15, 0.2, 0.25, 0.3, 0.4, 0.5, 0.8, 1.2

 

There are 10 values here. N = 10

 

50th percentile:

Index = (P / 100) * N = (50 / 100) * 10 = 5

 

Since the index is a whole number, take the average of the 5th and 6th values:

 

·      5th = 0.25

·      6th = 0.3

 

Median = (0.25 + 0.3) / 2 = 0.275

 

90th Percentile

Index = (90 / 100) * 10 = 9

 

The index is again a whole number, so take the average of the 9th and 10th values:

 

·      9th = 0.8

·      10th = 1.2

 

90th Percentile = (0.8 + 1.2) / 2 = 1.0

 

99th Percentile

Index = (99 / 100) * 10 = 9.9

 

The index is not a whole number, so round it up to the 10th position. 10th = 1.2

 

99th Percentile = 1.2
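The manual calculation above can be sketched in Java as follows. It follows the same rule used in this post (average two neighbours when the index is a whole number, otherwise round up); it is an exact calculation for illustration and not the approximation algorithm the Prometheus client uses. PercentileSketch and percentile are hypothetical names.

```java
import java.util.Arrays;

// Hypothetical helper for illustration only; not part of the Prometheus client.
class PercentileSketch {

    // Computes the p-th percentile (1..100) using the rule from this post:
    // index = (p / 100) * N; if the index is a whole number, average the values
    // at positions index and index + 1, otherwise round the index up.
    static double percentile(double[] values, int p) {
        if (values.length == 0 || p <= 0 || p > 100) {
            throw new IllegalArgumentException("need at least one value and 0 < p <= 100");
        }
        double[] sorted = values.clone();
        Arrays.sort(sorted);
        int n = sorted.length;
        int scaled = p * n; // index multiplied by 100, kept as an integer to avoid rounding issues
        if (scaled % 100 == 0) {
            int k = scaled / 100; // 1-based position
            if (k == n) {
                return sorted[n - 1]; // there is no (k + 1)-th value to average with
            }
            return (sorted[k - 1] + sorted[k]) / 2.0;
        }
        int k = scaled / 100 + 1; // round up to the next position
        return sorted[k - 1];
    }

    public static void main(String[] args) {
        double[] durations = { 0.1, 0.2, 0.15, 0.25, 0.5, 0.8, 1.2, 0.3, 0.05, 0.4 };
        for (int p : new int[] { 50, 90, 99 }) {
            System.out.printf("p%d = %.3f%n", p, percentile(durations, p));
        }
    }
}
```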

 

The complete working application is given below.

 

SummaryMetricDemo.java

package com.sample.app;

import com.sun.net.httpserver.HttpServer;
import com.sun.net.httpserver.HttpHandler;
import com.sun.net.httpserver.HttpExchange;
import io.prometheus.client.Summary;
import io.prometheus.client.exporter.common.TextFormat;
import io.prometheus.client.CollectorRegistry;

import java.io.OutputStream;
import java.io.StringWriter;
import java.io.IOException;
import java.net.InetSocketAddress;
import java.util.Enumeration;
import java.util.Random;

public class SummaryMetricDemo {

      // Summary metric for query execution duration
      private static final Summary queryExecutionDuration = Summary.build().name("query_execution_duration_seconds")
                  .help("Duration of query execution in seconds").quantile(0.5, 0.05).quantile(0.9, 0.01)
                  .quantile(0.99, 0.001).register();

      public static void main(String[] args) throws IOException {
            HttpServer server = HttpServer.create(new InetSocketAddress(8080), 0);

            server.createContext("/simulate-query", exchange -> {
                  Summary.Timer timer = queryExecutionDuration.startTimer();
                  try {
                        // Simulate query taking 100–500 ms
                        Thread.sleep(new Random().nextInt(400) + 100);
                        respond(exchange, "Query executed!");
                  } catch (Exception e) {
                        e.printStackTrace();
                  } finally {
                        timer.observeDuration(); // Record time
                  }
            });

            server.createContext("/metrics", new MetricsHandler());

            server.setExecutor(null);
            server.start();
            System.out.println("Server started at http://localhost:8080");
            System.out.println("Hit /simulate-query to simulate DB call and track execution time.");
      }

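      private static void respond(HttpExchange exchange, String message) throws IOException {
            // Send the encoded byte length; String.length() is wrong for non-ASCII text
            byte[] body = message.getBytes(java.nio.charset.StandardCharsets.UTF_8);
            exchange.sendResponseHeaders(200, body.length);
            try (OutputStream os = exchange.getResponseBody()) {
                  os.write(body);
            }
      }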

      static class MetricsHandler implements HttpHandler {
            @Override
            public void handle(HttpExchange exchange) throws IOException {
                  StringWriter writer = new StringWriter();
                  Enumeration<io.prometheus.client.Collector.MetricFamilySamples> samples = CollectorRegistry.defaultRegistry
                              .metricFamilySamples();
                  TextFormat.write004(writer, samples);

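                  // Encode explicitly as UTF-8 instead of relying on the platform default charset
                  byte[] response = writer.toString().getBytes(java.nio.charset.StandardCharsets.UTF_8);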
                  exchange.getResponseHeaders().set("Content-Type", TextFormat.CONTENT_TYPE_004);
                  exchange.sendResponseHeaders(200, response.length);
                  try (OutputStream os = exchange.getResponseBody()) {
                        os.write(response);
                  }
            }
      }
}

Run the application above.

 

You will see the messages below in the console.

 

Server started at http://localhost:8080

Hit /simulate-query to simulate DB call and track execution time.

 

Open the URL ‘http://localhost:8080/metrics’ in a browser.

 


Request the URL http://localhost:8080/simulate-query several times, say 15 times or more.

 

Open ‘http://localhost:8080/metrics’ in the browser again.

 


Metrics Explanation

query_execution_duration_seconds{quantile="0.5",} 0.264421917

50% of the queries were executed in ≤ 0.264421917 seconds

 

query_execution_duration_seconds{quantile="0.9",} 0.474577083

90% of the queries were executed in ≤ 0.474577083 seconds

 

query_execution_duration_seconds{quantile="0.99",} 0.477281333

99% of the queries were executed in ≤ 0.477281333 seconds

 

These quantiles are approximations, not exact values. The Prometheus client library maintains them internally with a streaming algorithm, within the configured error, instead of storing every observation.

 

query_execution_duration_seconds_count 15.0

This shows the total number of observations (query executions).

 

query_execution_duration_seconds_sum 4.252432249

This is the total sum of all recorded durations.

 

You can calculate the average execution time by:

average = sum / count = 4.252432249 / 15 ≈ 0.2835 seconds
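The same calculation can be sketched in Java by reading the _sum and _count samples from the text returned by /metrics. This is a minimal sketch that only handles simple lines without labels; SummaryAverage and sampleValue are hypothetical names, and the input is the sample output shown above.

```java
// Hypothetical helper for illustration only; not part of the Prometheus client.
class SummaryAverage {

    // Returns the value of the first sample line of the form "<metricName> <value>".
    static double sampleValue(String exposition, String metricName) {
        for (String line : exposition.split("\n")) {
            String trimmed = line.trim();
            if (trimmed.startsWith(metricName + " ")) {
                return Double.parseDouble(trimmed.substring(metricName.length()).trim());
            }
        }
        throw new IllegalArgumentException("metric not found: " + metricName);
    }

    public static void main(String[] args) {
        // Sample lines copied from the /metrics output shown in this post.
        String exposition = "query_execution_duration_seconds_count 15.0\n"
                + "query_execution_duration_seconds_sum 4.252432249\n";
        double count = sampleValue(exposition, "query_execution_duration_seconds_count");
        double sum = sampleValue(exposition, "query_execution_duration_seconds_sum");
        System.out.printf("average = %.4f seconds%n", sum / count);
    }
}
```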

 

In summary,

·      You’ve recorded 15 query executions so far.

·      On average, a query takes ~0.2835 seconds.

·      50% of queries finish in under 0.2644s,

·      90% in under 0.4746s,

·      99% in under 0.4773s.

Note

·      Quantiles are calculated on the client side, so they don’t aggregate well across instances.

·      The second argument (error) should be small for precise quantiles (like 0.01 for 1%).
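To see why client-side quantiles do not aggregate well, the sketch below compares the average of two per-instance medians with the median of all observations combined. The durations are made-up numbers for two hypothetical instances, and CrossInstanceMedian is a hypothetical class written only for this illustration.

```java
import java.util.Arrays;

// Hypothetical illustration with made-up data; not part of the Prometheus client.
class CrossInstanceMedian {

    // Exact median: the middle value, or the average of the two middle values.
    static double median(double[] values) {
        double[] sorted = values.clone();
        Arrays.sort(sorted);
        int n = sorted.length;
        if (n % 2 == 1) {
            return sorted[n / 2];
        }
        return (sorted[n / 2 - 1] + sorted[n / 2]) / 2.0;
    }

    // Joins the observations of two instances into one array.
    static double[] combine(double[] a, double[] b) {
        double[] all = Arrays.copyOf(a, a.length + b.length);
        System.arraycopy(b, 0, all, a.length, b.length);
        return all;
    }

    public static void main(String[] args) {
        double[] instanceA = { 0.1, 0.1, 0.2, 0.2, 0.3, 0.3, 0.4 }; // busy, fast instance
        double[] instanceB = { 1.0, 2.0, 3.0 };                     // quiet, slow instance

        double averagedMedians = (median(instanceA) + median(instanceB)) / 2.0;
        double trueMedian = median(combine(instanceA, instanceB));

        System.out.printf("average of per-instance medians = %.2f%n", averagedMedians);
        System.out.printf("median of all observations      = %.2f%n", trueMedian);
    }
}
```

The two results differ widely because each instance's median ignores how many observations the other instance saw. Counts and sums, in contrast, can simply be added across instances.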

 


 
