In AI-driven applications, visibility into what is happening inside your language model interactions is essential for debugging, monitoring, and performance optimization. LangChain4j provides observability hooks for exactly this purpose, most notably the ChatModelListener interface, which is supported by its ChatModel and StreamingChatModel implementations.
In this post, we'll explore how to use LangChain4j's observability features to gain insight into the requests sent to the LLM, the responses received, and any errors that occur along the way.
Why Observability Matters in LLM Applications
· Debugging unpredictable LLM behavior
· Monitoring model usage and performance
· Capturing token usage for billing insights
· Tracking errors systematically
Understanding the Events
LangChain4j captures events aligned with OpenTelemetry's Generative AI Semantic Conventions:
· Request Attributes: messages, model, temperature, top_p, max_tokens, tools, response_format, etc.
· Response Attributes: assistant_message, id, model, token_usage, finish_reason, etc. (token usage is put to work in the sketch after this list)
· Error Attributes: Exception details, stack trace, model info, etc.
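To make the response attributes concrete, below is a minimal sketch of a listener that captures token_usage to feed billing reports. The class name TokenUsageTrackingListener and the per-model aggregation are illustrative assumptions, not part of LangChain4j itself:

import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

import dev.langchain4j.model.chat.listener.ChatModelListener;
import dev.langchain4j.model.chat.listener.ChatModelResponseContext;
import dev.langchain4j.model.chat.response.ChatResponseMetadata;
import dev.langchain4j.model.output.TokenUsage;

// Hypothetical helper: accumulates total token counts per model name,
// e.g. as raw input for billing reports.
public class TokenUsageTrackingListener implements ChatModelListener {

    private final Map<String, AtomicLong> totalTokensByModel = new ConcurrentHashMap<>();

    @Override
    public void onResponse(ChatModelResponseContext responseContext) {
        ChatResponseMetadata metadata = responseContext.chatResponse().metadata();
        TokenUsage usage = metadata.tokenUsage();
        // Some providers may omit token usage, so guard against nulls
        if (usage == null || usage.totalTokenCount() == null) {
            return;
        }
        String model = metadata.modelName() != null ? metadata.modelName() : "unknown";
        totalTokensByModel
                .computeIfAbsent(model, m -> new AtomicLong())
                .addAndGet(usage.totalTokenCount());
    }

    public Map<String, AtomicLong> totalsPerModel() {
        return totalTokensByModel;
    }
}

Attach it via the listeners(...) builder method shown later in this post, and read totalsPerModel() wherever billing data is collected.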
Introduction to the ChatModelListener Interface
The ChatModelListener interface is part of the LangChain4j framework and provides a way to observe and react to events that occur during interactions with a ChatModel. This is particularly useful for logging, debugging, monitoring, or tracing interactions with large language models (LLMs).
ChatModelListener is a listener interface designed to hook into three key lifecycle events of a language model request:
· Before the request is sent to the LLM (onRequest)
· After the response is received from the LLM (onResponse)
· If an error occurs during processing (onError)
It provides access to rich context objects for each of these events, enabling detailed introspection and observability.
public interface ChatModelListener {

    default void onRequest(ChatModelRequestContext requestContext) {
    }

    default void onResponse(ChatModelResponseContext responseContext) {
    }

    default void onError(ChatModelErrorContext errorContext) {
    }
}
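One practical consequence of this design is that the attributes() map on these context objects is shared across the callbacks of a single call, so a listener can carry state from onRequest to onResponse. The sketch below uses that to measure per-call latency; LatencyTrackingListener and the "start-time" key are names invented for this example:

import dev.langchain4j.model.chat.listener.ChatModelListener;
import dev.langchain4j.model.chat.listener.ChatModelRequestContext;
import dev.langchain4j.model.chat.listener.ChatModelResponseContext;

// Hypothetical helper: measures wall-clock latency of each chat call by
// stashing a start timestamp in the shared attributes map during onRequest
// and reading it back in onResponse.
public class LatencyTrackingListener implements ChatModelListener {

    private static final String START_TIME_KEY = "start-time"; // arbitrary key name

    @Override
    public void onRequest(ChatModelRequestContext requestContext) {
        requestContext.attributes().put(START_TIME_KEY, System.nanoTime());
    }

    @Override
    public void onResponse(ChatModelResponseContext responseContext) {
        long startTime = (long) responseContext.attributes().get(START_TIME_KEY);
        long elapsedMs = (System.nanoTime() - startTime) / 1_000_000;
        System.out.println("LLM call took " + elapsedMs + " ms");
    }
}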
onRequest(ChatModelRequestContext requestContext)
Called before the request is sent to the model. It lets you inspect the ChatRequest, including the messages, the model name, and parameters such as temperature, top_p, top_k, and max output tokens.
onResponse(ChatModelResponseContext responseContext)
Called after a successful response is received from the LLM. It gives you access to the generated AiMessage, the response metadata (model used, finish reason, token usage), the original ChatRequest, and any attributes set earlier.
onError(ChatModelErrorContext errorContext)
Called when an exception occurs during model interaction. It provides access to the thrown Throwable, the original ChatRequest, and the attributes map populated in onRequest.
Below is a complete working application.
ChatModelListenerDemo.java
package com.sample.app.observability;

import dev.langchain4j.data.message.UserMessage;
import dev.langchain4j.model.chat.ChatModel;
import dev.langchain4j.model.chat.listener.ChatModelErrorContext;
import dev.langchain4j.model.chat.listener.ChatModelListener;
import dev.langchain4j.model.chat.listener.ChatModelRequestContext;
import dev.langchain4j.model.chat.listener.ChatModelResponseContext;
import dev.langchain4j.model.chat.request.ChatRequest;
import dev.langchain4j.model.chat.request.ChatRequestParameters;
import dev.langchain4j.model.chat.response.ChatResponse;
import dev.langchain4j.model.chat.response.ChatResponseMetadata;
import dev.langchain4j.model.ollama.OllamaChatModel;
import dev.langchain4j.model.output.TokenUsage;

import java.util.List;

public class ChatModelListenerDemo {

    private static ChatModelListener listener = createListener();

    private static ChatModelListener createListener() {
        return new ChatModelListener() {

            @Override
            public void onRequest(ChatModelRequestContext requestContext) {
                System.out.println("\n--- LLM REQUEST ---");

                ChatRequest request = requestContext.chatRequest();
                ChatRequestParameters params = request.parameters();

                System.out.println("Messages:");
                request.messages().forEach(msg -> System.out.println(" " + msg));

                System.out.println("Request Parameters:");
                System.out.println(" Model: " + params.modelName());
                System.out.println(" Temperature: " + params.temperature());
                System.out.println(" Top P: " + params.topP());
                System.out.println(" Top K: " + params.topK());
                System.out.println(" Frequency Penalty: " + params.frequencyPenalty());
                System.out.println(" Presence Penalty: " + params.presencePenalty());
                System.out.println(" Max Output Tokens: " + params.maxOutputTokens());
                System.out.println(" Stop Sequences: " + params.stopSequences());
                System.out.println(" Tool Specifications: " + params.toolSpecifications());
                System.out.println(" Tool Choice: " + params.toolChoice());
                System.out.println(" Response Format: " + params.responseFormat());

                System.out.println("Model Provider: " + requestContext.modelProvider());

                requestContext.attributes().put("my-attribute", "my-value");
            }

            @Override
            public void onResponse(ChatModelResponseContext responseContext) {
                System.out.println("\n--- LLM RESPONSE ---");

                ChatResponse response = responseContext.chatResponse();
                ChatResponseMetadata metadata = response.metadata();
                TokenUsage tokenUsage = metadata.tokenUsage();

                System.out.println("Assistant Message: " + response.aiMessage());

                System.out.println("Response Metadata:");
                System.out.println(" ID: " + metadata.id());
                System.out.println(" Model: " + metadata.modelName());
                System.out.println(" Finish Reason: " + metadata.finishReason());

                System.out.println("Token Usage:");
                System.out.println(" Input Tokens: " + tokenUsage.inputTokenCount());
                System.out.println(" Output Tokens: " + tokenUsage.outputTokenCount());
                System.out.println(" Total Tokens: " + tokenUsage.totalTokenCount());

                System.out.println("Original Request: " + responseContext.chatRequest());
                System.out.println("Model Provider: " + responseContext.modelProvider());

                System.out.println("Custom Attributes:");
                System.out.println(" my-attribute: " + responseContext.attributes().get("my-attribute"));
            }

            @Override
            public void onError(ChatModelErrorContext errorContext) {
                System.out.println("\n--- LLM ERROR ---");

                System.out.println("Exception:");
                errorContext.error().printStackTrace(System.out);

                System.out.println("Related Request: " + errorContext.chatRequest());
                System.out.println("Model Provider: " + errorContext.modelProvider());

                System.out.println("Custom Attributes:");
                System.out.println(" my-attribute: " + errorContext.attributes().get("my-attribute"));
            }
        };
    }

    public static void main(String[] args) {

        // Create the Ollama model with listener
        ChatModel chatModel = OllamaChatModel.builder()
                .baseUrl("http://localhost:11434")
                .modelName("llama3.2")
                .listeners(List.of(listener))
                .build();

        // Build the request
        ChatRequest chatRequest = ChatRequest.builder()
                .messages(List.of(UserMessage.from(
                        "What are the benefits of using AI in education, just explain in 2 lines?")))
                .temperature(0.7)
                .maxOutputTokens(200)
                .build();

        // Get response
        ChatResponse response = chatModel.chat(chatRequest);

        // Print final AI output
        System.out.println("\nFinal AI Response: " + response.aiMessage());
    }
}
Output
SLF4J(W): No SLF4J providers were found.
SLF4J(W): Defaulting to no-operation (NOP) logger implementation
SLF4J(W): See https://www.slf4j.org/codes.html#noProviders for further details.
--- LLM REQUEST ---
Messages:
 UserMessage { name = null contents = [TextContent { text = "What are the benefits of using AI in education, just explain in 2 lines?" }] }
Request Parameters:
 Model: llama3.2
 Temperature: 0.7
 Top P: null
 Top K: null
 Frequency Penalty: null
 Presence Penalty: null
 Max Output Tokens: 200
 Stop Sequences: []
 Tool Specifications: []
 Tool Choice: null
 Response Format: null
Model Provider: OLLAMA
--- LLM RESPONSE ---
Assistant Message: AiMessage { text = "The use of AI in education offers several benefits, including personalized learning experiences, improved student engagement, and enhanced academic performance. Additionally, AI can automate administrative tasks, freeing up educators to focus on more hands-on teaching and mentoring roles." toolExecutionRequests = [] }
Response Metadata:
 ID: null
 Model: llama3.2
 Finish Reason: STOP
Token Usage:
 Input Tokens: 42
 Output Tokens: 47
 Total Tokens: 89
Original Request: ChatRequest { messages = [UserMessage { name = null contents = [TextContent { text = "What are the benefits of using AI in education, just explain in 2 lines?" }] }], parameters = OllamaChatRequestParameters{modelName="llama3.2", temperature=0.7, topP=null, topK=null, frequencyPenalty=null, presencePenalty=null, maxOutputTokens=200, stopSequences=[], toolSpecifications=[], toolChoice=null, responseFormat=null, mirostat=null, mirostatEta=null, mirostatTau=null, numCtx=null, repeatLastN=null, repeatPenalty=null, seed=null, minP=null, keepAlive=null} }
Model Provider: OLLAMA
Custom Attributes:
 my-attribute: my-value
Final AI Response: AiMessage { text = "The use of AI in education offers several benefits, including personalized learning experiences, improved student engagement, and enhanced academic performance. Additionally, AI can automate administrative tasks, freeing up educators to focus on more hands-on teaching and mentoring roles." toolExecutionRequests = [] }
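The run above never exercises onError. A simple way to see that callback fire (an assumption for experimentation, reusing the listener from the demo) is to point the model at a port where no Ollama server is listening; the listener is notified first and the exception then propagates to the caller:

// Assumes nothing is listening on localhost:9999, so the HTTP call fails fast
ChatModel failingModel = OllamaChatModel.builder()
        .baseUrl("http://localhost:9999")
        .modelName("llama3.2")
        .listeners(List.of(listener))
        .build();

try {
    failingModel.chat(ChatRequest.builder()
            .messages(List.of(UserMessage.from("Hello")))
            .build());
} catch (Exception e) {
    // By the time the exception reaches here, onError has already been invoked
    System.out.println("Caught: " + e.getMessage());
}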