Monday, 23 June 2025

LangChain4j: Chat Models Overview

ChatModels in LangChain4j are designed to handle conversational AI interactions. These models accept multiple input messages, each represented as a ChatMessage, and produce a single output message known as an AiMessage.

Typically, ChatMessage objects contain textual content, but some advanced large language models (LLMs) also support additional input types or modalities such as images, audio clips, or other data formats. For example, popular chat models like OpenAI’s gpt-4o-mini and Google’s gemini-1.5-pro fall into this category, enabling rich, multi-modal conversational experiences.

The ChatModel interface represents the fundamental, low-level API for communicating with LLMs within LangChain4j. It provides developers with the highest degree of control and flexibility when building conversational applications, allowing customization of inputs, outputs, and model behavior at a granular level.

1. Other Model Types Supported by LangChain4j

In addition to the core ChatModel, LangChain4j supports several other specialized model interfaces, each tailored to a different AI task:

·      EmbeddingModel: This model focuses on converting text into numerical embeddings. These embeddings are useful for tasks such as similarity search, clustering, and retrieval-augmented generation (RAG).

·      ImageModel: Designed for visual AI tasks, the ImageModel can generate new images from textual prompts and supports editing or manipulating existing images. This is especially useful for creative applications, design, and content generation.

·      ModerationModel: This model performs content moderation by analyzing text inputs and determining whether they contain harmful or inappropriate content. It helps to ensure that generated or processed text complies with safety and ethical guidelines.

·      ScoringModel: The scoring model evaluates and ranks multiple pieces of text relative to a given query. It assigns relevance scores that help identify which text fragments best match the intent of the query. This functionality is particularly important for retrieval-augmented generation workflows, where selecting the most pertinent documents or data snippets is critical.
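
As a concrete illustration of how embeddings enable similarity search and scoring, the sketch below compares toy vectors with cosine similarity, the standard relevance measure for embeddings. The vectors and their 3 dimensions are made up for illustration; a real EmbeddingModel produces vectors with hundreds or thousands of dimensions.

```java
public class CosineSimilaritySketch {

    // Cosine similarity: dot product divided by the product of vector lengths
    static double cosine(double[] a, double[] b) {
        double dot = 0, na = 0, nb = 0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            na += a[i] * a[i];
            nb += b[i] * b[i];
        }
        return dot / (Math.sqrt(na) * Math.sqrt(nb));
    }

    public static void main(String[] args) {
        // Toy 3-dimensional "embeddings" (hypothetical values)
        double[] query = {1.0, 0.0, 1.0};
        double[] docA  = {0.9, 0.1, 0.8};   // points in a similar direction -> high score
        double[] docB  = {0.0, 1.0, 0.0};   // orthogonal to the query -> score of 0

        System.out.printf(java.util.Locale.ROOT, "docA: %.3f%n", cosine(query, docA));
        System.out.printf(java.util.Locale.ROOT, "docB: %.3f%n", cosine(query, docB));
    }
}
```

A retriever in a RAG pipeline applies exactly this idea at scale: embed the query, score every candidate document, and keep the highest-scoring ones.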

These model types will be explored in greater detail later, illustrating their specific use cases and how they integrate within the LangChain4j framework.

2. Exploring the ChatModel API in LangChain4j

Let’s now dive deeper into the ChatModel interface provided by LangChain4j, which serves as the primary way to interact with chat-based language models.

At its simplest, the interface includes a convenience method:

public interface ChatModel {
    String chat(String userMessage);
    ...
}

This method allows you to send a plain String message and receive a plain String response. It's designed for quick experimentation or simple use cases where you want to get started without dealing with more complex message structures. Internally, the String input is automatically wrapped as a UserMessage, making it easier to prototype interactions.
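
To make the wrapping behavior concrete, here is a minimal, self-contained sketch of the pattern: the convenience overload is a default method that wraps the plain String in a user message before delegating to the structured method. Note that these are simplified stand-in types for illustration, not the real LangChain4j classes.

```java
import java.util.List;

public class ChatWrapperSketch {

    // Simplified stand-in message types (not the real LangChain4j classes)
    record UserMessage(String text) {}
    record AiMessage(String text) {}

    // Simplified interface mirroring the convenience-overload pattern
    interface ChatModel {
        AiMessage chat(List<UserMessage> messages);

        // Convenience overload: wraps the plain String into a UserMessage
        default String chat(String userMessage) {
            return chat(List.of(new UserMessage(userMessage))).text();
        }
    }

    public static void main(String[] args) {
        // A toy model that just echoes the last user message
        ChatModel echoModel = messages ->
                new AiMessage("Echo: " + messages.get(messages.size() - 1).text());

        System.out.println(echoModel.chat("Hello"));
    }
}
```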

Advanced chat Methods

For more control and flexibility, the ChatModel interface also provides overloaded chat methods that accept structured message inputs.

public interface ChatModel {
    ChatResponse chat(ChatRequest chatRequest);
    ChatResponse chat(ChatMessage... messages);
    ChatResponse chat(List<ChatMessage> messages);
}

These methods allow you to pass one or more ChatMessage objects. ChatMessage is the core abstraction representing the different types of messages in a conversation (e.g., UserMessage, AiMessage, SystemMessage). Using these methods enables richer conversational flows, such as maintaining dialogue history or incorporating system-level instructions.
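
The value of passing a list rather than a single String is that the model sees the whole history, so follow-up questions resolve correctly. The sketch below models this with simplified stand-in message types (again, not the real LangChain4j classes) and prints the transcript that would be sent:

```java
import java.util.List;

public class ConversationSketch {

    // Simplified stand-ins for LangChain4j's message hierarchy
    sealed interface ChatMessage permits SystemMessage, UserMessage, AiMessage {
        String text();
        String role();
    }
    record SystemMessage(String text) implements ChatMessage { public String role() { return "system"; } }
    record UserMessage(String text)   implements ChatMessage { public String role() { return "user"; } }
    record AiMessage(String text)     implements ChatMessage { public String role() { return "ai"; } }

    public static void main(String[] args) {
        // A multi-turn history: system instruction, a prior exchange, a new question
        List<ChatMessage> history = List.of(
                new SystemMessage("You are a helpful tutor."),
                new UserMessage("What is an embedding?"),
                new AiMessage("A numeric vector representing text."),
                new UserMessage("What is it used for?"));

        // Sending the whole list is what lets the model resolve "it"
        // in the last question against the earlier turns.
        for (ChatMessage m : history) {
            System.out.println(m.role() + ": " + m.text());
        }
    }
}
```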

Customizing Chat Behavior with ChatRequest

For scenarios where you want full control over the behavior of the model, LangChain4j provides the chat(ChatRequest) method. This method allows you to customize several parameters that affect how the model generates responses. Some of the options you can configure include:

·      Model Name: Specify which underlying LLM to use (e.g., gpt-4, gemini-1.5-pro).

·      Temperature: Controls the randomness of the output; higher values produce more creative responses.

·      Tools: Provide external tools the model can call during the chat.

·      Frequency Penalty: Discourage repetition by penalizing frequent tokens.

·      Tool Choice: Direct the model to use a specific tool, if applicable.

·      Response Format: Specify output format preferences (e.g., JSON, plain text).

·      Max Output Tokens: Limit the maximum number of tokens in the response.

By using ChatRequest, developers can finely tune how the model behaves to meet a wide range of application requirements, from casual chatbots to structured question-answering systems or tool-augmented agents.
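
Temperature is worth a closer look. Independently of LangChain4j, the sketch below shows what the parameter does conceptually: the model's raw token scores (logits) are divided by the temperature before a softmax, so a low temperature sharpens the distribution toward the top token while a high temperature flattens it. The logits here are toy values, not from a real model.

```java
public class TemperatureSketch {

    // Softmax over logits scaled by 1/temperature
    static double[] softmax(double[] logits, double temperature) {
        double[] p = new double[logits.length];
        double sum = 0;
        for (int i = 0; i < logits.length; i++) {
            p[i] = Math.exp(logits[i] / temperature);
            sum += p[i];
        }
        for (int i = 0; i < p.length; i++) p[i] /= sum;
        return p;
    }

    public static void main(String[] args) {
        double[] logits = {2.0, 1.0, 0.0};  // toy raw preferences for 3 tokens

        // Low temperature: probability mass concentrates on the top token
        System.out.printf(java.util.Locale.ROOT,
                "T=0.5 top prob: %.2f%n", softmax(logits, 0.5)[0]);
        // High temperature: distribution flattens, output becomes more varied
        System.out.printf(java.util.Locale.ROOT,
                "T=2.0 top prob: %.2f%n", softmax(logits, 2.0)[0]);
    }
}
```

This is why the document's later example uses 0.7: enough randomness for varied phrasing, while still strongly favoring the most plausible tokens.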

How to Obtain a ChatModel Instance?

To interact with a chat-based language model like llama3.2 hosted on your local Ollama server, you need to first create a ChatModel instance. This model acts as the interface to communicate with the underlying LLM.

ChatModel chatModel = OllamaChatModel.builder()
    .baseUrl("http://localhost:11434")  // URL of the locally running Ollama server
    .modelName("llama3.2")              // Name of the model you want to use
    .build();

This code initializes a ChatModel that connects to Ollama and uses the specified model. You can use this model instance to send chat messages and receive AI-generated responses.

How to Construct a ChatRequest?

Once you have a ChatModel, you need to prepare a ChatRequest. This request object defines what message you’re sending and how the model should behave when generating the response.

ChatRequest chatRequest = ChatRequest.builder()
    .messages(...)                  // One or more ChatMessages (e.g., UserMessage, AiMessage, etc.)
    .modelName(...)                 // Optional: specify the model name
    .temperature(...)               // Controls randomness in responses (0 = deterministic, 1 = creative)
    .topP(...)                      // Nucleus sampling, filters tokens with low probability
    .topK(...)                      // Top-K sampling, considers only the top K likely tokens
    .frequencyPenalty(...)          // Reduces repeated tokens
    .presencePenalty(...)           // Encourages introducing new tokens
    .maxOutputTokens(...)           // Limits the number of output tokens
    .stopSequences(...)             // Sequences that tell the model when to stop generating
    .toolSpecifications(...)        // Define tools the model can call during execution
    .toolChoice(...)                // Specify which tool to use if multiple are available
    .responseFormat(...)            // Control the format of the model output (e.g., JSON, plain text)
    .parameters(...)                // Optionally pass a map of general or provider-specific parameters
    .build();

This structure gives you complete control over the interaction with the model.

Here’s a practical example of creating a ChatRequest that sends a question to the model:

ChatRequest chatRequest = ChatRequest.builder()
    .messages(Collections.singletonList(
        UserMessage.from("What are the benefits of using AI in education?")
    ))
    .temperature(0.7)
    .maxOutputTokens(200)
    .build();

In this example:

·      We're sending a single user message.

·      We set temperature to 0.7 for a balance between creativity and control.

·      We limit the output to 200 tokens.

This request can now be passed to the chatModel.chat(chatRequest) method to get a response from the model.

Below is the complete working application.

ChatModelHelloWorld.java

package com.sample.app.chatmodels;

import java.util.Collections;

import dev.langchain4j.data.message.UserMessage;
import dev.langchain4j.model.chat.ChatModel;
import dev.langchain4j.model.chat.request.ChatRequest;
import dev.langchain4j.model.chat.response.ChatResponse;
import dev.langchain4j.model.ollama.OllamaChatModel;

public class ChatModelHelloWorld {

    public static void main(String[] args) {
        // Create the Ollama language model
        ChatModel chatModel = OllamaChatModel.builder().baseUrl("http://localhost:11434").modelName("llama3.2").build();

        // Build a ChatRequest with custom parameters
        ChatRequest chatRequest = ChatRequest.builder()
                .messages(
                        Collections.singletonList(UserMessage.from("What are the benefits of using AI in education?")))
                .temperature(0.7).maxOutputTokens(200).build();

        // Send the request and receive the response
        ChatResponse chatResponse = chatModel.chat(chatRequest);

        // Print the AI response
        System.out.println("AI Response: " + chatResponse.aiMessage());
    }
}

Output

AI Response: AiMessage { text = "The benefits of using Artificial Intelligence (AI) in education are numerous and can have a significant impact on the learning experience. Some of the key advantages include:

1. **Personalized Learning**: AI can help create personalized learning plans tailored to individual students' needs, abilities, and learning styles.
2. **Intelligent Tutoring Systems**: AI-powered tutoring systems can provide one-on-one support to students, offering real-time feedback and guidance.
3. **Automated Grading**: AI can automate the grading process, freeing up instructors to focus on more hands-on, human-centered aspects of teaching.
4. **Data-Driven Insights**: AI can analyze vast amounts of data from student performance, helping educators identify areas where students need extra support.
5. **Enhanced Accessibility**: AI-powered tools can provide accommodations for students with disabilities, such as text-to-speech software or speech recognition technology.
6. **Improved Engagement**: AI-driven educational content can be designed to be more engaging and interactive, increasing student motivation" toolExecutionRequests = [] }
