This post is intended for Java developers using LangChain4j who want to understand how to configure language models like Ollama to control generation behavior (e.g., creativity, response length) and connection settings (e.g., timeouts, retries).
It showcases how to use model parameters effectively to fine-tune performance and reliability of AI-powered applications.
LangChain4j offers a clean and flexible API to integrate various language models into your Java applications. When configuring a model like Ollama, it’s important to understand the available parameters that can help you:
· Control the creativity and determinism of the output
· Specify timeouts, retries, and logging behavior
· Point to the correct model and API endpoint
Below is an example of how to set up the OllamaLanguageModel with custom parameters:
OllamaLanguageModel model = OllamaLanguageModel.builder()
        .baseUrl("http://localhost:11434")   // Connect to your local Ollama instance
        .modelName("llama3.2")               // Choose the model to use (e.g., llama3.2)
        .temperature(0.3)                    // Balance between creativity and coherence
        .maxRetries(2)                       // Retry up to 2 times on failure
        .timeout(Duration.ofMinutes(1))      // Set a timeout to avoid hanging requests
        .build();
What These Parameters Do
· baseUrl: The URL where your model API is hosted (e.g., a local Ollama instance).
· modelName: The specific model to use, as defined in your model provider's documentation.
· temperature: Controls randomness in output. Lower values make responses more deterministic, higher values make them more creative.
· maxRetries: Number of retry attempts in case of failures like network timeouts.
· timeout: Maximum duration to wait for a model response before timing out.
For detailed parameter options, consult the official API documentation of the model provider. For Ollama, visit Ollama API Docs (https://github.com/ollama/ollama/blob/main/docs/api.md).
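To see how temperature influences output, here is a minimal sketch (reusing the same local Ollama setup and imports as the full example below) that sends the same prompt with a low and a high temperature. The exact responses will vary from run to run, especially at the higher setting:

String prompt = "Suggest a name for a coffee shop";

OllamaLanguageModel focused = OllamaLanguageModel.builder()
        .baseUrl("http://localhost:11434")
        .modelName("llama3.2")
        .temperature(0.1)   // near-deterministic: repeated calls return very similar answers
        .build();

OllamaLanguageModel creative = OllamaLanguageModel.builder()
        .baseUrl("http://localhost:11434")
        .modelName("llama3.2")
        .temperature(0.9)   // more random: repeated calls vary noticeably
        .build();

System.out.println("Low temperature : " + focused.generate(prompt).content());
System.out.println("High temperature: " + creative.generate(prompt).content());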
Below is a complete working application.
ModelParamsDemo.java
package com.sample.app.chatmodels;

import java.time.Duration;

import dev.langchain4j.model.ollama.OllamaLanguageModel;
import dev.langchain4j.model.output.Response;

public class ModelParamsDemo {

    public static void main(String[] args) {
        // Build the model with custom connection and generation parameters
        OllamaLanguageModel model = OllamaLanguageModel.builder()
                .baseUrl("http://localhost:11434")
                .modelName("llama3.2")
                .temperature(0.3)
                .maxRetries(2)
                .timeout(Duration.ofMinutes(1))
                .build();

        String prompt = "Tell me some Interesting Fact About LLMs in maximum 30 words";

        Response<String> response = model.generate(prompt);
        System.out.println("Response: " + response.content());
    }
}
Output
Response: LLMs (Large Language Models) can generate human-like text, answer questions, and even create original content, leveraging vast amounts of training data to learn patterns and relationships in language.
The following tables summarize the various parameters of OllamaLanguageModel.
Connection & Configuration Parameters
| Parameter | Type | Description |
| --- | --- | --- |
| baseUrl | String | The base URL of the Ollama API endpoint (e.g., http://localhost:11434). Required to send requests to the model. |
| modelName | String | The name of the model to use, such as "llama3.2" or "llama4". This must match a model installed in your Ollama instance. |
| timeout | Duration | The maximum time to wait for a response from the Ollama server. Helps prevent hanging requests. |
| maxRetries | Integer | Number of retry attempts in case of failures like timeouts or transient errors. |
| logRequests | Boolean | If true, logs the request sent to the model. Useful for debugging or auditing. |
| logResponses | Boolean | If true, logs the full response received from the model. Helpful for analysis and debugging. |
| customHeaders | Map<String, String> | Additional HTTP headers to send with each request. Often used for custom auth tokens or tracing headers. |
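As a small sketch of the logging and header options listed above (assuming your LangChain4j version exposes these builder methods; the header name and value are purely illustrative, and the snippet additionally needs java.util.Map and java.util.HashMap imports), you could configure a model like this:

Map<String, String> headers = new HashMap<>();
headers.put("X-Trace-Id", "demo-123");        // hypothetical tracing header

OllamaLanguageModel debugModel = OllamaLanguageModel.builder()
        .baseUrl("http://localhost:11434")
        .modelName("llama3.2")
        .timeout(Duration.ofMinutes(1))
        .maxRetries(2)
        .logRequests(true)                    // log each outgoing request
        .logResponses(true)                   // log each raw response
        .customHeaders(headers)               // extra HTTP headers sent with every call
        .build();

The request/response logs are emitted through your application's logging framework (LangChain4j uses SLF4J), so make sure logging is configured before expecting any output.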
Model Behavior Parameters
| Parameter | Type | Description |
| --- | --- | --- |
| temperature | Double | Controls randomness in output. Typical range is 0–1 (some models accept higher values). Lower = more deterministic; higher = more creative or diverse outputs. |
| topK | Integer | Limits next-token selection to the K tokens with the highest probability. Encourages focused output. |
| topP | Double | Enables nucleus sampling: chooses from the smallest set of tokens whose cumulative probability exceeds topP. Values near 1.0 retain more randomness. |
| repeatPenalty | Double | Penalizes tokens that have already appeared, reducing repetition. Typical values: 1.0 (no penalty); >1.0 discourages repetition. |
| seed | Integer | Sets the seed for deterministic outputs: the same input will always generate the same output. Useful for debugging or consistent testing. |
| numPredict | Integer | Maximum number of tokens to generate in the response. Acts as a limit on output length. |
| numCtx | Integer | Context window size (number of tokens considered). Typically model-dependent. Affects how much previous conversation or text is remembered. |
| stop | List<String> | A list of stop sequences. Generation stops when any of these strings is encountered. Useful for ending conversations or truncating replies at logical boundaries. |
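Putting several of these behavior parameters together, here is a sketch of a more tightly constrained configuration (the specific values are illustrative starting points rather than recommendations, it assumes your LangChain4j version exposes all of these builder methods, and it needs a java.util.List import):

OllamaLanguageModel tunedModel = OllamaLanguageModel.builder()
        .baseUrl("http://localhost:11434")
        .modelName("llama3.2")
        .temperature(0.2)                     // mostly deterministic
        .topK(40)                             // sample only from the 40 most likely tokens
        .topP(0.9)                            // nucleus sampling threshold
        .repeatPenalty(1.1)                   // mildly discourage repetition
        .seed(42)                             // reproducible output for identical input
        .numPredict(100)                      // cap the response at roughly 100 tokens
        .numCtx(4096)                         // context window size (model-dependent)
        .stop(List.of("\n\n"))                // stop generating at the first blank line
        .build();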
Formatting Parameters
| Parameter | Type | Description |
| --- | --- | --- |
| format | String | Output format, typically null or "json" depending on the model's capability. Not always used. |
| responseFormat | ResponseFormat (enum) | Specifies the expected response format (e.g., raw text vs. structured JSON). Used internally by LangChain4j to parse model output appropriately. Example values: TEXT, JSON. |
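For example, if you want the model to return JSON, you can combine a JSON-oriented prompt with the format parameter from the table above. This is a sketch: the prompt and the downstream parsing are up to you, and whether format or responseFormat is the right option depends on your LangChain4j version.

OllamaLanguageModel jsonModel = OllamaLanguageModel.builder()
        .baseUrl("http://localhost:11434")
        .modelName("llama3.2")
        .format("json")                       // ask Ollama to return well-formed JSON
        .timeout(Duration.ofMinutes(1))
        .build();

String prompt = "List three programming languages as a JSON array of objects with a \"name\" field.";
Response<String> response = jsonModel.generate(prompt);
System.out.println(response.content());       // raw JSON string; parse it with your preferred JSON library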
Why Fine-Tuning Model Parameters Matters
Setting these parameters appropriately can significantly affect:
· The quality of generated content
· The responsiveness and resilience of your app
· The cost (if using a hosted model with rate limits or billing)
In summary, LangChain4j makes it easy to plug in various models and customize them per your use case. Whether you're building chatbots, summarization tools, or code assistants, configuring the model correctly is key to delivering a smooth developer and user experience.