Programming for beginners: Smart Query Routing in Langchain4j: From Default Strategies to LLM-Powered Decisions

In retrieval-augmented generation (RAG) systems, deciding where to send a user’s query is just as important as how to answer it. As applications scale to use multiple data sources like internal documents, product databases, chat history, or public knowledge, the system needs a smart way to route each query to the right data sources. That’s where QueryRouter in Langchain4j comes in.

This post explains how Langchain4j’s QueryRouter interface enables intelligent query routing, compares its built-in implementations, and explores how developers can adapt this routing logic to fit their applications.

1. What is Query Routing in RAG Systems?

Query routing refers to the process of deciding which data sources a user's query should be sent to, especially when there are multiple potential sources of information. In the context of RAG (Retrieval-Augmented Generation) systems, this becomes critical because:

· The system doesn't know the answer itself; it must retrieve supporting content from external sources before generating a response.

· These sources can be diverse: product FAQs, code documentation, user chat history, financial records, or even web search results.

· Query routing ensures that each query is routed only to the most relevant source(s), rather than flooding all sources with every question.

Example: Imagine you’re in a library that has sections for science, history, fiction, law, and art. If you ask a question like “What is Newton’s First Law?”, it would be wasteful (and confusing) for the librarian to check every section. Instead, the librarian should route your question directly to the science section. This is exactly what query routing does in a RAG system: it plays the role of that intelligent librarian.

2. Why Is Irrelevant Data Retrieval a Problem?

Without proper query routing, the system might send every query to all available retrievers or databases. This creates several issues:

· Performance Overhead: More retrievers mean more computation. Fetching irrelevant documents increases processing time and cost, especially when retrieval involves embeddings or LLM calls.

· Latency Increase: Retrieving from every source leads to slower response times, especially problematic in real-time chat applications.

· Answer Quality Degradation: Including unrelated documents in the context window can confuse the language model, leading to hallucinations or diluted answers.

· Wasted API/Infra Costs: Many retrieval services are usage-based. Querying unnecessary retrievers drives up costs without improving results.

3. Benefits of Effective Query Routing

Smart query routing enables:

· Higher Accuracy: By routing queries only to relevant sources, the retrieved content is more likely to be useful, improving the quality of the LLM’s response.

· Lower Latency: Querying fewer retrievers leads to faster retrieval and quicker final responses.

· Scalability: As you add more data sources (e.g., documents, APIs, databases), you can continue scaling without overwhelming the system.

· Context Optimization: It avoids context window overflow and preserves space for genuinely helpful content.

· Improved User Experience: Faster, more accurate, and less noisy responses directly improve user trust and satisfaction.

4. QueryRouter Interface in Langchain4j

The QueryRouter interface in Langchain4j is designed to route a Query object to one or more ContentRetriever instances. This decision is typically based on the nature of the query, user metadata, or semantic meaning.

Here is the contract:

public interface QueryRouter {
    Collection<ContentRetriever> route(Query query);
}

When a query arrives, the route() method decides which ContentRetrievers (e.g., vector stores, database retrievers, document retrievers) should handle the query. The goal is to avoid retrieving irrelevant content and optimize the downstream generation step.
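To make the contract concrete, here is a minimal rule-based sketch of the same idea. It deliberately uses simplified stand-in types (SketchRetriever, SketchRouter) instead of the real Langchain4j Query and ContentRetriever classes, so it runs on a plain JDK. The class names, retriever names, and keyword lists are all assumptions made up for illustration.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Stand-in for ContentRetriever (hypothetical, illustration only).
interface SketchRetriever {
  String name();
}

// Stand-in for QueryRouter: takes a plain String instead of a Query object.
interface SketchRouter {
  Collection<SketchRetriever> route(String query);
}

// Hypothetical rule-based router: selects every retriever whose keywords occur in the query.
final class KeywordRouterSketch implements SketchRouter {

  // LinkedHashMap keeps registration order, so results are deterministic.
  private final Map<SketchRetriever, List<String>> keywordsByRetriever = new LinkedHashMap<>();

  KeywordRouterSketch register(String retrieverName, String... lowerCaseKeywords) {
    SketchRetriever retriever = () -> retrieverName;
    keywordsByRetriever.put(retriever, List.of(lowerCaseKeywords));
    return this;
  }

  @Override
  public Collection<SketchRetriever> route(String query) {
    String normalized = query.toLowerCase(Locale.ROOT);
    List<SketchRetriever> selected = new ArrayList<>();
    for (Map.Entry<SketchRetriever, List<String>> entry : keywordsByRetriever.entrySet()) {
      if (entry.getValue().stream().anyMatch(normalized::contains)) {
        selected.add(entry.getKey());
      }
    }
    return selected;
  }

  // Names of the retrievers chosen for a query, in registration order.
  List<String> routedNames(String query) {
    List<String> names = new ArrayList<>();
    for (SketchRetriever retriever : route(query)) {
      names.add(retriever.name());
    }
    return names;
  }

  // Example wiring with made-up keywords.
  static KeywordRouterSketch example() {
    return new KeywordRouterSketch()
        .register("hr", "leave", "benefit")
        .register("it", "vpn", "password")
        .register("finance", "reimbursement", "payslip");
  }

  public static void main(String[] args) {
    KeywordRouterSketch router = example();
    System.out.println(router.routedNames("I'm facing issues connecting to the VPN."));
    System.out.println(router.routedNames("How do I file a reimbursement for a business trip?"));
  }
}
```

A keyword router like this is cheap and predictable, but it only matches literal words. That limitation is what motivates the LLM-based routing covered later in this post.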

Built-in Implementations of QueryRouter in Langchain4j

Langchain4j provides two implementations of the QueryRouter interface, each designed for different stages of application complexity and scalability:

· DefaultQueryRouter: simple and suitable for most basic use cases.

· LanguageModelQueryRouter: intelligent, LLM-powered routing for dynamic, content-aware decisions.

4.1 DefaultQueryRouter

DefaultQueryRouter is the simplest and most straightforward implementation of the QueryRouter interface. It is also the default routing mechanism used internally by the DefaultRetrievalAugmentor.

public class DefaultQueryRouter implements QueryRouter {
    // routes to all configured ContentRetrievers
}

This router sends every query to all the retrievers that were provided to it during construction. It assumes all retrievers are relevant for any query, regardless of the query’s content or context.
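The fan-out behaviour described above can be modelled in a few lines. This is a plain-JDK sketch with retrievers represented as strings, not the library's source; RouteToAllSketch and its method names are hypothetical.

```java
import java.util.List;

// Hypothetical model of "route everything everywhere"; not the Langchain4j implementation.
final class RouteToAllSketch {

  private final List<String> retrievers;

  RouteToAllSketch(List<String> retrievers) {
    this.retrievers = List.copyOf(retrievers);
  }

  // The query content is ignored on purpose: every retriever is always returned.
  List<String> route(String query) {
    return retrievers;
  }

  // Total number of retriever calls needed to serve a batch of queries.
  int retrievalCalls(List<String> queries) {
    int calls = 0;
    for (String query : queries) {
      calls += route(query).size();
    }
    return calls;
  }

  public static void main(String[] args) {
    RouteToAllSketch router = new RouteToAllSketch(List.of("hr", "it", "finance"));
    System.out.println(router.route("How do I reset my email password?"));
    System.out.println(router.retrievalCalls(List.of("q1", "q2", "q3", "q4", "q5")));
  }
}
```

In this model the number of retrieval calls is always queries × retrievers (five queries against three retrievers means fifteen calls), which is why the approach becomes expensive as retrievers are added.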

This implementation is ideal when:

· You have a small number of retrievers (e.g., 2–3).

· All retrievers are semantically similar or overlapping.

· You are in the early stage of building your RAG system.

· Performance and routing precision are not major concerns yet.

Limitations

· May lead to performance degradation when the number of retrievers increases.

· Fetches potentially irrelevant documents, which can confuse the language model.

· Not suitable for use cases requiring domain-specific or context-aware routing.

The class documentation notes that while DefaultQueryRouter is intended to suit most use cases today, its default behavior may evolve over time if better routing logic becomes standard.

4.2 LanguageModelQueryRouter

LanguageModelQueryRouter brings intelligent, dynamic routing to the system. It uses an LLM (via a ChatModel) to decide which retriever(s) are most relevant for a given query.

public class LanguageModelQueryRouter implements QueryRouter {
    // uses an LLM to choose retrievers based on the query
}

How It Works

Each ContentRetriever is configured with a natural language description (e.g., “Handles queries related to HR policies.”). When a query arrives, the LLM is prompted with:

· The query itself.

· Descriptions of all available retrievers.

The model then selects one or more retrievers based on its semantic understanding of the query and the descriptions.

This router uses the following prompt to find suitable content retrievers:

   public static final PromptTemplate DEFAULT_PROMPT_TEMPLATE = PromptTemplate.from(
            """
                    Based on the user query, determine the most suitable data source(s) \
                    to retrieve relevant information from the following options:
                    {{options}}
                    It is very important that your answer consists of either a single number \
                    or multiple numbers separated by commas and nothing else!
                    User query: {{query}}"""
    );

You can even configure a custom prompt template (promptTemplate) to control how the LLM interprets and decides routing.
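To see what the prompt implies mechanically, the sketch below numbers the retriever descriptions, fills the two placeholders, and parses a reply such as "3" or "1, 3" back into option numbers. It is a plain-JDK illustration under my own assumptions: RoutingPromptSketch and its methods are hypothetical helpers, the "number: description" option format is a guess, and the library's internal implementation may differ.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper for illustration; the library's internals may differ.
final class RoutingPromptSketch {

  // Numbers each description so the model can answer with option numbers.
  static String formatOptions(List<String> descriptions) {
    StringBuilder options = new StringBuilder();
    for (int i = 0; i < descriptions.size(); i++) {
      if (i > 0) {
        options.append('\n');
      }
      options.append(i + 1).append(": ").append(descriptions.get(i));
    }
    return options.toString();
  }

  // Fills the {{options}} and {{query}} placeholders of a template string.
  static String fillTemplate(String template, String options, String query) {
    return template.replace("{{options}}", options).replace("{{query}}", query);
  }

  // Parses a reply such as "2" or "1, 3" into distinct, valid 1-based option numbers.
  static List<Integer> parseChoices(String reply, int optionCount) {
    Set<Integer> choices = new LinkedHashSet<>();
    for (String token : reply.split(",")) {
      // Integer.parseInt throws NumberFormatException if the model added extra text.
      int choice = Integer.parseInt(token.trim());
      if (choice < 1 || choice > optionCount) {
        throw new IllegalArgumentException("Unknown option: " + choice);
      }
      choices.add(choice);
    }
    return new ArrayList<>(choices);
  }

  public static void main(String[] args) {
    List<String> descriptions = List.of(
        "Provides info on employee benefits and leave policies.",
        "Handles technical issues like password resets and VPN.",
        "Answers queries related to reimbursements and taxes.");
    String options = formatOptions(descriptions);
    System.out.println(fillTemplate(
        "Options:\n{{options}}\nUser query: {{query}}",
        options,
        "How do I file a reimbursement for a business trip?"));
    System.out.println(parseChoices("3", descriptions.size()));
    System.out.println(parseChoices("1, 3", descriptions.size()));
  }
}
```

The strict parsing explains why the prompt insists on "numbers separated by commas and nothing else": any extra words in the reply would make the selection unparseable.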

This is ideal when:

· Your application deals with various domains or topics.

· You need semantic-level understanding to decide retriever relevance.

· You have many retrievers, each tied to a specific dataset.

· You want to minimize noise in the LLM’s input context.

For example, if your application has the following retrievers:

· HR Retriever: "Provides info on employee benefits and leave policies."

· IT Retriever: "Handles technical issues like password resets and VPN."

· Finance Retriever: "Answers queries related to reimbursements and taxes."

And the user query is: "How do I file a reimbursement for a business trip?" A well-behaved model should route it only to the Finance Retriever, not HR or IT. In practice the choice depends on the model, as the sample output later in this post shows.

Sample Application

The following program demonstrates how to dynamically route user queries to the most relevant content stores using LangChain4j’s LanguageModelQueryRouter.

QueryRouterDemo.java

package com.sample.app.router;

import dev.langchain4j.data.segment.TextSegment;
import dev.langchain4j.model.embedding.EmbeddingModel;
import dev.langchain4j.model.embedding.onnx.bgesmallenv15q.BgeSmallEnV15QuantizedEmbeddingModel;
import dev.langchain4j.model.ollama.OllamaChatModel;
import dev.langchain4j.rag.content.retriever.ContentRetriever;
import dev.langchain4j.rag.content.retriever.EmbeddingStoreContentRetriever;
import dev.langchain4j.rag.query.Query;
import dev.langchain4j.rag.query.router.LanguageModelQueryRouter;
import dev.langchain4j.store.embedding.EmbeddingStore;
import dev.langchain4j.store.embedding.inmemory.InMemoryEmbeddingStore;
import java.util.Collection;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class QueryRouterDemo {

  public static void main(String[] args) {

    // Initialize LLM model for routing
    OllamaChatModel chatModel =
        OllamaChatModel.builder().baseUrl("http://localhost:11434").modelName("llama3.2").build();

    // Shared embedding model
    EmbeddingModel embeddingModel = new BgeSmallEnV15QuantizedEmbeddingModel();

    // Create embedding stores
    EmbeddingStore<TextSegment> hrStore = new InMemoryEmbeddingStore<>();
    EmbeddingStore<TextSegment> itStore = new InMemoryEmbeddingStore<>();
    EmbeddingStore<TextSegment> financeStore = new InMemoryEmbeddingStore<>();

    // Create content retrievers
    ContentRetriever hrRetriever =
        EmbeddingStoreContentRetriever.builder()
            .embeddingStore(hrStore)
            .embeddingModel(embeddingModel)
            .maxResults(3)
            .minScore(0.7)
            .displayName("hrRetriever")
            .build();

    ContentRetriever itRetriever =
        EmbeddingStoreContentRetriever.builder()
            .embeddingStore(itStore)
            .embeddingModel(embeddingModel)
            .maxResults(3)
            .minScore(0.7)
            .displayName("itRetriever")
            .build();

    ContentRetriever financeRetriever =
        EmbeddingStoreContentRetriever.builder()
            .embeddingStore(financeStore)
            .embeddingModel(embeddingModel)
            .maxResults(3)
            .minScore(0.7)
            .displayName("financeRetriever")
            .build();

    Map<ContentRetriever, String> retrieverToDescription = new HashMap<>();
    retrieverToDescription.put(
        hrRetriever,
        "Provides information on leave policies, benefits, onboarding, and other HR-related topics.");
    retrieverToDescription.put(
        itRetriever,
        "Handles technical queries such as VPN access, email issues, password resets, and software installations.");
    retrieverToDescription.put(
        financeRetriever,
        "Answers questions related to reimbursements, payslips, taxation, invoices, and financial approvals.");

    // Create LLM-based router
    LanguageModelQueryRouter router =
        LanguageModelQueryRouter.builder()
            .chatModel(chatModel)
            .retrieverToDescription(retrieverToDescription)
            .build();

    // Sample queries
    List<String> queries =
        List.of(
            "How do I reset my email password?",
            "Where can I see my payslip for last month?",
            "What is the company policy on medical leave?",
            "How to claim business travel reimbursement?",
            "I'm facing issues connecting to the VPN.");

    for (String userQuery : queries) {
      Query query = Query.from(userQuery);
      Collection<ContentRetriever> retrievers = router.route(query);

      System.out.println("Query: " + userQuery);
      for (ContentRetriever retriever : retrievers) {
        System.out.println("-> Routed to: " + retriever);
      }
      System.out.println("---------------------------------------------------");
    }
  }
}

Output

Query: How do I reset my email password?
-> Routed to: EmbeddingStoreContentRetriever{displayName='itRetriever'}
---------------------------------------------------
Query: Where can I see my payslip for last month?
-> Routed to: EmbeddingStoreContentRetriever{displayName='itRetriever'}
-> Routed to: EmbeddingStoreContentRetriever{displayName='financeRetriever'}
---------------------------------------------------
Query: What is the company policy on medical leave?
-> Routed to: EmbeddingStoreContentRetriever{displayName='financeRetriever'}
---------------------------------------------------
Query: How to claim business travel reimbursement?
-> Routed to: EmbeddingStoreContentRetriever{displayName='itRetriever'}
-> Routed to: EmbeddingStoreContentRetriever{displayName='financeRetriever'}
---------------------------------------------------
Query: I'm facing issues connecting to the VPN.
-> Routed to: EmbeddingStoreContentRetriever{displayName='itRetriever'}
---------------------------------------------------
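LLM-chosen routes are not guaranteed to be right, or even parseable, so it helps to decide up front what happens when the selection fails or comes back empty. The sketch below wraps any routing function with a route-to-all fallback. It is plain JDK code with retrievers modelled as strings, and FallbackRouterSketch is a hypothetical name. To my knowledge, LanguageModelQueryRouter's builder also offers a fallbackStrategy setting for the failure case; check the Javadoc of the version you use for the exact options.

```java
import java.util.List;
import java.util.function.Function;

// Hypothetical wrapper for illustration; not part of Langchain4j.
final class FallbackRouterSketch {

  private final Function<String, List<String>> primary;
  private final List<String> allRetrievers;

  FallbackRouterSketch(Function<String, List<String>> primary, List<String> allRetrievers) {
    this.primary = primary;
    this.allRetrievers = List.copyOf(allRetrievers);
  }

  // Uses the primary selection when it is usable; otherwise falls back to every retriever.
  List<String> route(String query) {
    try {
      List<String> selected = primary.apply(query);
      return selected.isEmpty() ? allRetrievers : selected;
    } catch (RuntimeException e) {
      // For example, the model replied with text that could not be parsed.
      return allRetrievers;
    }
  }

  public static void main(String[] args) {
    List<String> all = List.of("hr", "it", "finance");

    FallbackRouterSketch working = new FallbackRouterSketch(query -> List.of("it"), all);
    System.out.println(working.route("I'm facing issues connecting to the VPN."));

    FallbackRouterSketch failing =
        new FallbackRouterSketch(
            query -> {
              throw new IllegalStateException("unparseable reply");
            },
            all);
    System.out.println(failing.route("I'm facing issues connecting to the VPN."));
  }
}
```

Routing to everything on failure trades precision for recall. Returning no retrievers, or failing loudly, are equally reasonable choices, depending on how costly a noisy context is for your application.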

The complete working application is shown below.

QueryRouterFullApp.java

package com.sample.app.router;

import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

import dev.langchain4j.data.document.Document;
import dev.langchain4j.data.segment.TextSegment;
import dev.langchain4j.memory.chat.MessageWindowChatMemory;
import dev.langchain4j.model.embedding.EmbeddingModel;
import dev.langchain4j.model.embedding.onnx.bgesmallenv15q.BgeSmallEnV15QuantizedEmbeddingModel;
import dev.langchain4j.model.ollama.OllamaChatModel;
import dev.langchain4j.rag.DefaultRetrievalAugmentor;
import dev.langchain4j.rag.RetrievalAugmentor;
import dev.langchain4j.rag.content.retriever.ContentRetriever;
import dev.langchain4j.rag.content.retriever.EmbeddingStoreContentRetriever;
import dev.langchain4j.rag.query.router.LanguageModelQueryRouter;
import dev.langchain4j.rag.query.transformer.ExpandingQueryTransformer;
import dev.langchain4j.service.AiServices;
import dev.langchain4j.store.embedding.EmbeddingStore;
import dev.langchain4j.store.embedding.EmbeddingStoreIngestor;
import dev.langchain4j.store.embedding.inmemory.InMemoryEmbeddingStore;


public class QueryRouterFullApp {

  interface ChatAssistant {
    String chat(String userMessage);
  }

  public static void main(String[] args) {

    // Initialize LLM model for routing and chatting
    OllamaChatModel chatModel =
        OllamaChatModel.builder().baseUrl("http://localhost:11434").modelName("llama3.2").build();

    // Shared embedding model
    EmbeddingModel embeddingModel = new BgeSmallEnV15QuantizedEmbeddingModel();

    // Create embedding stores
    EmbeddingStore<TextSegment> hrStore = new InMemoryEmbeddingStore<>();
    EmbeddingStore<TextSegment> itStore = new InMemoryEmbeddingStore<>();
    EmbeddingStore<TextSegment> financeStore = new InMemoryEmbeddingStore<>();

    // Sample data for ingestion
    List<String> hrDocs =
        List.of(
            """
            Our HR policies include flexible work hours, medical leave, and onboarding support.
            """,
            """
            You can check your leave balance and benefits on the HR portal.
            """
        );

    List<String> itDocs =
        List.of(
            """
            If you're facing issues with VPN, restart your system and try again.
            """,
            """
            Password resets can be done via the IT Helpdesk portal.
            """
        );

    List<String> financeDocs =
        List.of(
            """
            Payslips are generated on the 5th of every month and available on the finance dashboard.
            """,
            """
            You can file business travel reimbursements through the expense portal.
            """
        );

    // Ingest documents into respective stores
    EmbeddingStoreIngestor.ingest(convertToDocuments(hrDocs), hrStore);
    EmbeddingStoreIngestor.ingest(convertToDocuments(itDocs), itStore);
    EmbeddingStoreIngestor.ingest(convertToDocuments(financeDocs), financeStore);

    // Create content retrievers
    ContentRetriever hrRetriever =
        EmbeddingStoreContentRetriever.builder()
            .embeddingStore(hrStore)
            .embeddingModel(embeddingModel)
            .maxResults(3)
            .minScore(0.7)
            .displayName("hrRetriever")
            .build();

    ContentRetriever itRetriever =
        EmbeddingStoreContentRetriever.builder()
            .embeddingStore(itStore)
            .embeddingModel(embeddingModel)
            .maxResults(3)
            .minScore(0.7)
            .displayName("itRetriever")
            .build();

    ContentRetriever financeRetriever =
        EmbeddingStoreContentRetriever.builder()
            .embeddingStore(financeStore)
            .embeddingModel(embeddingModel)
            .maxResults(3)
            .minScore(0.7)
            .displayName("financeRetriever")
            .build();

    Map<ContentRetriever, String> retrieverToDescription = new HashMap<>();
    retrieverToDescription.put(
        hrRetriever,
        "Provides information on leave policies, benefits, onboarding, and other HR-related topics.");
    retrieverToDescription.put(
        itRetriever,
        "Handles technical queries such as VPN access, email issues, password resets, and software installations.");
    retrieverToDescription.put(
        financeRetriever,
        "Answers questions related to reimbursements, payslips, taxation, invoices, and financial approvals.");

    // Create LLM-based router
    LanguageModelQueryRouter router =
        LanguageModelQueryRouter.builder()
            .chatModel(chatModel)
            .retrieverToDescription(retrieverToDescription)
            .build();

    // Expand queries (optional step for better retrieval)
    ExpandingQueryTransformer expandingQueryTransformer = new ExpandingQueryTransformer(chatModel, 3);

    // Create retrieval augmentor with dynamic contentRetriever via router
    RetrievalAugmentor retrievalAugmentor =
        DefaultRetrievalAugmentor.builder()
            .queryTransformer(expandingQueryTransformer)
            .queryRouter(router)
            .build();

    // Build the assistant
    ChatAssistant chatAssistant =
        AiServices.builder(ChatAssistant.class)
            .chatModel(chatModel)
            .retrievalAugmentor(retrievalAugmentor)
            .chatMemory(MessageWindowChatMemory.withMaxMessages(10))
            .build();

    // Run a conversation
    List<String> userQueries =
        List.of(
            "How do I reset my email password?",
            "Where can I see my payslip for last month?",
            "What is the company policy on medical leave?",
            "How to claim business travel reimbursement?",
            "I'm facing issues connecting to the VPN."
        );

    for (String userQuery : userQueries) {
      System.out.println("User: " + userQuery);
      String answer = chatAssistant.chat(userQuery);
      System.out.println("Assistant: " + answer);
      System.out.println("-----------------------------------------------");
    }
  }

  private static List<Document> convertToDocuments(List<String> texts) {
    List<Document> documents = new ArrayList<>();
    for (String text : texts) {
      documents.add(Document.from(text));
    }
    return documents;
  }
}

Output

User: How do I reset my email password?
Assistant: To reset your email password, follow these steps:

1. Go to the IT Helpdesk portal.
2. Fill out the required details in the password reset form.
3. If you're experiencing issues with a Virtual Private Network (VPN), try restarting your system and then retrying the password reset process.

That's it!
-----------------------------------------------
User: Where can I see my payslip for last month?
Assistant: To view your payslip for last month, follow these steps:

1. Go to the finance dashboard.
2. Check if the latest month's payslip has been generated (typically available on the 5th of every month).
3. If you're experiencing issues with a VPN, restart your system and try accessing the payslip again.

Note: You can also check other HR-related information, such as leave balance and benefits, on the HR portal.
-----------------------------------------------
User: What is the company policy on medical leave?
Assistant: The company's policy on medical leave is stated in their HR policies, which include:

* Flexible work hours
* Medical leave (specific details not provided)

For more information about your specific situation, including your leave balance and benefits, you can check these details on the HR portal.

Note that none of this information provides direct guidance on the actual company policy, but rather informs where to find more information about it.
-----------------------------------------------
User: How to claim business travel reimbursement?
Assistant: To claim business travel reimbursement, follow these steps:

1. Go to the expense portal.
2. File your business travel reimbursements through this platform.

That's it! This is where you can submit your expenses for reimbursement.
-----------------------------------------------
User: I'm facing issues connecting to the VPN.
Assistant: To resolve VPN connection issues:

1. Restart your system.
2. Try reconnecting to the VPN again.

This may help resolve any temporary connectivity problems. If you continue to experience issues, you can also consider resetting your password using the IT Helpdesk portal.
-----------------------------------------------


Tuesday, 21 October 2025
