Retrieval-Augmented Generation (RAG) enhances a model's capabilities by grounding its responses in external knowledge sources. However, the effectiveness of a RAG system depends heavily on how documents are retrieved. In this post, I demonstrate how to customize the retrieval process using EmbeddingStoreContentRetriever, tuning parameters like maxResults and minScore, and how to integrate it seamlessly with a chat assistant powered by a language model.
Why customize the Retriever?
When fetching documents for a query, we may want to
· Limit the number of returned results (maxResults)
· Filter out irrelevant results using a minimum similarity score (minScore)
· Use a custom embedding model
This allows the system to return only the most relevant knowledge for response generation, reducing noise and improving trustworthiness.
Example
ContentRetriever contentRetriever = EmbeddingStoreContentRetriever.builder()
        .embeddingStore(embeddingStore)     // Your store (e.g., in-memory or persistent)
        .embeddingModel(embeddingModel)     // BGE, OpenAI, etc.
        .maxResults(5)                      // Only return top 5 similar docs
        .minScore(0.75)                     // Filter out documents below this similarity score
        .build();

ChatAssistant assistant = AiServices.builder(ChatAssistant.class)
        .chatModel(chatModel)               // Your LLM (OpenAI, HuggingFace, etc.)
        .contentRetriever(contentRetriever) // Inject customized retriever
        .build();
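The third customization above, using a custom embedding model, only requires passing a different implementation to embeddingModel(). As a rough sketch (not part of the working application below; it assumes the langchain4j-open-ai module is on the classpath and that an OPENAI_API_KEY environment variable is set), the same retriever could be built on OpenAI embeddings instead of a local BGE model:

// Sketch: the same retriever built on OpenAI embeddings instead of the local BGE model.
// Assumes the langchain4j-open-ai module is on the classpath and OPENAI_API_KEY is set.
// import dev.langchain4j.model.openai.OpenAiEmbeddingModel;
EmbeddingModel openAiEmbeddingModel = OpenAiEmbeddingModel.builder()
        .apiKey(System.getenv("OPENAI_API_KEY"))
        .modelName("text-embedding-3-small")
        .build();

ContentRetriever contentRetriever = EmbeddingStoreContentRetriever.builder()
        .embeddingStore(embeddingStore)       // the store must contain embeddings produced by the same model
        .embeddingModel(openAiEmbeddingModel)
        .maxResults(5)
        .minScore(0.75)
        .build();

Whichever model you choose, make sure the documents in the store were embedded with that same model; otherwise the similarity scores are meaningless.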
Below is the complete working application.
ChatAssistant.java
package com.sample.app.assistants;

public interface ChatAssistant {

    String chat(String userMessage);
}
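AiServices generates the implementation of this interface at runtime. As an optional variation (a sketch, not used in the application below), you can attach a @SystemMessage to the method to keep the assistant grounded in the retrieved context:

package com.sample.app.assistants;

import dev.langchain4j.service.SystemMessage;

public interface ChatAssistant {

    // Optional: constrain answers to the retrieved context.
    @SystemMessage("Answer using only the provided context. If the answer is not in the context, say that you don't know.")
    String chat(String userMessage);
}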
InMemoryRagWithContentRetriever.java
package com.sample.app;

import com.sample.app.assistants.ChatAssistant;

import dev.langchain4j.data.document.Document;
import dev.langchain4j.data.document.loader.FileSystemDocumentLoader;
import dev.langchain4j.data.segment.TextSegment;
import dev.langchain4j.model.embedding.EmbeddingModel;
import dev.langchain4j.model.embedding.onnx.bgesmallenv15q.BgeSmallEnV15QuantizedEmbeddingModel;
import dev.langchain4j.model.ollama.OllamaChatModel;
import dev.langchain4j.rag.content.retriever.ContentRetriever;
import dev.langchain4j.rag.content.retriever.EmbeddingStoreContentRetriever;
import dev.langchain4j.service.AiServices;
import dev.langchain4j.store.embedding.EmbeddingStoreIngestor;
import dev.langchain4j.store.embedding.inmemory.InMemoryEmbeddingStore;

import java.util.ArrayList;
import java.util.List;

public class InMemoryRagWithContentRetriever {

    public static void main(String[] args) {

        // Initialize local embedding model
        EmbeddingModel embeddingModel = new BgeSmallEnV15QuantizedEmbeddingModel();

        // Load all the Documents
        String resourcesFolderPath = "/Users/Shared/llm_docs";
        System.out.println("Resources Folder Path is " + resourcesFolderPath);
        List<Document> documents = FileSystemDocumentLoader.loadDocuments(resourcesFolderPath);

        InMemoryEmbeddingStore<TextSegment> embeddingStore = new InMemoryEmbeddingStore<>();
        EmbeddingStoreIngestor.ingest(documents, embeddingStore);

        OllamaChatModel chatModel = OllamaChatModel.builder()
                .baseUrl("http://localhost:11434")
                .modelName("llama3.2")
                .build();

        ContentRetriever contentRetriever = EmbeddingStoreContentRetriever.builder()
                .embeddingStore(embeddingStore)
                .embeddingModel(embeddingModel)
                .maxResults(5)
                .minScore(0.75)
                .build();

        ChatAssistant assistant = AiServices.builder(ChatAssistant.class)
                .chatModel(chatModel)
                .contentRetriever(contentRetriever)
                .build();

        List<String> questionsToAsk = new ArrayList<>();
        questionsToAsk.add("What is the tag line of ChronoCore Industries?");

        long time1 = System.currentTimeMillis();
        for (String question : questionsToAsk) {
            String answer = assistant.chat(question);
            System.out.println("----------------------------------------------------");
            System.out.println("Q: " + question);
            System.out.println("A : " + answer);
            System.out.println("----------------------------------------------------\n");
        }
        long time2 = System.currentTimeMillis();
        System.out.println("Total time taken is " + (time2 - time1));
    }
}
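One optional refinement, sketched below with an example file path, is to persist the InMemoryEmbeddingStore after the first ingestion and reload it on later runs, so the documents are not re-embedded every time the application starts:

// Sketch: persist the store after the first ingestion, reload it afterwards.
// The path below is only an example.
String storePath = "/Users/Shared/llm_docs/embedding-store.json";

InMemoryEmbeddingStore<TextSegment> embeddingStore;
if (new java.io.File(storePath).exists()) {
    embeddingStore = InMemoryEmbeddingStore.fromFile(storePath);  // reuse previously computed embeddings
} else {
    embeddingStore = new InMemoryEmbeddingStore<>();
    EmbeddingStoreIngestor.ingest(documents, embeddingStore);     // embed and index the documents
    embeddingStore.serializeToFile(storePath);                    // save for the next run
}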
Output
Resources Folder Path is /Users/Shared/llm_docs
----------------------------------------------------
Q: What is the tag line of ChronoCore Industries?
A : I couldn't find the tagline explicitly stated in the provided information. However, I can provide you with a possible tagline based on the company's mission and values:

"Preserving Yesterday. Shaping Tomorrow."

This tagline is consistent with the company's mission to "unlock the fabric of time itself — responsibly, ethically, and with profound respect for the continuum that binds reality."
----------------------------------------------------

Total time taken is 4012
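Note that the model says the tagline is not "explicitly stated" in the retrieved context. If a relevant segment is being dropped by the strict minScore of 0.75, lowering it (for example to 0.6) or raising maxResults lets more context through to the model, at the cost of admitting some noisier matches. Tuning these two values against your own documents is usually the quickest way to improve answer quality.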