Sunday, 3 August 2025

Using Metadata with Documents in Langchain4j

Langchain4j lets you associate metadata with each document, stored as key-value pairs (String keys and typed values such as String, Integer, Long, Float, Double, or UUID). This metadata helps with document retrieval and prompt construction in several ways:

·      Better Prompt Construction: Include metadata like document name or source to provide the LLM with extra context.

·      Targeted Filtering: Narrow down semantic search results using metadata filters (e.g., owner, date, or document type).

·      Efficient Updates: Quickly identify and sync updated documents in your EmbeddingStore using metadata keys like “source” or “id”.

Whether you're building a QA bot, a document search system, or an AI assistant, using metadata makes your LLM application more context-aware and easier to maintain.
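To make the filtering and prompt-construction ideas concrete, here is a minimal sketch that uses only plain Java collections. It does not use Langchain4j's own retrieval or filter classes; the Doc class, the filterBy and buildPrompt helpers, and the sample keys ("owner", "source") are hypothetical names chosen for illustration.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class MetadataUseCasesSketch {

    // Hypothetical stand-in for a document: text plus key-value metadata.
    static final class Doc {
        final String text;
        final Map<String, Object> metadata;

        Doc(String text, Map<String, Object> metadata) {
            this.text = text;
            this.metadata = metadata;
        }
    }

    // Targeted filtering: keep only documents whose metadata value for 'key' equals 'expected'.
    static List<Doc> filterBy(List<Doc> docs, String key, Object expected) {
        List<Doc> matches = new ArrayList<>();
        for (Doc doc : docs) {
            if (expected.equals(doc.metadata.get(key))) {
                matches.add(doc);
            }
        }
        return matches;
    }

    // Prompt construction: prefix the text with its source so the model sees where it came from.
    static String buildPrompt(Doc doc, String question) {
        return "Source: " + doc.metadata.get("source") + "\n"
                + "Context: " + doc.text + "\n"
                + "Question: " + question;
    }

    public static void main(String[] args) {
        Map<String, Object> handbookMeta = new HashMap<>();
        handbookMeta.put("owner", "alice");
        handbookMeta.put("source", "handbook.pdf");

        Map<String, Object> notesMeta = new HashMap<>();
        notesMeta.put("owner", "bob");
        notesMeta.put("source", "notes.txt");

        List<Doc> docs = new ArrayList<>();
        docs.add(new Doc("Employees get 20 days of leave.", handbookMeta));
        docs.add(new Doc("Remember to book the meeting room.", notesMeta));

        // Only alice's document survives the filter and ends up in the prompt.
        for (Doc doc : filterBy(docs, "owner", "alice")) {
            System.out.println(buildPrompt(doc, "How many leave days do I get?"));
        }
    }
}
```

In a real application the filter would normally be applied by the embedding store during the search rather than in a loop afterwards; the sketch only shows what the metadata contributes to each step.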

 

Creating a Metadata Object

You can create a Metadata object using the following constructors:

public Metadata()
public Metadata(Map<String, ?> metadata)

Langchain4j also provides static factory methods that make creation more convenient:

public static Metadata from(String key, String value)
public static Metadata from(Map<String, ?> metadata)
public static Metadata metadata(String key, String value)

These methods are useful for creating metadata with a single entry or directly from a map of values.

 

Adding Key-Value Pairs to Metadata

You can add new entries to an existing Metadata object using the put methods. Langchain4j provides overloads for several value types:

public Metadata put(String key, String value)
public Metadata put(String key, int value)
public Metadata put(String key, long value)
public Metadata put(String key, float value)
public Metadata put(String key, double value)
public Metadata put(String key, UUID value)

 

Because each put method returns a Metadata object, you can chain these calls for cleaner code.

 

Adding Multiple Entries at Once

If you have a map of metadata entries, you can add all of them using the putAll method:

public Metadata putAll(Map<String, Object> metadata)

This is especially helpful when you want to merge or update metadata in bulk.
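The merge behaviour can be pictured with a plain java.util.Map. The sketch below does not call Langchain4j; it assumes that Metadata.putAll follows the usual Map.putAll semantics (existing keys are overwritten, new keys are added), which you should confirm against the Javadoc of your Langchain4j version. The merge helper and the class name are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class BulkMergeSketch {

    // Returns a new map holding the base entries with the updates applied on top.
    static Map<String, Object> merge(Map<String, Object> base, Map<String, Object> updates) {
        Map<String, Object> merged = new LinkedHashMap<>(base);
        merged.putAll(updates); // existing keys are overwritten, new keys are appended
        return merged;
    }

    public static void main(String[] args) {
        Map<String, Object> base = new LinkedHashMap<>();
        base.put("title", "Langchain4j Java API Reference");
        base.put("version", 2);

        Map<String, Object> updates = new LinkedHashMap<>();
        updates.put("version", 3);            // replaces the old value
        updates.put("category", "Java SDK");  // new entry

        // Prints: {title=Langchain4j Java API Reference, version=3, category=Java SDK}
        System.out.println(merge(base, updates));
    }
}
```

Copying the base map first leaves the original untouched, which is handy when you want to compare the old and new metadata before writing anything back.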

 

The following working application puts these pieces together.

 

MetadataDemo.java

 

package com.sample.app.rag.apis.metadata;

import dev.langchain4j.data.document.Document;
import dev.langchain4j.data.document.Metadata;

import java.util.UUID;
import java.util.HashMap;
import java.util.Map;

public class MetadataDemo {
	public static void main(String[] args) {

		// Step 1: Create metadata using static method
		Metadata metadata = Metadata.from("title", "Langchain4j Java API Reference")
				.put("author", "Langchain Team")
				.put("version", 2)
				.put("source", "https://docs.langchain4j.dev")
				.put("id", UUID.randomUUID())
				.put("lastUpdated", "2025-06-01");

		// Step 2: Create a document with content and metadata
		String content = "This section covers the Metadata class and its usage in Langchain4j.";
		Document document = Document.from(content, metadata);

		// Step 3: Print document metadata
		System.out.println("Document Metadata:");
		document.metadata().toMap().forEach((k, v) -> System.out.println(k + " => " + v));

		// Step 4: Add additional metadata dynamically
		Map<String, Object> extraMeta = new HashMap<>();
		extraMeta.put("category", "Java SDK");

		document.metadata().putAll(extraMeta);

		// Print updated metadata
		System.out.println("\nUpdated Metadata:");
		document.metadata().toMap().forEach((k, v) -> System.out.println(k + " => " + v));
	}
}

Output (the id is randomly generated, so it differs on every run; the order of entries is not guaranteed)

Document Metadata:
lastUpdated => 2025-06-01
source => https://docs.langchain4j.dev
id => 435df22d-0333-4e67-b29a-9ef17e443bb1
title => Langchain4j Java API Reference
version => 2
author => Langchain Team

Updated Metadata:
lastUpdated => 2025-06-01
author => Langchain Team
source => https://docs.langchain4j.dev
id => 435df22d-0333-4e67-b29a-9ef17e443bb1
title => Langchain4j Java API Reference
category => Java SDK
version => 2

 
