Programming for beginners: Lucene: Hello World Application

In this post, I explain how to add documents to a Lucene index and how to query specific fields of those documents. The example uses the following two documents.

Document 1

"id": "1"
"title" : "Lucene in Action"
"description" : "Lucene is a platform where we can index our data to make it searchable."

Document 2

"id" : "2"
"title" : "Java in Action"
"description" : "Java is a platform and programming language to build Enterprise Applications"

Follow the step-by-step procedure below to add these documents to a Lucene index and query them.

Procedure to add documents to a Lucene index

Step 1: Define the index writer configuration.

Analyzer analyzer = new StandardAnalyzer();
IndexWriterConfig config = new IndexWriterConfig(analyzer);
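To get a feel for what the analyzer contributes, here is a rough plain-Java approximation of the two things StandardAnalyzer is best known for: splitting text into tokens and lowercasing them. The ToyAnalyzer class and its tokenize method are hypothetical names made up for this sketch. The real analyzer follows Unicode text segmentation rules and handles many more cases.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical stand-in for an analyzer; not part of Lucene.
class ToyAnalyzer {

	// Splits on anything that is not a letter or digit and lowercases each token.
	static List<String> tokenize(String text) {
		List<String> tokens = new ArrayList<>();
		for (String part : text.split("[^\\p{L}\\p{N}]+")) {
			if (!part.isEmpty()) {
				tokens.add(part.toLowerCase(Locale.ROOT));
			}
		}
		return tokens;
	}

	public static void main(String[] args) {
		System.out.println(tokenize("Lucene in Action")); // [lucene, in, action]
	}
}
```

Because the indexed text and the query text both go through the same analysis, a search for "Lucene" can match the lowercased token stored for the title.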

Step 2: Define a Directory to store the documents.

Directory directory = new MMapDirectory(new File("/Users/Shared/lucene").toPath(), NoLockFactory.INSTANCE);

Step 3: Define Document instances to represent the documents to add to the Lucene index.

Document doc1 = new Document();
doc1.add(new TextField("id", "1", Field.Store.YES));
doc1.add(new TextField("title", "Lucene in Action", Field.Store.YES));
doc1.add(new TextField("description", "Lucene is a platform where we can index our data to make it searchable.", Field.Store.YES));

Document doc2 = new Document();
doc2.add(new TextField("id", "2", Field.Store.YES));
doc2.add(new TextField("title", "Java in Action", Field.Store.YES));
doc2.add(new TextField("description", "Java is a platform and programming language to build Enterprise Applications", Field.Store.YES));

Step 4: Define IndexWriter and add documents to the index.

try (IndexWriter indexWriter = new IndexWriter(directory, config)) {
	indexWriter.addDocument(doc1);
	indexWriter.addDocument(doc2);
}
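Conceptually, adding a document records which terms occur in which field of which document, so that a later query on a field can find matching documents quickly. The sketch below is a toy model of that idea in plain Java. ToyIndex, addField and lookup are hypothetical names, and this is not how Lucene stores its index internally.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical toy model of a per-field inverted index; not Lucene's implementation.
class ToyIndex {

	// field name -> term -> ids of the documents containing that term
	private final Map<String, Map<String, Set<Integer>>> postings = new HashMap<>();

	// Tokenizes the text naively (lowercase + whitespace split) and records each term.
	void addField(int docId, String field, String text) {
		Map<String, Set<Integer>> terms = postings.computeIfAbsent(field, f -> new HashMap<>());
		for (String token : text.toLowerCase(Locale.ROOT).split("\\s+")) {
			if (!token.isEmpty()) {
				terms.computeIfAbsent(token, t -> new TreeSet<>()).add(docId);
			}
		}
	}

	// Returns the ids of documents whose given field contains the term.
	Set<Integer> lookup(String field, String term) {
		return postings.getOrDefault(field, Map.of())
				.getOrDefault(term.toLowerCase(Locale.ROOT), Set.of());
	}

	public static void main(String[] args) {
		ToyIndex index = new ToyIndex();
		index.addField(1, "title", "Lucene in Action");
		index.addField(2, "title", "Java in Action");
		System.out.println(index.lookup("title", "Lucene")); // [1]
		System.out.println(index.lookup("title", "action")); // [1, 2]
	}
}
```

Keeping a separate term map per field is what makes a query such as "title contains Lucene" independent of what the description field says.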

Procedure to query the Lucene index

Step 1: Open the directory using IndexReader.

IndexReader indexReader = DirectoryReader.open(directory);

Step 2: Define IndexSearcher using IndexReader.

IndexSearcher indexSearcher = new IndexSearcher(indexReader);

Step 3: Define a Query instance to query a specific field.

QueryBuilder queryBuilder = new QueryBuilder(analyzer);
Query query = queryBuilder.createPhraseQuery("title", "Lucene", 0);
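The third argument to createPhraseQuery is the phrase slop. With a slop of 0, the analyzed terms must appear next to each other and in the same order. For a single word such as "Lucene" this amounts to a plain term match. The sketch below illustrates that rule in plain Java under simplified tokenization. ToyPhraseMatch and its methods are hypothetical names, and this is not Lucene's matching code.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Locale;

// Hypothetical toy check for an exact (slop 0) phrase match; not Lucene's implementation.
class ToyPhraseMatch {

	// Naive analysis: lowercase and split on whitespace.
	static List<String> tokens(String text) {
		return Arrays.asList(text.trim().toLowerCase(Locale.ROOT).split("\\s+"));
	}

	// True if the phrase tokens occur in the field text consecutively and in order.
	static boolean matches(String fieldText, String phrase) {
		return Collections.indexOfSubList(tokens(fieldText), tokens(phrase)) >= 0;
	}

	public static void main(String[] args) {
		System.out.println(matches("Lucene in Action", "Lucene"));        // true
		System.out.println(matches("Lucene in Action", "in action"));     // true
		System.out.println(matches("Lucene in Action", "Lucene Action")); // false
		System.out.println(matches("Java in Action", "Lucene"));          // false
	}
}
```

The third call fails because "in" sits between the two words. A larger slop value would be needed to tolerate that gap.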

Step 4: Use the search method of IndexSearcher to query the Lucene index.

int maxHitsPerPage = 10;
TopDocs docs = indexSearcher.search(query, maxHitsPerPage);
ScoreDoc[] hits = docs.scoreDocs;
System.out.println("Total Hits: " + docs.totalHits);
System.out.println("Results: ");
for (int i = 0; i < hits.length; i++) {
	Document d = indexSearcher.doc(hits[i].doc);
	System.out.println("Content: " + d.get("title"));
}
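The second argument to search caps how many hits come back in scoreDocs, while totalHits describes how many documents matched overall (in recent Lucene versions it can be a lower bound and not an exact count). The plain-Java sketch below illustrates the difference between the returned page and the total. ToyTopDocs, Hit and topN are hypothetical names, and the scores are made up.

```java
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical toy model of "top n hits"; not Lucene's implementation.
class ToyTopDocs {

	// A matching document id with its relevance score.
	static final class Hit {
		final int doc;
		final float score;

		Hit(int doc, float score) {
			this.doc = doc;
			this.score = score;
		}
	}

	// Returns at most n hits, best score first.
	static List<Hit> topN(List<Hit> allMatches, int n) {
		return allMatches.stream()
				.sorted(Comparator.comparingDouble((Hit h) -> h.score).reversed())
				.limit(n)
				.collect(Collectors.toList());
	}

	public static void main(String[] args) {
		List<Hit> matches = List.of(new Hit(7, 0.4f), new Hit(3, 1.2f), new Hit(9, 0.9f));
		List<Hit> page = topN(matches, 2);
		System.out.println("Total Hits: " + matches.size()); // 3
		for (Hit hit : page) {
			System.out.println("doc=" + hit.doc + " score=" + hit.score);
		}
	}
}
```

Here three documents match, but only the two best-scoring ones are returned, which is why the loop in Step 4 iterates over hits.length and does not rely on the total.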

The complete working application is given below.

App.java

package com.sample.app;

import java.io.File;
import java.io.IOException;

import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.document.TextField;
import org.apache.lucene.index.DirectoryReader;
import org.apache.lucene.index.IndexReader;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.ScoreDoc;
import org.apache.lucene.search.TopDocs;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.MMapDirectory;
import org.apache.lucene.store.NoLockFactory;
import org.apache.lucene.util.QueryBuilder;

public class App {

	public static void main(String[] args) throws IOException {

		Analyzer analyzer = new StandardAnalyzer();
		IndexWriterConfig config = new IndexWriterConfig(analyzer);

		Directory directory = new MMapDirectory(new File("/Users/Shared/lucene").toPath(), NoLockFactory.INSTANCE);

		Document doc1 = new Document();
		doc1.add(new TextField("id", "1", Field.Store.YES));
		doc1.add(new TextField("title", "Lucene in Action", Field.Store.YES));
		doc1.add(new TextField("description", "Lucene is a platform where we can index our data to make it searchable.",
				Field.Store.YES));

		Document doc2 = new Document();
		doc2.add(new TextField("id", "2", Field.Store.YES));
		doc2.add(new TextField("title", "Java in Action", Field.Store.YES));
		doc2.add(new TextField("description",
				"Java is a platform and programming language to build Enterprise Applications", Field.Store.YES));

		try (IndexWriter indexWriter = new IndexWriter(directory, config)) {
			indexWriter.addDocument(doc1);
			indexWriter.addDocument(doc2);
		}

		QueryBuilder queryBuilder = new QueryBuilder(analyzer);
		Query query = queryBuilder.createPhraseQuery("title", "Lucene", 0);
		int maxHitsPerPage = 10;
		
		try (IndexReader indexReader = DirectoryReader.open(directory)) {
			IndexSearcher indexSearcher = new IndexSearcher(indexReader);

			TopDocs docs = indexSearcher.search(query, maxHitsPerPage);
			ScoreDoc[] hits = docs.scoreDocs;
			System.out.println("Total Hits: " + docs.totalHits);
			System.out.println("Results: ");
			for (int i = 0; i < hits.length; i++) {
				Document d = indexSearcher.doc(hits[i].doc);
				System.out.println("Content: " + d.get("title"));
			}
		}

	}

}

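Output (on the first run, when the index directory is empty; each rerun adds the two documents again, so the hit count grows)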

Total Hits: 1 hits
Results: 
Content: Lucene in Action

Wednesday, 16 June 2021
