Programming for beginners: Elasticsearch: Inverted Index

An inverted index is a data structure that enables fast full-text search. It consists of a list of all the unique words that appear in any document; each word is mapped to the list of documents in which it appears.

For example,

Document 1 contains the following text:

I went down to the river. I set down on the bank.

Document 2 contains the following text:

I went to nail river, to meet PTR.

The inverted index for documents 1 and 2 looks like this:

Term	Document1	Document2
I	Yes	Yes
went	Yes	Yes
down	Yes	No
to	Yes	Yes
the	Yes	No
river	Yes	Yes
set	Yes	No
on	Yes	No
bank	Yes	No
nail	No	Yes
meet	No	Yes
PTR	No	Yes
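
The table above can be reproduced with a short sketch. This is only an illustration, not how Elasticsearch builds its index internally: the `tokenize` and `build_inverted_index` helpers are hypothetical names, and the tokenizer simply splits on non-alphanumeric characters and keeps the original case so that the terms match the table (real analyzers usually also lowercase and normalize terms).

```python
import re
from collections import defaultdict


def tokenize(text):
    # Hypothetical, simplified tokenizer: runs of letters/digits become terms.
    # Case is kept as-is so the terms match the table in this post.
    return re.findall(r"[A-Za-z0-9]+", text)


def build_inverted_index(docs):
    # Map each unique term to the set of document ids that contain it.
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in tokenize(text):
            index[term].add(doc_id)
    return index


docs = {
    1: "I went down to the river. I set down on the bank.",
    2: "I went to nail river, to meet PTR.",
}

index = build_inverted_index(docs)
for term in sorted(index):
    print(term, sorted(index[term]))
```

Because each term maps to a set, a word that occurs twice in one document (such as "down" in document 1) still produces a single entry.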

Suppose you want to search for "nail river". We need to find the documents in which each term appears.

Term	Document1	Document2
nail	No	Yes
river	Yes	Yes

Both documents match, but the second document contains both terms, while the first contains only "river". So we can say the second document is more relevant to our query than the first.
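
The lookup and ranking described above can be sketched as follows. This is a minimal illustration that ranks documents by how many query terms they contain; the `search` function is a hypothetical helper, and Elasticsearch's actual relevance scoring is more involved than a plain match count.

```python
import re
from collections import Counter, defaultdict

docs = {
    1: "I went down to the river. I set down on the bank.",
    2: "I went to nail river, to meet PTR.",
}

# Build the inverted index: term -> set of document ids containing it.
index = defaultdict(set)
for doc_id, text in docs.items():
    for term in re.findall(r"[A-Za-z0-9]+", text):
        index[term].add(doc_id)


def search(query):
    # Count, for every document, how many query terms it contains.
    scores = Counter()
    for term in re.findall(r"[A-Za-z0-9]+", query):
        for doc_id in index.get(term, ()):
            scores[doc_id] += 1
    # Highest match count first.
    return scores.most_common()


print(search("nail river"))  # [(2, 2), (1, 1)]
```

Document 2 matches both terms and document 1 matches one, so document 2 is listed first.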


Wednesday, 28 October 2015
