Wednesday, 28 October 2015

Elasticsearch: Inverted Index

An inverted index is a data structure that allows fast full-text searches. It consists of a list of all the unique words that appear in any document; each word is mapped to the list of documents in which it appears.
For example,
Document 1 contains the following text:
I went down to the river. I set down on the bank.

Document 2 contains the following text:
I went to nail river, to meet PTR.

The inverted index for documents 1 and 2 looks like this:

Term     Document1   Document2
I        Yes         Yes
went     Yes         Yes
down     Yes         No
to       Yes         Yes
the      Yes         No
river    Yes         Yes
set      Yes         No
on       Yes         No
bank     Yes         No
nail     No          Yes
meet     No          Yes
PTR      No          Yes
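
The table above can be reproduced with a few lines of Python. This is only a minimal sketch to illustrate the idea, not how Elasticsearch builds its index internally: the names docs, tokenize and build_inverted_index are made up for this example, and the tokenizer simply extracts runs of letters and keeps the original case so that the terms match the table.

```python
import re
from collections import defaultdict

# The two example documents from this post, keyed by document number.
docs = {
    1: "I went down to the river. I set down on the bank.",
    2: "I went to nail river, to meet PTR.",
}

def tokenize(text):
    # Simplified tokenizer (an assumption for this sketch):
    # every run of letters is one term, case is preserved.
    return re.findall(r"[A-Za-z]+", text)

def build_inverted_index(documents):
    # Map each unique term to the set of document ids that contain it.
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for term in tokenize(text):
            index[term].add(doc_id)
    return index

index = build_inverted_index(docs)

# Print one row per term, like the table above.
for term in index:
    row = ["Yes" if doc_id in index[term] else "No" for doc_id in sorted(docs)]
    print(term, *row)
```

Because each term maps to a set, a word that occurs more than once in a document (such as "down" in Document 1) is stored only once.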

Suppose you want to search for 'nail river'. We need to find the documents in which each term appears.

Term     Document1   Document2
nail     No          Yes
river    Yes         Yes


Both documents match, but the second document contains both query terms while the first contains only one. So we can say the second document is more relevant to our query than the first.
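
The lookup and ranking described above can be sketched in Python as well. This is a simplified illustration that assumes relevance is just the number of query terms a document contains; real relevance scoring in Elasticsearch takes more factors into account. The index literal and the search function are hypothetical and exist only for this example.

```python
from collections import Counter

# Only the part of the inverted index needed for this query:
# term -> set of document ids containing the term.
index = {
    "nail": {2},
    "river": {1, 2},
}

def search(index, query):
    # Count, for every document, how many query terms it contains.
    scores = Counter()
    for term in query.split():
        for doc_id in index.get(term, set()):
            scores[doc_id] += 1
    # Documents with the most matching terms come first.
    return scores.most_common()

results = search(index, "nail river")
for doc_id, matches in results:
    print("Document", doc_id, "matches", matches, "term(s)")
```

Document 2 is listed first because it matches both terms, while Document 1 matches only "river".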


