An inverted index is a data structure that allows fast full-text searches. It
consists of a list of all the distinct words that appear in any document; each
word is mapped to the list of documents in which it appears.
For example, suppose Document 1 contains the following text:
I went down to the river. I set down on the bank.
Document 2 contains the following text:
I went to nail river, to meet PTR.
The inverted index for documents 1 and 2 looks like this:
Term  | Document1 | Document2
I     | Yes       | Yes
went  | Yes       | Yes
down  | Yes       | No
to    | Yes       | Yes
the   | Yes       | No
river | Yes       | Yes
set   | Yes       | No
on    | Yes       | No
bank  | Yes       | No
nail  | No        | Yes
meet  | No        | Yes
PTR   | No        | Yes
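As a rough illustration, the table above could be built with a few lines of Python. This is only a minimal sketch: the `tokenize` and `build_inverted_index` helpers are hypothetical names introduced here, and the tokenizer simply treats any run of letters as a term and keeps the original letter case, which is an assumption rather than how any particular search engine does it.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Split text into terms: runs of letters, punctuation dropped, case kept."""
    return re.findall(r"[A-Za-z]+", text)


def build_inverted_index(documents):
    """Map each distinct term to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for term in tokenize(text):
            # A set stores each document id once, even if the term repeats.
            index[term].add(doc_id)
    return dict(index)


documents = {
    1: "I went down to the river. I set down on the bank.",
    2: "I went to nail river, to meet PTR.",
}

index = build_inverted_index(documents)
for term, doc_ids in index.items():
    print(term, sorted(doc_ids))
```

Because each term maps to a set, a word such as "down" that occurs twice in Document 1 still produces a single entry in the index.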
Suppose you want to search for "nail river". We need to find the documents in
which each query term appears.
Term  | Document1 | Document2
nail  | No        | Yes
river | Yes       | Yes
Both documents match, but the second document contains both query terms while
the first contains only "river". So we can say the second document is more
relevant to our query than the first.
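The lookup and ranking just described might be sketched as follows. The `search` function is a hypothetical helper, the index is written out by hand for the two query terms, and scoring a document by the number of distinct query terms it contains is just the simple relevance measure used in this example, not a general ranking formula.

```python
from collections import Counter

# Postings for the two query terms: term -> ids of documents containing it.
index = {
    "nail": {2},
    "river": {1, 2},
}


def search(index, query):
    """Return (doc_id, matched_term_count) pairs, best match first."""
    scores = Counter()
    for term in set(query.split()):
        # Terms missing from the index contribute no matches.
        for doc_id in index.get(term, ()):
            scores[doc_id] += 1
    # Sort by match count (descending), then by document id for stable output.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


results = search(index, "nail river")
print(results)  # [(2, 2), (1, 1)]
```

Document 2 comes first with two matching terms, and Document 1 follows with one.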