Monday 9 November 2015

Elasticsearch TF/IDF Algorithm

TF/IDF is the standard algorithm Elasticsearch uses to calculate the “_score” value of documents.

TF/IDF Algorithm
TF/IDF stands for term frequency/inverse document frequency. It considers the following factors when calculating the “_score” value.

a.   Term frequency: how often the term appears in the given field. The more often it appears, the higher the weight.
b.   Inverse document frequency: how often the term appears across all documents in the index. The more often it appears, the less relevant it is, so terms that appear in many documents have a lower weight than less common terms.
c.   Field-length norm: how long the field is. A term that appears in a short field carries more weight than the same term in a long field.
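
The three factors above can be sketched in a few lines of Python. This is only an illustration: the formulas follow the classic Lucene similarity as I understand it (square root of the term count, 1 + ln(numDocs / (docFreq + 1)), and 1 / sqrt(number of terms in the field)). The function names are made up for this example, and the final product is a simplification, not the full scoring formula Elasticsearch applies to a query.

```python
import math


def term_frequency(freq):
    """Weight grows with the number of times the term appears in the field."""
    return math.sqrt(freq)


def inverse_document_frequency(num_docs, doc_freq):
    """Weight shrinks as more documents in the index contain the term."""
    return 1.0 + math.log(num_docs / (doc_freq + 1))


def field_length_norm(num_terms):
    """Weight shrinks as the field gets longer."""
    return 1.0 / math.sqrt(num_terms)


def term_weight(freq, num_docs, doc_freq, num_terms):
    """Simplified weight of one term in one field (assumed combination)."""
    return (
        term_frequency(freq)
        * inverse_document_frequency(num_docs, doc_freq)
        * field_length_norm(num_terms)
    )


if __name__ == "__main__":
    # Same term count, same index, but a short field versus a long field.
    print(term_weight(freq=1, num_docs=1000, doc_freq=10, num_terms=4))
    print(term_weight(freq=1, num_docs=1000, doc_freq=10, num_terms=400))
```

With these definitions a rare term in a short field gets the highest weight, which matches the three rules listed above.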



