TanimotoCoefficientSimilarity
is based on the Tanimoto coefficient, also known as the extended Jaccard coefficient. For two users, the Tanimoto
coefficient is the ratio of the size of the intersection to the size of the
union of their preferred items. Go through the following article to learn more about the Tanimoto
coefficient.
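As a compact restatement of the definition above, the coefficient can be written as a formula. The symbols are assumptions introduced here for illustration: A and B stand for the sets of items preferred by the two users being compared.

```latex
T(A, B) = \frac{|A \cap B|}{|A \cup B|}
        = \frac{|A \cap B|}{|A| + |B| - |A \cap B|}
```

The value ranges from 0 (no items in common) to 1 (identical sets of preferred items).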
This is used
when users don't provide preference values, that is, when the data records only which items a user likes and not how much.
Let's say I
have the following input data.
customer.csv
1,1
1,2
1,3
1,7
1,8
2,1
2,2
2,3
2,4
2,5
2,7
3,1
3,2
3,3
3,5
3,6
3,7
4,1
4,3
4,4
4,5
4,7
4,9
4,10
5,1
5,2
5,3
5,4
5,9
1,2 means
customer 1 likes item 2.
import java.io.File;
import java.io.IOException;

import org.apache.mahout.cf.taste.common.TasteException;
import org.apache.mahout.cf.taste.impl.model.file.FileDataModel;
import org.apache.mahout.cf.taste.impl.similarity.TanimotoCoefficientSimilarity;
import org.apache.mahout.cf.taste.model.DataModel;

public class TanimotoCoefficientSimilarityEx {

    // Path to the input file; each line holds one userID,itemID pair.
    public static String dataFile = "/Users/harikrishna_gurram/customer.csv";

    public static void main(String[] args) throws IOException, TasteException {
        // Load the preference data from the CSV file.
        DataModel model = new FileDataModel(new File(dataFile));

        // Build the similarity measure on top of the data model.
        TanimotoCoefficientSimilarity similarity = new TanimotoCoefficientSimilarity(model);

        // Compare the item sets of user 1 and user 2.
        System.out.println("Similarity between user1 and user2 is "
                + similarity.userSimilarity(1, 2));
    }
}
Output
Similarity between user1 and user2 is 0.5714285714285714
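To see where this number comes from, the same ratio can be worked out by hand from customer.csv. User 1 likes items {1, 2, 3, 7, 8} and user 2 likes items {1, 2, 3, 4, 5, 7}; they share 4 items out of 7 distinct items in total, so the coefficient is 4/7. The class below is a minimal sketch of that calculation using plain Java sets. The class name TanimotoByHand and the method tanimoto are made up for this illustration and are not part of Mahout, and returning 0 for two empty sets is an assumption of this sketch.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

public class TanimotoByHand {

    // Hypothetical helper (not a Mahout API): size of the intersection
    // divided by the size of the union of two item sets.
    public static double tanimoto(Set<Long> a, Set<Long> b) {
        Set<Long> intersection = new HashSet<>(a);
        intersection.retainAll(b);
        // |A union B| = |A| + |B| - |A intersect B|
        int unionSize = a.size() + b.size() - intersection.size();
        // Assumption of this sketch: two empty sets get similarity 0.
        return unionSize == 0 ? 0.0 : (double) intersection.size() / unionSize;
    }

    public static void main(String[] args) {
        // Items liked by user 1 and user 2, copied from customer.csv.
        Set<Long> user1 = new HashSet<>(Arrays.asList(1L, 2L, 3L, 7L, 8L));
        Set<Long> user2 = new HashSet<>(Arrays.asList(1L, 2L, 3L, 4L, 5L, 7L));

        // 4 shared items out of 7 distinct items, i.e. 4/7.
        System.out.println("Tanimoto coefficient by hand: " + tanimoto(user1, user2));
    }
}
```

The value 4/7 is approximately 0.5714, which agrees with the similarity reported by Mahout above.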