Tuesday 15 September 2015

Mahout: EuclideanDistanceSimilarity : Compute item similarity

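EuclideanDistanceSimilarity computes the similarity between two users (or, as in this post, two items) X and Y from the Euclidean distance between them, treating each one as a point whose coordinates are its ratings. Once the distance is computed, it can be mapped into the range (0, 1] by computing the similarity as 1 / (1 + distance).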


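If p = (p1, p2,..., pn) and q = (q1, q2,..., qn) are two points, then the Euclidean distance between p and q is computed as follows.

d(p, q) = sqrt((p1 - q1)^2 + (p2 - q2)^2 + ... + (pn - qn)^2)

For example, if p = (1, 2) and q = (3, 4), then the distance is sqrt((1 - 3)^2 + (2 - 4)^2) = sqrt(8) = 2.8284271247461903.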
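
The distance formula and the 1 / (1 + distance) mapping described above can be sketched in plain Java. This is only an illustration of the arithmetic, not Mahout's code; the class and method names (EuclideanDistanceSketch, distance, similarity) are made up for this example.

```java
public class EuclideanDistanceSketch {

    // Straight-line distance between two points of the same dimension.
    static double distance(double[] p, double[] q) {
        if (p.length != q.length) {
            throw new IllegalArgumentException("points must have the same dimension");
        }
        double sumOfSquares = 0.0;
        for (int i = 0; i < p.length; i++) {
            double diff = p[i] - q[i];
            sumOfSquares += diff * diff;
        }
        return Math.sqrt(sumOfSquares);
    }

    // Maps a distance in [0, infinity) to a similarity in (0, 1].
    static double similarity(double distance) {
        return 1.0 / (1.0 + distance);
    }

    public static void main(String[] args) {
        double d = distance(new double[] { 1, 2 }, new double[] { 3, 4 });
        System.out.println("distance = " + d); // prints distance = 2.8284271247461903
        System.out.println("similarity = " + similarity(d));
    }
}
```

A distance of 0 (identical points) gives a similarity of exactly 1, and the similarity shrinks toward 0 as the points move apart.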


Let’s say I have the following input data.

customer.csv
1,4,3
1,7,2
1,8,2
1,10,1
2,3,2
2,4,3
2,6,3
2,7,1
2,9,1
3,0,3
3,3,2
3,4,1
3,8,3
3,9,1
4,2,5
4,3,4
4,7,3
4,9,2
5,4,5
5,6,4
5,7,1
5,8,3


Each line has the form customer,item,rating. For example, 1,4,3 means customer 1 rated item 4 and gave it a rating of 3.
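
To make the line format concrete, here is a minimal sketch that splits one such line into its three fields. FileDataModel does this parsing for you, so the snippet is purely illustrative; RatingLineSketch, Rating and parse are hypothetical names, not part of Mahout.

```java
public class RatingLineSketch {

    // Hypothetical holder for one "customer,item,rating" line.
    static final class Rating {
        final long customerId;
        final long itemId;
        final float value;

        Rating(long customerId, long itemId, float value) {
            this.customerId = customerId;
            this.itemId = itemId;
            this.value = value;
        }
    }

    // Splits a line such as "1,4,3" into customer id, item id and rating.
    static Rating parse(String line) {
        String[] parts = line.trim().split(",");
        if (parts.length != 3) {
            throw new IllegalArgumentException("expected customer,item,rating but got: " + line);
        }
        return new Rating(
                Long.parseLong(parts[0].trim()),
                Long.parseLong(parts[1].trim()),
                Float.parseFloat(parts[2].trim()));
    }

    public static void main(String[] args) {
        Rating r = parse("1,4,3");
        // prints: customer 1 rated item 4 with 3.0
        System.out.println("customer " + r.customerId + " rated item " + r.itemId + " with " + r.value);
    }
}
```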

import java.io.File;
import java.io.IOException;

import org.apache.mahout.cf.taste.common.TasteException;
import org.apache.mahout.cf.taste.impl.model.file.FileDataModel;
import org.apache.mahout.cf.taste.impl.similarity.EuclideanDistanceSimilarity;
import org.apache.mahout.cf.taste.model.DataModel;

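public class EuclideanDistanceSimilarityEx {
    // Path to the ratings file; each line is customer,item,rating.
    public static String dataFile = "/Users/harikrishna_gurram/customer.csv";

    public static void main(String[] args) throws IOException, TasteException {

        // Load the ratings into Mahout's file-backed data model.
        DataModel model = new FileDataModel(new File(dataFile));

        EuclideanDistanceSimilarity similarity = new EuclideanDistanceSimilarity(model);

        // Items to compare against item 2.
        long[] itemIds = { 3, 4, 6, 7, 8, 9, 10 };

        // Despite the "distance" label printed below, these values are
        // similarity scores; NaN means the similarity is unknown.
        double[] similarities = similarity.itemSimilarities(2, itemIds);

        for (int i = 0; i < itemIds.length; i++) {
            System.out.println("distance between item 2 and " + itemIds[i]
                    + " is " + similarities[i]);
        }
    }
}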


Output

distance between item 2 and 3 is 0.5
distance between item 2 and 4 is NaN
distance between item 2 and 6 is NaN
distance between item 2 and 7 is 0.3333333333333333
distance between item 2 and 8 is NaN
distance between item 2 and 9 is 0.25
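distance between item 2 and 10 is NaN

If the similarity is unknown, EuclideanDistanceSimilarity returns Double.NaN. In this data only customer 4 rated item 2, so a similarity exists only for the other items customer 4 rated (3, 7 and 9). Note that the printed values are similarity scores, not raw distances: for item 3, the ratings 5 and 4 differ by 1, giving 1 / (1 + 1) = 0.5.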
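
The numbers above can be checked by hand with a small sketch that does not use Mahout at all. It assumes the similarity is 1 / (1 + distance) over the customers who rated both items, and NaN when there are none. Mahout's own implementation may normalize differently when several customers have rated both items, so agreement is only expected here because item 2 has a single rater. ItemSimilarityByHand and its methods are hypothetical names for this illustration.

```java
import java.util.HashMap;
import java.util.Map;

public class ItemSimilarityByHand {

    // {customer, item, rating} rows copied from customer.csv above.
    static final int[][] RATINGS = {
        {1, 4, 3}, {1, 7, 2}, {1, 8, 2}, {1, 10, 1},
        {2, 3, 2}, {2, 4, 3}, {2, 6, 3}, {2, 7, 1}, {2, 9, 1},
        {3, 0, 3}, {3, 3, 2}, {3, 4, 1}, {3, 8, 3}, {3, 9, 1},
        {4, 2, 5}, {4, 3, 4}, {4, 7, 3}, {4, 9, 2},
        {5, 4, 5}, {5, 6, 4}, {5, 7, 1}, {5, 8, 3}
    };

    // Collects the ratings given to one item, keyed by customer.
    static Map<Integer, Integer> ratingsFor(int item) {
        Map<Integer, Integer> byCustomer = new HashMap<>();
        for (int[] row : RATINGS) {
            if (row[1] == item) {
                byCustomer.put(row[0], row[2]);
            }
        }
        return byCustomer;
    }

    // Assumed formula: 1 / (1 + distance) over customers who rated both items.
    static double similarity(int itemA, int itemB) {
        Map<Integer, Integer> a = ratingsFor(itemA);
        Map<Integer, Integer> b = ratingsFor(itemB);
        double sumOfSquares = 0.0;
        int common = 0;
        for (Map.Entry<Integer, Integer> entry : a.entrySet()) {
            Integer other = b.get(entry.getKey());
            if (other != null) {
                double diff = entry.getValue() - other;
                sumOfSquares += diff * diff;
                common++;
            }
        }
        if (common == 0) {
            return Double.NaN; // no customer rated both items
        }
        return 1.0 / (1.0 + Math.sqrt(sumOfSquares));
    }

    public static void main(String[] args) {
        int[] itemIds = { 3, 4, 6, 7, 8, 9, 10 };
        for (int itemId : itemIds) {
            System.out.println("similarity between item 2 and " + itemId
                    + " is " + similarity(2, itemId));
        }
    }
}
```

Under these assumptions the sketch yields 0.5, 0.3333333333333333 and 0.25 for items 3, 7 and 9, and NaN for items 4, 6, 8 and 10, matching the Mahout output.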

