Thursday, 10 September 2015

Mahout: CachingUserSimilarity: Compute User similarity


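CachingUserSimilarity is built on top of another UserSimilarity implementation and caches the results of its computations, so a similarity that has already been computed for a pair of users does not have to be computed again. To clear the cached values for a given user, call the "clearCacheForUser(long userID)" method.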
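
To make the idea concrete, here is a minimal sketch of a caching wrapper in plain Java. It is not Mahout's implementation: the names PairSimilarity, CachingPairSimilarity and CachingSketchDemo are made up for illustration, the wrapped similarity is assumed to be symmetric, and the size limit that a real cache would have is left out.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a user similarity function; not a Mahout type.
interface PairSimilarity {
    double similarity(long userID1, long userID2);
}

// Illustrative wrapper that remembers the results of the wrapped similarity.
class CachingPairSimilarity implements PairSimilarity {
    private final PairSimilarity delegate;
    private final Map<String, Double> cache = new HashMap<>();

    CachingPairSimilarity(PairSimilarity delegate) {
        this.delegate = delegate;
    }

    @Override
    public double similarity(long userID1, long userID2) {
        // Order the pair so that (1,2) and (2,1) share one cache entry
        // (assumption: the wrapped similarity is symmetric).
        long lo = Math.min(userID1, userID2);
        long hi = Math.max(userID1, userID2);
        return cache.computeIfAbsent(lo + ":" + hi,
                key -> delegate.similarity(lo, hi));
    }

    // Drop every cached pair that involves the given user.
    void clearCacheForUser(long userID) {
        String id = Long.toString(userID);
        cache.keySet().removeIf(key -> {
            String[] parts = key.split(":");
            return parts[0].equals(id) || parts[1].equals(id);
        });
    }

    int cachedPairs() {
        return cache.size();
    }
}

class CachingSketchDemo {
    // Counts how often the wrapped similarity is really evaluated.
    static int delegateCalls = 0;

    public static void main(String[] args) {
        CachingPairSimilarity cached = new CachingPairSimilarity((a, b) -> {
            delegateCalls++;
            return 1.0 / (1.0 + Math.abs(a - b)); // dummy similarity
        });

        cached.similarity(1, 2);
        cached.similarity(2, 1); // answered from the cache
        System.out.println("Delegate calls: " + delegateCalls);

        cached.clearCacheForUser(1);
        cached.similarity(1, 2); // computed again after clearing
        System.out.println("Delegate calls: " + delegateCalls);
    }
}
```

The second lookup does not reach the wrapped function, and clearing the cache for user 1 forces the next lookup for that user to be computed again.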

Let’s say I have the following input data.

customer.csv
1,4,3
1,7,2
1,8,2
1,10,1
2,3,2
2,4,3
2,6,3
2,7,1
2,9,1
3,0,3
3,3,2
3,4,1
3,8,3
3,9,1
4,2,5
4,3,4
4,7,3
4,9,2
5,4,5
5,6,4
5,7,1
5,8,3
 
Each line has the form userID,itemID,rating. For example, 1,4,3 means customer 1 rated item 4 with a 3.

CachingUserSimilarityEx.java
import java.io.File;
import java.io.IOException;

import org.apache.mahout.cf.taste.common.TasteException;
import org.apache.mahout.cf.taste.impl.model.file.FileDataModel;
import org.apache.mahout.cf.taste.impl.similarity.CachingUserSimilarity;
import org.apache.mahout.cf.taste.impl.similarity.CityBlockSimilarity;
import org.apache.mahout.cf.taste.model.DataModel;

public class CachingUserSimilarityEx {
 /* Path to the input file; change it to match your environment */
 public static String dataFile = "/Users/harikrishna_gurram/customer.csv";

 public static void main(String[] args) throws IOException, TasteException {

  /* Load the userID,itemID,rating records from the CSV file */
  DataModel model = new FileDataModel(new File(dataFile));

  /* Underlying similarity whose results will be cached */
  CityBlockSimilarity similarity = new CityBlockSimilarity(model);

  /* Create CachingUserSimilarity on top of CityBlockSimilarity.
     The second argument (100) is the maximum cache size. */
  CachingUserSimilarity cacheSimilarity = new CachingUserSimilarity(
    similarity, 100);

  System.out.println("Similarity between user1 and user2 is "
    + cacheSimilarity.userSimilarity(1, 2));

 }
}


Output
Similarity between user1 and user2 is 0.16666666666666666
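
The value above can be checked by hand. The sketch below does this in plain Java, without Mahout. It assumes the similarity is 1 / (1 + d), where d is the number of items rated by exactly one of the two users, and that the rating values themselves are ignored. That formula is an assumption that agrees with the output above, not something stated in this post. The class name CityBlockByHand is made up for illustration.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

class CityBlockByHand {
    // Same records as customer.csv: userID,itemID,rating
    static final String[] RECORDS = {
        "1,4,3", "1,7,2", "1,8,2", "1,10,1",
        "2,3,2", "2,4,3", "2,6,3", "2,7,1", "2,9,1",
        "3,0,3", "3,3,2", "3,4,1", "3,8,3", "3,9,1",
        "4,2,5", "4,3,4", "4,7,3", "4,9,2",
        "5,4,5", "5,6,4", "5,7,1", "5,8,3"
    };

    // Collect the set of item IDs each user has rated; ratings are ignored.
    static Map<Long, Set<Long>> itemsByUser(String[] records) {
        Map<Long, Set<Long>> items = new HashMap<>();
        for (String record : records) {
            String[] fields = record.split(",");
            long userID = Long.parseLong(fields[0].trim());
            long itemID = Long.parseLong(fields[1].trim());
            items.computeIfAbsent(userID, key -> new HashSet<>()).add(itemID);
        }
        return items;
    }

    // Assumed formula: 1 / (1 + d), where d counts the items
    // rated by exactly one of the two users.
    static double similarity(Set<Long> items1, Set<Long> items2) {
        Set<Long> common = new HashSet<>(items1);
        common.retainAll(items2);
        int distance = items1.size() + items2.size() - 2 * common.size();
        return 1.0 / (1.0 + distance);
    }

    public static void main(String[] args) {
        Map<Long, Set<Long>> items = itemsByUser(RECORDS);
        // User 1 rated {4, 7, 8, 10}, user 2 rated {3, 4, 6, 7, 9}.
        // Common items: {4, 7}, so d = 4 + 5 - 2 * 2 = 5.
        System.out.println(similarity(items.get(1L), items.get(2L)));
    }
}
```

With d = 5 the result is 1 / 6, which is the 0.16666666666666666 printed by the Mahout program.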
