Tuesday 15 September 2015

Mahout: KnnItemBasedRecommender

Knn stands for K nearest neighbors. The weights used to compute the final predicted preference are calculated by linear interpolation, through an Optimizer. The algorithm is based on the paper by Robert M. Bell and Yehuda Koren presented at ICDM '07.
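To make the interpolation idea concrete, here is a minimal sketch of how a predicted preference could be formed once the weights are known: a weighted average of the ratings the user gave to the neighbouring items. This is only an illustration, not Mahout's source. The class name, method name and sample numbers are my own assumptions.

```java
/** Illustrative only: combines a user's ratings of neighbouring items using interpolation weights. */
final class InterpolationSketch {

    /**
     * Returns the weighted average of the neighbour ratings, or NaN when the
     * weights sum to zero (no usable neighbour information).
     */
    static float predict(double[] neighborRatings, double[] weights) {
        if (neighborRatings.length != weights.length) {
            throw new IllegalArgumentException("ratings and weights must have the same length");
        }
        double weightedSum = 0.0;
        double totalWeight = 0.0;
        for (int i = 0; i < weights.length; i++) {
            weightedSum += weights[i] * neighborRatings[i];
            totalWeight += weights[i];
        }
        return totalWeight == 0.0 ? Float.NaN : (float) (weightedSum / totalWeight);
    }

    public static void main(String[] args) {
        // Hypothetical ratings the user gave to three neighbouring items,
        // and hypothetical weights an optimizer might return for them.
        double[] ratings = {3.0, 1.0, 2.0};
        double[] weights = {0.5, 0.25, 0.25};
        System.out.println(predict(ratings, weights));
    }
}
```

In this sketch an all-zero weight vector yields NaN rather than a number, which is one way an estimated preference can end up undefined.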

The KnnItemBasedRecommender class provides the following constructors.

KnnItemBasedRecommender(DataModel dataModel, ItemSimilarity similarity, Optimizer optimizer, CandidateItemsStrategy candidateItemsStrategy, MostSimilarItemsCandidateItemsStrategy mostSimilarItemsCandidateItemsStrategy, int neighborhoodSize)
          
KnnItemBasedRecommender(DataModel dataModel, ItemSimilarity similarity, Optimizer optimizer, int neighborhoodSize)

The following classes implement the Optimizer interface.

NonNegativeQuadraticOptimizer
ConjugateGradientOptimizer
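As far as I understand, an optimizer here is given a small linear system A w = b and returns the interpolation weights w. The sketch below is a textbook conjugate gradient solver for a symmetric positive-definite matrix, written for illustration only. It is not Mahout's ConjugateGradientOptimizer source, and the class name, method signature and sample numbers are my own assumptions.

```java
/** Illustrative textbook conjugate gradient solver for A x = b (A symmetric positive-definite). */
final class ConjugateGradientSketch {

    static double[] solve(double[][] a, double[] b, int maxIterations, double tolerance) {
        int n = b.length;
        double[] x = new double[n];   // current solution estimate, starts at zero
        double[] r = b.clone();       // residual r = b - A x, equals b because x = 0
        double[] p = r.clone();       // current search direction
        double rsOld = dot(r, r);

        for (int iter = 0; iter < maxIterations && Math.sqrt(rsOld) > tolerance; iter++) {
            double[] ap = multiply(a, p);
            double alpha = rsOld / dot(p, ap);   // step length along p
            for (int i = 0; i < n; i++) {
                x[i] += alpha * p[i];
                r[i] -= alpha * ap[i];
            }
            double rsNew = dot(r, r);
            double beta = rsNew / rsOld;         // keeps the new direction conjugate to the old ones
            for (int i = 0; i < n; i++) {
                p[i] = r[i] + beta * p[i];
            }
            rsOld = rsNew;
        }
        return x;
    }

    private static double dot(double[] u, double[] v) {
        double sum = 0.0;
        for (int i = 0; i < u.length; i++) {
            sum += u[i] * v[i];
        }
        return sum;
    }

    private static double[] multiply(double[][] a, double[] v) {
        double[] result = new double[a.length];
        for (int i = 0; i < a.length; i++) {
            result[i] = dot(a[i], v);
        }
        return result;
    }

    public static void main(String[] args) {
        // Hypothetical 2x2 system; the exact solution is (1/11, 7/11).
        double[][] a = {{4.0, 1.0}, {1.0, 3.0}};
        double[] b = {1.0, 2.0};
        double[] w = solve(a, b, 10, 1e-12);
        System.out.println(w[0] + ", " + w[1]);
    }
}
```

Note that plain conjugate gradient does not constrain the weights to be non-negative; judging by its name, NonNegativeQuadraticOptimizer is the variant to look at when that matters.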

Let’s say I have the following input data.

Book id    Title
1          Meet Big Brother
2          Explore the Universe
3          Memoir as metafiction
4          A child-soldier's story
5          Wicked good fun
6          The 60s kids classic
7          A short-form master
8          Go down the rabbit hole
9          Unseated a president
10         An Irish-American Memoir

User id    Name
1          Hari Krishna Gurram
2          Gopi Battu
3          Rama Krishna Gurram
4          Sudheer Ganji
5          Kiran Darsi
6          Joel Chelli
7          Sankalp Dubey
8          Sunil Kumar
9          Janaki Sriram
10         Phalgun Garimella
11         Reshmi George
12         Sailaja Navakotla
13         Aravind Phaneendra
14         Keerthi Shetty
15         Sujatha
16         Vadiraj Kulakarni
17         Arpan
18         Suprabath Bisoi
19         Sravani
20         Gireesh Amara

The following csv file contains the customers' purchases and their ratings of the books.


customer.csv
1,1,3
1,2,1
1,4,5
1,5,3
1,9,3
1,10,2
2,1,2
2,3,2
2,4,1
2,7,5
3,1,5
3,2,1
3,3,1
3,6,1
3,8,1
4,1,1
4,2,1
4,6,3
4,7,1
4,9,2
5,2,1
5,3,3
5,6,5
5,10,3
6,1,1
6,2,4
6,3,4
6,7,2
6,8,3
7,1,3
7,2,3
7,3,1
7,5,3
7,6,3
7,7,3
8,1,1
8,3,3
8,4,5
8,8,1
8,9,2
9,4,2
9,6,5
9,8,3
9,9,3
10,2,5
10,3,1
10,4,2
10,5,1
10,9,4
11,2,3
11,4,2
11,5,2
11,8,1
12,1,1
12,3,4
12,7,3
12,8,2
13,1,3
13,2,4
13,3,2
13,5,3
13,9,3
14,2,3
14,3,2
14,5,1
14,7,1
14,8,5
14,9,2
15,1,3
15,2,2
15,3,2
15,6,5
15,7,1
15,9,3
16,2,2
16,3,4
16,6,1
16,7,3
16,10,1
17,3,1
17,4,3
17,7,4
17,8,4
18,3,3
18,5,2
18,6,3
18,9,1
18,10,2
19,1,1
19,2,5
19,6,2
19,7,2
19,8,3
19,10,3
20,1,2
20,2,2
20,3,1
20,4,4
20,8,1

Each line has the form userId,itemId,rating. For example, 20,8,1 means that User20 gave item8 a rating of 1.
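To show the line format in code, here is a small sketch that splits one line of customer.csv into its three fields. FileDataModel does this parsing for you; the class and method names below are hypothetical and exist only for illustration.

```java
/** Illustrative only: one parsed line of customer.csv (userId,itemId,rating). */
final class RatingLineSketch {
    final long userId;
    final long itemId;
    final float rating;

    private RatingLineSketch(long userId, long itemId, float rating) {
        this.userId = userId;
        this.itemId = itemId;
        this.rating = rating;
    }

    /** Parses a line such as "20,8,1" into user ID, item ID and rating. */
    static RatingLineSketch parse(String line) {
        String[] parts = line.trim().split(",");
        if (parts.length != 3) {
            throw new IllegalArgumentException("Expected userId,itemId,rating but got: " + line);
        }
        return new RatingLineSketch(
                Long.parseLong(parts[0].trim()),
                Long.parseLong(parts[1].trim()),
                Float.parseFloat(parts[2].trim()));
    }

    public static void main(String[] args) {
        RatingLineSketch r = parse("20,8,1");
        System.out.println("user=" + r.userId + " item=" + r.itemId + " rating=" + r.rating);
    }
}
```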


The following application finds recommendations for customer 4.
import java.io.File;
import java.io.IOException;
import java.util.List;

import org.apache.mahout.cf.taste.common.TasteException;
import org.apache.mahout.cf.taste.impl.model.file.FileDataModel;
import org.apache.mahout.cf.taste.impl.recommender.knn.ConjugateGradientOptimizer;
import org.apache.mahout.cf.taste.impl.recommender.knn.KnnItemBasedRecommender;
import org.apache.mahout.cf.taste.impl.recommender.knn.Optimizer;
import org.apache.mahout.cf.taste.impl.similarity.LogLikelihoodSimilarity;
import org.apache.mahout.cf.taste.model.DataModel;
import org.apache.mahout.cf.taste.recommender.RecommendedItem;
import org.apache.mahout.cf.taste.similarity.ItemSimilarity;

public class KnnItemBasedRecommenderEx {
 private static String input = "/Users/harikrishna_gurram/customer.csv";
 private static DataModel model = null;
 private static KnnItemBasedRecommender recommender = null;
 private static Optimizer optimizer = null;
 private static ItemSimilarity similarity = null;

 private static String[] books = { "Meet Big Brother",
   "Explore the Universe", "Memoir as metafiction",
   "A child-soldier's story", "Wicked good fun",
   "The 60s kids classic", "A short-form master",
   "Go down the rabbit hole", "Unseated a president",
   "An Irish-American Memoir" };

 private static String[] userNames = { "Hari Krishna Gurram", "Gopi Battu",
   "Rama Krishna Gurram", "Sudheer Ganji", "Kiran Darsi",
   "Joel Chelli", "Sankalp Dubey", "Sunil Kumar", "Janaki Sriram",
   "Phalgun Garimella", "Reshmi George", "Sailaja Navakotla",
   "Aravind Phaneendra", "Keerthi Shetty", "Sujatha",
   "Vadiraj Kulakarni", "Arpan", "Suprabath Bisoi", "Sravani",
   "Gireesh Amara" };

 public static void main(String args[]) throws IOException, TasteException {
  model = new FileDataModel(new File(input));
  similarity = new LogLikelihoodSimilarity(model);
  optimizer = new ConjugateGradientOptimizer();
  recommender = new KnnItemBasedRecommender(model, similarity, optimizer,
    10);

  List<RecommendedItem> recommendations = recommender.recommend(4, 5);

  System.out.println("Recommendations for customer " + userNames[3]
    + " are:");
  System.out.println("*************************************************");

  System.out.println("BookId\ttitle\t\testimated preference");
  for (RecommendedItem recommendation : recommendations) {
   int bookId = (int) recommendation.getItemID();
   // Note: the first argument is the user ID. Passing 1 estimates the
   // preference of user 1, not of customer 4; recommendation.getValue()
   // holds the value computed for the user passed to recommend().
   float estimatedPref = recommender.estimatePreference(1, bookId);
   System.out.println(bookId + " " + books[bookId - 1] + "\t"
     + estimatedPref);
  }

  System.out.println("*************************************************");

 }
}


Output
Recommendations for customer Sudheer Ganji are:
*************************************************
BookId title  estimated preference
3 Memoir as metafiction NaN
*************************************************


